The Refinement Pipeline is an iterative quality improvement system that automatically enhances normalized claims through AI-powered feedback and self-correction. It uses DeepEval metrics to evaluate claims and refinement algorithms to improve them until they meet quality thresholds.
The refinement service is located in api/services/refinement/refine.py and integrates with DeepEval’s G-Eval metrics for quality assessment.
If the score meets the threshold, return the original claim
```python
if original_score >= self.threshold:
    return current_response, refinement_history
```
3. Iterative Refinement
Generate feedback and refine the claim up to max_iters times
```python
for i in range(self.max_iters):
    refine_user_prompt = f"""
    ## Original Query
    {original_query}

    ## Current Response
    {current_claim}

    ## Feedback
    {eval_result.test_results[0].metrics_data[0].reason}

    ## Task
    Refine the current response based on the feedback to improve its accuracy, verifiability, and overall quality.
    """
    refined_response = client.generate_response(
        user_prompt=refine_user_prompt,
        sys_prompt=self.refine_sys_prompt,
    )
```
4. Re-evaluation
Evaluate the refined claim and check if it meets the threshold
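The steps above can be sketched end-to-end as a single loop. This is a minimal illustration, not the actual service code: `evaluate`, `generate_refinement`, and `HistoryEntry` are hypothetical stand-ins for the DeepEval scoring call, the LLM refinement call, and the service's `RefinementHistory` records in api/services/refinement/refine.py.

```python
from dataclasses import dataclass


@dataclass
class HistoryEntry:
    """Illustrative stand-in for the service's RefinementHistory records."""
    claim: str
    score: float
    feedback: str


def refine_until_threshold(claim, evaluate, generate_refinement,
                           threshold=0.8, max_iters=3):
    """Refine `claim` until `evaluate` returns a score >= `threshold`."""
    history = []
    score, feedback = evaluate(claim)                 # initial evaluation
    history.append(HistoryEntry(claim, score, feedback))
    for _ in range(max_iters):
        if score >= threshold:                        # threshold met: stop early
            break
        claim = generate_refinement(claim, feedback)  # step 3: LLM rewrite
        score, feedback = evaluate(claim)             # step 4: re-evaluate
        history.append(HistoryEntry(claim, score, feedback))
    return claim, history
```

Because the score is re-checked at the top of each iteration, a claim that already meets the threshold is returned without any refinement calls, which matches the short-circuit behavior shown earlier.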
Cross-refine uses feedback from a different model to provide diverse perspectives:
```python
# From api/_utils/prompts.py:217-230
feedback_prompt = """You are provided with a generated response and a user prompt.
Your task is to provide detailed, constructive feedback based on the criteria provided.
Please score the response on the following criteria using a 0-10 scale:
1. **Verifiability**
2. **Likelihood of Being False**
3. **Public Interest**
4. **Potential Harm**
5. **Check-Worthiness**
For each criterion, provide:
- A score (0-10)
- Provide a short, precise justification in 1 sentence."""
```
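A single cross-refine step might look like the sketch below, assuming two hypothetical model clients that expose a `generate_response(user_prompt, sys_prompt)` method: `feedback_client` (the second model that critiques) and `refine_client` (the primary model that rewrites). These names are illustrative, not the actual service API.

```python
def cross_refine_step(claim, feedback_client, refine_client, feedback_prompt):
    """One cross-refinement round: critique with one model, rewrite with another."""
    # 1. A *different* model critiques the claim against the scoring criteria.
    feedback = feedback_client.generate_response(
        user_prompt=claim,
        sys_prompt=feedback_prompt,
    )
    # 2. The primary model revises the claim using that outside feedback.
    return refine_client.generate_response(
        user_prompt=f"## Claim\n{claim}\n## Feedback\n{feedback}",
        sys_prompt="Refine the claim based on the feedback.",
    )
```

Using a second model for the critique step reduces the chance that the refiner simply rubber-stamps its own output.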
The default evaluation criteria from api/types/evals.py:25-50:
```python
STATIC_EVAL_SPECS = StaticEvaluation(
    criteria="""Evaluate the normalized claim against the following criteria:
    Verifiability and Self-Containment, Claim Centrality and Extraction Quality,
    Conciseness and Clarity, Check-Worthiness Alignment, and Factual Consistency""",
    evaluation_steps=[
        # Verifiability and Self-Containment
        "Check if the claim contains verifiable factual assertions",
        "Check if the claim is self-contained without requiring additional context",
        # Claim Centrality and Extraction Quality
        "Check if the normalized claim captures the central assertion",
        "Check if the claim represents the core factual assertion",
        # Conciseness and Clarity
        "Check if the claim is presented in a straightforward, concise manner",
        "Check if the claim is significantly shorter than source posts",
        # Check-Worthiness Alignment
        "Check if the normalized claim meets check-worthiness standards",
        "Check if the claim has general public interest, potential for harm",
        # Factual Consistency
        "Check if the normalized claim is factually consistent with the source",
        "Check if the claim accurately reflects the original assertion",
    ],
)
```
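One way to picture how a G-Eval-style metric turns these per-step checks into a single 0-1 score is a mean over individual step judgments. This is a conceptual sketch, not DeepEval's actual internals: `judge` is a hypothetical callable (an LLM judge in practice) that scores one evaluation step.

```python
def score_claim(claim, source, evaluation_steps, judge):
    """Aggregate per-step judgments (each 0-1) into one mean score.

    `judge(step, claim, source)` is a hypothetical callable returning a
    float in [0, 1] for how well `claim` satisfies that single step.
    """
    step_scores = [judge(step, claim, source) for step in evaluation_steps]
    return sum(step_scores) / len(step_scores)  # mean over all criteria
```

The resulting aggregate is what the pipeline compares against `self.threshold` to decide whether another refinement round is needed.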
The refinement pipeline gracefully handles failures:
From api/services/refinement/refine.py:172-185
```python
except Exception as e:
    logger.warning(f"Failed to refine claim: {e}")
    # Return original response with error in history
    error_history = RefinementHistory(
        claim_type=ClaimType.FINAL,
        claim=current_claim,
        score=0.0,
        feedback=f"Refinement failed: {str(e)}",
    )
    return current_response or original_response, refinement_history
```
If refinement fails, the API returns the original claim with error details in the refinement_history. Your application will never receive an error response due to refinement issues.
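A caller that wants to detect this degraded path can inspect the returned history for the error marker. The helper below is illustrative; it assumes only that history entries carry a `feedback` string and relies on the `"Refinement failed:"` prefix shown in the excerpt above.

```python
def used_fallback(refinement_history):
    """Return True if any history entry records a refinement failure.

    Hypothetical client-side helper: even when refinement fails, the API
    returns the original claim plus a history entry whose feedback records
    the error, so callers can detect degraded results without error handling.
    """
    return any(entry.feedback.startswith("Refinement failed:")
               for entry in refinement_history)
```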