Agentic RAG with multi-step iterative retrieval, reflection, and routing
Agentic RAG introduces iterative decision-making into the retrieval pipeline. Instead of a single retrieve-then-generate pass, the system evaluates its candidate answer and decides whether to retrieve more evidence, reflect on the answer, or finalize — repeating until quality is acceptable or the iteration limit is reached.
The pipeline uses AgenticRouter (from components/agentic_router.py) as the central decision-maker. The router is a ChatGroq LLM that returns structured JSON decisions at each step.
In addition to the query, each routing decision takes the following state:
has_documents: whether documents have already been retrieved
current_answer: the current draft answer (if any)
iteration: current iteration number (1-indexed)
max_iterations: the hard loop limit
It returns {"action": "search|reflect|generate", "reasoning": "..."}.
When iteration >= max_iterations, the router automatically returns "generate" regardless of quality. This is the safety mechanism that prevents infinite loops.
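The decision contract above can be sketched as follows. This is a minimal illustration, not the actual AgenticRouter code: the function name decide, the llm callable, the shortened prompt, and the reasoning string used at the limit are all assumptions made for the example.

```python
import json
from typing import Callable, Optional


def decide(
    query: str,
    has_documents: bool,
    current_answer: Optional[str],
    iteration: int,
    max_iterations: int,
    llm: Callable[[str], str],  # hypothetical: takes a prompt, returns raw text
) -> dict:
    """Return a routing decision with "action" and "reasoning" keys."""
    # Safety mechanism: at or past the limit, finalize without asking the LLM.
    if iteration >= max_iterations:
        return {"action": "generate", "reasoning": "iteration limit reached"}

    # Shortened stand-in for the real routing prompt.
    prompt = (
        f"Query: {query}\n"
        f"Has Retrieved Documents: {has_documents}\n"
        f"Current Answer: {current_answer}\n"
        f"Iteration: {iteration}/{max_iterations}\n"
        'Return JSON: {"action": "search|reflect|generate", "reasoning": "..."}'
    )
    return json.loads(llm(prompt))


def stub_llm(prompt: str) -> str:
    # Stand-in for the LLM: always asks for another search.
    return '{"action": "search", "reasoning": "need more evidence"}'


first = decide("What is entropy?", False, None, 1, 2, stub_llm)
last = decide("What is entropy?", True, "draft", 2, 2, stub_llm)
```

Here first carries the stub's "search" decision, while last is forced to "generate" because the iteration limit has been reached.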
The reflect action is called when documents exist but confidence in the answer is uncertain. The router sends the query, current answer, and context back to the LLM for gap identification and correction.
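A reflection step of this kind might look like the sketch below. The function name reflect, the llm callable, and the prompt wording are assumptions for illustration; the actual prompt used by the router is not shown in this document.

```python
from typing import Callable, List


def reflect(
    query: str,
    current_answer: str,
    documents: List[str],
    llm: Callable[[str], str],  # hypothetical: takes a prompt, returns text
) -> str:
    """Ask the LLM to find gaps in the draft answer and return a revision."""
    context = "\n\n".join(documents)
    # Hypothetical prompt wording; only the three inputs come from the text above.
    prompt = (
        "Review the draft answer against the context. "
        "Identify gaps or errors, then return a corrected answer.\n\n"
        f"Query: {query}\n"
        f"Draft answer: {current_answer}\n"
        f"Context:\n{context}"
    )
    return llm(prompt)


# Echoing the prompt back makes it easy to inspect what the LLM would receive.
sent = reflect("What is entropy?", "A measure of disorder.", ["doc A", "doc B"], lambda p: p)
```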
For max_iterations, start with 2; increase only if answer quality does not converge at lower values. Without a hard loop cap, the router can cycle indefinitely, so always set a finite limit.
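One way to make the cap independent of the router's behavior is to bound the loop itself. The sketch below assumes hypothetical route, search, reflect, and generate callables, and assumes that "search" only retrieves documents; the real pipeline may structure these steps differently.

```python
from typing import Callable, List, Optional


def run_loop(
    query: str,
    route: Callable[..., dict],      # hypothetical router callable
    search: Callable[[str], List[str]],
    reflect: Callable[[str, Optional[str], List[str]], str],
    generate: Callable[[str, List[str], Optional[str]], str],
    max_iterations: int = 2,
) -> dict:
    if max_iterations < 1:
        raise ValueError("max_iterations must be a finite integer >= 1")

    documents: List[str] = []
    answer: Optional[str] = None
    history = []

    # range() bounds the loop even if the router never returns "generate".
    for iteration in range(1, max_iterations + 1):
        decision = route(query, bool(documents), answer, iteration, max_iterations)
        history.append({"iteration": iteration, **decision})
        if decision["action"] == "search":
            documents = search(query)
        elif decision["action"] == "reflect":
            answer = reflect(query, answer, documents)
        else:  # "generate": stop iterating and finalize
            break

    # Finalize once, whether the router chose to or the cap was hit.
    answer = generate(query, documents, answer)
    return {"answer": answer, "history": history}


# A misbehaving router that always searches still terminates after 3 steps.
result = run_loop(
    "What is entropy?",
    route=lambda *state: {"action": "search", "reasoning": "always search"},
    search=lambda q: ["doc"],
    reflect=lambda q, a, d: "revised",
    generate=lambda q, d, a: "final",
    max_iterations=3,
)
```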
The AgenticRouter.ROUTING_TEMPLATE is a structured prompt that includes the current query, has_documents flag, current_answer, and iteration count. The LLM must return a JSON object with "action" and "reasoning" keys.
ROUTING_TEMPLATE = """You are a query routing agent. Given a query and optional current answer, decide what action to take next.

Current State:
- Query: {query}
- Has Retrieved Documents: {has_documents}
- Current Answer: {current_answer}
- Iteration: {iteration}/{max_iterations}

Your task is to decide ONE of the following actions:
1. 'search': Retrieve documents from vector database (choose this if you need more information)
2. 'reflect': Verify and improve the current answer (choose this to validate answer quality)
3. 'generate': Create final answer (choose this when you have enough information)

Return a JSON object with this exact format:
{{"action": "search|reflect|generate", "reasoning": "brief explanation"}}

Do NOT include any other text. Return ONLY the JSON object."""
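The doubled braces around the JSON example in the template are escapes: under Python's str.format, {{ and }} render as literal { and }, while single-brace names are filled in. The sketch below uses a shortened, hypothetical template to show this; whether AgenticRouter renders the prompt with str.format or a prompt-template class is an assumption here.

```python
# Shortened stand-in for ROUTING_TEMPLATE, keeping the same brace conventions.
TEMPLATE = (
    "Query: {query}\n"
    "Has Retrieved Documents: {has_documents}\n"
    "Iteration: {iteration}/{max_iterations}\n"
    'Return: {{"action": "search|reflect|generate", "reasoning": "brief explanation"}}'
)

rendered = TEMPLATE.format(
    query="What is entropy?",
    has_documents=False,
    iteration=1,
    max_iterations=2,
)
print(rendered)
```

The last rendered line contains single braces, so the LLM sees a valid JSON example rather than the escaped form.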
Invalid JSON or unrecognized action values raise ValueError for fast debugging.
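A validation step with this behavior could be sketched as follows. The function name parse_decision and the error messages are assumptions for illustration, not the router's actual code.

```python
import json

VALID_ACTIONS = {"search", "reflect", "generate"}


def parse_decision(raw: str) -> dict:
    """Parse the LLM's raw output, raising ValueError on anything unexpected."""
    try:
        decision = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Router returned invalid JSON: {raw!r}") from exc

    if not isinstance(decision, dict):
        raise ValueError(f"Router output is not a JSON object: {raw!r}")

    action = decision.get("action")
    if not isinstance(action, str) or action not in VALID_ACTIONS:
        raise ValueError(f"Unrecognized action: {action!r}")
    return decision


ok = parse_decision('{"action": "reflect", "reasoning": "verify completeness"}')
```

Failing immediately on a malformed response keeps the bad output next to the error in the traceback, instead of letting an unknown action propagate into the loop.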
from vectordb.langchain.agentic_rag import AgenticRAGPipeline

pipeline = AgenticRAGPipeline(config)
result = pipeline.run("What are the three laws of thermodynamics?")

print(result["history"])
# [
#   {"iteration": 1, "action": "search", "reasoning": "No documents retrieved yet"},
#   {"iteration": 2, "action": "reflect", "reasoning": "Need to verify completeness"},
#   {"iteration": 3, "action": "generate", "reasoning": "Sufficient information gathered"}
# ]

print(result["answer"])
# "The three laws of thermodynamics are: 1) Energy cannot be created or destroyed..."
No hard loop cap: Without max_iterations, the loop can cycle indefinitely. Always set a finite limit.
Ambiguous routing prompts: If the prompt does not clearly define when to choose "search" vs "reflect" vs "generate", the router oscillates without making progress.
Missing observability: Log every routing decision, action, and reasoning string. The AgenticRouter logs at INFO level for decisions and DEBUG level for full prompts.
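The INFO/DEBUG split described above can be sketched with the standard logging module. The logger name and the log_decision helper are hypothetical; only the choice of levels comes from the text.

```python
import logging

# Hypothetical logger name; the real module may use a different one.
logger = logging.getLogger("agentic_router")


def log_decision(iteration: int, decision: dict, prompt: str) -> None:
    # INFO: one compact line per routing decision.
    logger.info(
        "iteration=%d action=%s reasoning=%s",
        iteration,
        decision["action"],
        decision["reasoning"],
    )
    # DEBUG: the full prompt, emitted only when verbose logging is enabled.
    logger.debug("routing prompt:\n%s", prompt)
```

With the logger level set to INFO, only the decision lines appear; switching it to DEBUG adds the full prompts for troubleshooting.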