AgenticRouter
Route and orchestrate RAG with agent-like behavior for tool selection and self-reflection.

Constructor parameters:
- LLM model name for routing decisions
- API key for the LLM provider; falls back to the GROQ_API_KEY environment variable
- Base URL for an OpenAI-compatible API endpoint
Methods
select_tool
Select the best tool for a given query.

Parameters:
- User query to route

Returns the selected tool name: “retrieval”, “web_search”, “calculation”, or “reasoning”.
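The router typically asks the LLM to name one of the four tools, then validates the reply before acting on it. A minimal sketch of that validation step, assuming the model returns the tool name as plain text (the function name and the “reasoning” fallback are illustrative, not the library's actual API):

```python
VALID_TOOLS = {"retrieval", "web_search", "calculation", "reasoning"}

def parse_tool_choice(llm_output: str) -> str:
    # Normalize the model's reply and fall back to "reasoning"
    # if it names a tool outside the allowed set.
    choice = llm_output.strip().strip('"').lower()
    return choice if choice in VALID_TOOLS else "reasoning"
```

Constraining the output to a closed set keeps a malformed LLM reply from routing the query to a nonexistent tool.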
evaluate_answer_quality
Evaluate generated answer quality with relevance, completeness, and grounding scores.

Parameters:
- Original user query
- Generated answer to evaluate
- Retrieved context used to generate the answer (for the grounding check)

Returns a dictionary containing:
- relevance (0-100): does the answer address the query?
- completeness (0-100): is it sufficiently detailed?
- grounding (0-100): is it grounded in the retrieved context?
- issues (list): identified problems (max 3)
- suggestions (list): improvement suggestions (max 2)
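Since the scores come back from an LLM, a sensible consumer clamps them to the documented ranges and caps before trusting them. A sketch of that normalization, assuming the raw result arrives as a dict (the helper name is hypothetical):

```python
def normalize_evaluation(raw: dict) -> dict:
    # Clamp scores to 0-100 and enforce the documented list caps.
    def clamp(value) -> int:
        return max(0, min(100, int(value)))

    return {
        "relevance": clamp(raw.get("relevance", 0)),
        "completeness": clamp(raw.get("completeness", 0)),
        "grounding": clamp(raw.get("grounding", 0)),
        "issues": list(raw.get("issues", []))[:3],            # max 3 problems
        "suggestions": list(raw.get("suggestions", []))[:2],  # max 2 suggestions
    }
```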
self_reflect_loop
Run a self-reflection loop to iteratively improve answer quality.

Parameters:
- Original user query
- Initial answer to refine
- Retrieved context for grounding
- Maximum refinement iterations
- Target quality score (0-100) at which refinement stops early

Returns the final refined answer after iterative improvement.
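The loop structure follows from the parameters above: evaluate, stop early if the target score is reached, otherwise refine and repeat. A sketch with the evaluate/refine steps injected as callables for illustration (the real class presumably calls its own LLM-backed evaluate_answer_quality internally, and the averaging of the three scores is an assumption):

```python
def reflect_loop(query, answer, context, evaluate, refine,
                 max_iterations=3, target_score=80):
    # evaluate returns the quality dict; refine produces an improved answer.
    for _ in range(max_iterations):
        scores = evaluate(query, answer, context)
        average = (scores["relevance"] + scores["completeness"]
                   + scores["grounding"]) / 3
        if average >= target_score:
            break  # early stop once the target quality is reached
        answer = refine(query, answer, context, scores)
    return answer
```

Note the loop returns the last answer even if the target score is never reached, bounded by max_iterations.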
ContextCompressor
Compress and summarize retrieved context to reduce token usage and improve answer quality.

Constructor parameters:
- LLM model for compression (Groq API)
- Groq API key; falls back to the GROQ_API_KEY environment variable
Methods
compress
Compress context using the specified compression technique.

Parameters:
- Retrieved context to compress
- Original query for relevance filtering
- Compression technique: “abstractive”, “extractive”, or “relevance_filter”
- Additional keyword parameters:
  - max_tokens (int): for abstractive compression (default: 2048)
  - num_sentences (int): for extractive compression (default: 5)
  - relevance_threshold (float): for relevance filtering (default: 0.5)

Returns the compressed context text.
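The method is effectively a dispatcher: it picks one of three strategies and merges the keyword arguments with the documented defaults. A sketch of that dispatch, with the strategies passed in as callables for illustration (the real method binds them internally to the LLM-backed implementations; the names here are hypothetical):

```python
COMPRESSION_DEFAULTS = {
    "max_tokens": 2048,          # abstractive
    "num_sentences": 5,          # extractive
    "relevance_threshold": 0.5,  # relevance_filter
}

def dispatch_compress(context, query, technique, strategies, **kwargs):
    # strategies maps a technique name to a callable implementing it.
    if technique not in strategies:
        raise ValueError(f"unknown compression technique: {technique!r}")
    params = {**COMPRESSION_DEFAULTS, **kwargs}  # caller kwargs override defaults
    return strategies[technique](context, query, params)
```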
compress_abstractive
Abstractive compression using LLM summarization.

Parameters:
- Context to compress
- Query for relevance
- Maximum tokens in the compressed output

Returns the compressed context summary.
compress_extractive
Extractive compression by selecting key sentences.

Parameters:
- Context to compress
- Query for relevance
- Number of sentences to extract

Returns the selected sentences joined together.
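Extractive compression keeps the most query-relevant sentences verbatim rather than rewriting them. A sketch of the idea, using simple word overlap as a stand-in for the library's actual relevance scoring (which may be LLM- or embedding-based):

```python
import re

def extractive_sketch(context: str, query: str, num_sentences: int = 5) -> str:
    # Split on sentence-ending punctuation and score each sentence
    # by word overlap with the query (a deliberately crude heuristic).
    sentences = re.split(r"(?<=[.!?])\s+", context.strip())
    query_words = set(query.lower().split())
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: -len(query_words & set(sentences[i].lower().split())),
    )
    keep = sorted(ranked[:num_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)
```

Re-sorting the kept indices preserves the source order of the selected sentences, which keeps the compressed context readable.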
filter_by_relevance
Filter context chunks by relevance threshold.

Parameters:
- Context to filter
- Query for relevance scoring
- Minimum relevance score (0-1) for a chunk to be kept

Returns the filtered context containing only relevant chunks.
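Unlike the two compression modes, filtering drops whole chunks that score below the threshold. A sketch assuming the context arrives as a list of chunk strings, with the fraction of query words present standing in for the real relevance score (likely embedding- or LLM-based in the library):

```python
def filter_chunks(chunks, query, relevance_threshold=0.5):
    # Score each chunk on a 0-1 scale: the fraction of distinct
    # query words that appear in the chunk (illustrative heuristic).
    query_words = set(query.lower().split())

    def score(chunk: str) -> float:
        if not query_words:
            return 0.0
        return len(query_words & set(chunk.lower().split())) / len(query_words)

    return [c for c in chunks if score(c) >= relevance_threshold]
```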
QueryEnhancer
Enhance queries using multi-query generation and query expansion techniques.

Constructor parameters:
- LLM model for query enhancement
- API key; falls back to the GROQ_API_KEY environment variable
Methods
generate_multi_queries
Generate multiple query variations for better retrieval coverage.

Parameters:
- Original query
- Number of query variations to generate

Returns a list of query variations, including the original.
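Because the return value includes the original query, the LLM-generated variations need to be deduplicated against it. A sketch of that assembly step, assuming case-insensitive deduplication (the helper name is illustrative):

```python
def assemble_variations(original: str, generated: list) -> list:
    # Keep the original query first; drop blanks and case-insensitive duplicates.
    seen = {original.strip().lower()}
    variations = [original]
    for candidate in generated:
        key = candidate.strip().lower()
        if key and key not in seen:
            seen.add(key)
            variations.append(candidate.strip())
    return variations
```

Each returned variation can then be run through retrieval independently and the hits combined with ResultMerger below.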
ResultMerger
Merge and deduplicate results from multiple retrieval sources.

Methods
merge_results
Merge document results using the specified strategy.

Parameters:
- List of result lists from different sources
- Merging strategy: “rrf” (Reciprocal Rank Fusion) or “interleave”
- Number of results to return after merging

Returns the merged and deduplicated document list.
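Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over every list it appears in, so documents ranked well by several sources rise to the top. A sketch of the “rrf” strategy, assuming hashable document identifiers and the k = 60 constant from the original RRF paper (the library's constant may differ):

```python
def rrf_merge(result_lists, top_k=10, k=60):
    # Each list contributes 1 / (k + rank) per document; summing across
    # lists both fuses the rankings and deduplicates repeated documents.
    scores = {}
    for results in result_lists:
        for rank, doc in enumerate(results, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

The “interleave” strategy would instead round-robin through the lists, skipping documents already emitted.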