Overview
The agent executes a 6-stage pipeline for each question, with strategic parallelization and semantic routing to optimize performance and accuracy.
Stage 1: Setup & Initialization
Initialize RAG components
- Load search engine (hybrid vector + keyword)
- Initialize response generator
- Connect to vector database (pgvector)
Load configuration
- Answer mode thresholds
- LLM provider settings (Cerebras/OpenAI)
- Hybrid search weights (70% semantic, 30% keyword)
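As a rough illustration of the 70% semantic / 30% keyword weighting listed above, the following sketch combines two per-chunk scores into one hybrid score. The names `HybridWeights` and `hybrid_score` are hypothetical, and the sketch assumes both input scores are already normalized to the 0-1 range; the actual search engine may combine scores differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridWeights:
    """Hypothetical config holder; defaults mirror the documented 70/30 split."""
    semantic: float = 0.7
    keyword: float = 0.3


def hybrid_score(semantic_score: float, keyword_score: float,
                 weights: HybridWeights = HybridWeights()) -> float:
    """Linear blend of a vector-similarity score and a keyword score.

    Assumes both inputs are normalized to [0, 1].
    """
    return weights.semantic * semantic_score + weights.keyword * keyword_score
```

With these weights, a chunk that matches strongly on meaning but weakly on exact terms still outranks a chunk with the opposite profile.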
Stage 2: Combined Reasoning + Analysis
Single LLM call via ReasoningPlanner that performs comprehensive question understanding.
Analysis Components (Single LLM Call)
Extracted Information
- Tickers - Company identifiers ($AAPL, $MSFT)
- Time references - Temporal phrases preserved exactly (“Q4 2024”, “last 3 quarters”, “latest”)
- Intent - What is the user trying to learn?
- Topic - Main subject (e.g., “cloud revenue growth”)
- Question type - Single company, multiple companies, or comparison
- Answer mode - direct|standard|detailed
- Validation - Reject off-topic/invalid questions
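One way to picture the output of this analysis step is a small typed record holding the extracted fields. The sketch below is an assumption about shape only: the class name `QuestionAnalysis` and its field names are hypothetical and are not taken from the ReasoningPlanner implementation.

```python
from dataclasses import dataclass
from typing import List

# The three answer modes named in the list above.
ANSWER_MODES = {"direct", "standard", "detailed"}


@dataclass
class QuestionAnalysis:
    """Hypothetical container for the fields extracted by the analysis call."""
    tickers: List[str]
    time_references: List[str]  # kept verbatim, e.g. "last 3 quarters"
    intent: str
    topic: str
    question_type: str          # e.g. "single", "multiple", "comparison"
    answer_mode: str = "standard"
    is_valid: bool = True

    def __post_init__(self) -> None:
        # Normalize "$aapl" -> "AAPL"; temporal phrases are left untouched.
        self.tickers = [t.lstrip("$").upper() for t in self.tickers]
        if self.answer_mode not in ANSWER_MODES:
            raise ValueError(f"unknown answer mode: {self.answer_mode!r}")
```

Keeping the time references as raw strings at this stage leaves their interpretation to the later search-planning step.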
Semantic Data Source Routing
Routes based on intent, not keywords.
Research Reasoning
Generates a 2-3 sentence research approach:
- Makes agent thinking transparent
- Guides evaluation (did we find what we planned to find?)
- Improves answer quality through structured research
Implementation Reference
This single LLM call replaces what used to be multiple sequential calls, significantly reducing latency.
Stage 2.1: Search Planning
SearchPlanner converts temporal references into concrete search plans.
Quarter Resolution (Company-Specific)
Each company gets its own most recent quarters (not global).
Examples:
- "latest" → get_last_n_quarters_for_company(ticker, 1)
- "last 3 quarters" → get_last_n_quarters_for_company(ticker, 3)
- "Q4 2024" → Specific quarter validation
Declarative Search Plan
Builds search plan for each data source:
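The quarter resolution and plan-building steps described in this stage can be sketched together as follows. This is a minimal illustration, not the SearchPlanner code: `get_last_n_quarters_for_company` is named in this document, but its body here, the `AVAILABLE_QUARTERS` data, the `"2024_Q4"` label format, and the helpers `resolve_quarters` and `build_search_plan` are all assumptions.

```python
import re
from typing import Dict, List

# Hypothetical availability data, newest quarter first. A real system would
# look this up in its database rather than hard-code it.
AVAILABLE_QUARTERS: Dict[str, List[str]] = {
    "AAPL": ["2024_Q4", "2024_Q3", "2024_Q2", "2024_Q1"],
    "MSFT": ["2025_Q1", "2024_Q4", "2024_Q3"],
}


def get_last_n_quarters_for_company(ticker: str, n: int) -> List[str]:
    """Return up to n of the most recent quarters for one company."""
    return AVAILABLE_QUARTERS.get(ticker, [])[:n]


def resolve_quarters(ticker: str, phrase: str) -> List[str]:
    """Turn a temporal phrase into concrete quarters for a single ticker."""
    text = phrase.strip().lower()
    if text == "latest":
        return get_last_n_quarters_for_company(ticker, 1)
    match = re.fullmatch(r"last (\d+) quarters?", text)
    if match:
        return get_last_n_quarters_for_company(ticker, int(match.group(1)))
    match = re.fullmatch(r"q([1-4]) (\d{4})", text)
    if match:
        quarter = f"{match.group(2)}_Q{match.group(1)}"
        # Specific quarter validation: keep it only if the company has it.
        return [quarter] if quarter in AVAILABLE_QUARTERS.get(ticker, []) else []
    raise ValueError(f"unsupported time reference: {phrase!r}")


def build_search_plan(tickers: List[str], phrase: str) -> Dict[str, List[str]]:
    """Declarative plan: each ticker maps to its own resolved quarters."""
    return {ticker: resolve_quarters(ticker, phrase) for ticker in tickers}
```

Because resolution happens per ticker, "latest" can mean different quarters for companies with different reporting calendars.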
Stage 2.5: News Search
Conditional execution: Only if needs_latest_news=true
Stage 2.6: SEC 10-K Retrieval Agent
Conditional execution: Only if data_source="10k" or needs_10k=true
Invokes specialized retrieval agent for SEC 10-K annual filings.
See SEC Agent for complete documentation of this stage.
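The conditions that gate the news and 10-K stages could be expressed as a small dispatch helper like the one below. The flag names `needs_latest_news`, `data_source`, and `needs_10k` come from this document; the function `optional_stages`, the stage labels it returns, and the use of a plain dict for the analysis result are assumptions made for illustration.

```python
from typing import Dict, List


def optional_stages(analysis: Dict[str, object]) -> List[str]:
    """Return the conditional stages that should run for this question."""
    stages: List[str] = []
    if analysis.get("needs_latest_news") is True:
        stages.append("news_search")  # Stage 2.5 (hypothetical label)
    if analysis.get("data_source") == "10k" or analysis.get("needs_10k") is True:
        stages.append("sec_10k_retrieval")  # Stage 2.6 (hypothetical label)
    return stages
```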
Key Features
Planning-Driven
Generates targeted sub-questions for retrieval
Section Routing
LLM-based routing to Item 1, Item 7, Item 8, etc.
Table Selection
LLM selects relevant tables from financial statements
Iterative Retrieval
Up to 5 iterations with self-evaluation
Flow Overview
Citation Format
Results are formatted with [10K1], [10K2] citation markers for source attribution.
Stage 3: Transcript Search
Hybrid vector + keyword search over earnings call transcripts.
- Single-Ticker
- Multi-Ticker
Direct search with quarter filtering:
Scoring:
Database Query
Stage 4: Initial Answer Generation
- Single Ticker
- Multiple Tickers
- Original question
- Research reasoning from Stage 2
- All retrieved chunks
- Citation instructions
Stage 5: Iterative Improvement
Self-reflection loop with configurable depth based on answer mode.
Iteration Loop Details
Evaluation Metrics
Scores (0-100 scale):
Does the answer fully address the question?
Does it include specific numbers, quotes, and details?
Is the information factually correct based on sources?
Is the response well-structured and easy to understand?
Weighted combination (0-1 scale)
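To make the weighted combination concrete, the sketch below maps four 0-100 scores onto a single 0-1 confidence value. The metric names (`completeness`, `specificity`, `accuracy`, `clarity`) are labels assumed here for the four questions above, and the weights are purely illustrative; the source does not state the actual names or weights.

```python
from typing import Dict

# Hypothetical weights; they must sum to 1 so the result stays in [0, 1].
METRIC_WEIGHTS: Dict[str, float] = {
    "completeness": 0.3,
    "specificity": 0.2,
    "accuracy": 0.3,
    "clarity": 0.2,
}


def overall_confidence(scores: Dict[str, float]) -> float:
    """Combine 0-100 metric scores into a 0-1 weighted confidence."""
    total = sum(METRIC_WEIGHTS[name] * scores[name] for name in METRIC_WEIGHTS)
    return total / 100.0
```

For example, scores of 80, 60, 100, and 80 in the order listed yield a confidence of roughly 0.82 under these assumed weights.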
Follow-Up Actions
During iteration, the agent can:
Generate Keyword Phrases
Search-optimized keywords (NOT verbose questions)
Example: "capex guidance 2025 AI allocation"
Not: "What guidance did they provide for capex..."
Request Transcript Search
needs_transcript_search: true
Searches ALL target quarters in parallel
Request News Search
needs_news_search: true
Fetches real-time news updates
Evaluate Progress
Check if reasoning goals are met
Termination Conditions
Answer Mode Configuration
| Mode | Max Iterations | Confidence Threshold | Use Case |
|---|---|---|---|
| direct | 2 | 70% | “What was Q4 revenue?” |
| standard | 3 | 80% | “Explain cloud strategy” |
| detailed | 4 | 90% | “Analyze margin trends” |
| deep_search | 10 | 95% | Reserved for future use |
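The table can be read as a lookup that drives loop termination: stop once confidence reaches the mode's threshold or the iteration budget is spent. The following sketch encodes the table values directly; the names `ModeConfig`, `MODE_CONFIG`, and `should_stop` are hypothetical, and the real loop may apply additional termination conditions.

```python
from typing import Dict, NamedTuple


class ModeConfig(NamedTuple):
    max_iterations: int
    confidence_threshold: float  # 0-1 scale


# Values copied from the Answer Mode Configuration table.
MODE_CONFIG: Dict[str, ModeConfig] = {
    "direct": ModeConfig(2, 0.70),
    "standard": ModeConfig(3, 0.80),
    "detailed": ModeConfig(4, 0.90),
    "deep_search": ModeConfig(10, 0.95),
}


def should_stop(mode: str, completed_iterations: int, confidence: float) -> bool:
    """True when the answer is confident enough or the budget is exhausted."""
    config = MODE_CONFIG[mode]
    return (confidence >= config.confidence_threshold
            or completed_iterations >= config.max_iterations)
```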
Stage 6: Final Response Assembly
Include source attributions
- Transcript citations: [1], [2]
- 10-K citations: [10K1], [10K2]
- News citations: [N1], [N2]
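A minimal sketch of how the three citation families might be assigned is shown below. The marker formats are the ones listed above; the function `label_sources` and the representation of each source as a plain string are assumptions.

```python
from typing import List, Tuple


def label_sources(transcripts: List[str], filings: List[str],
                  news: List[str]) -> List[Tuple[str, str]]:
    """Pair each source with a marker: [1], [10K1], [N1], numbered per family."""
    labeled: List[Tuple[str, str]] = []
    for prefix, items in (("", transcripts), ("10K", filings), ("N", news)):
        for index, text in enumerate(items, start=1):
            labeled.append((f"[{prefix}{index}]", text))
    return labeled
```

Numbering restarts within each family, so a reader can tell the source type from the marker alone.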
Performance Optimization
Parallel Execution
Multiple independent operations run concurrently:
- Multi-ticker searches (one per company)
- 10-K sub-question searches (6 workers)
- Quarter searches (all target quarters)
- Follow-up keyword phrase searches
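The fan-out pattern behind these parallel searches can be sketched with a thread pool. This assumes the searches are I/O-bound and that threads are the concurrency mechanism, which the source does not confirm; `search_ticker` is a stand-in for a real search call, and the default of 6 workers simply echoes the figure quoted for 10-K sub-question searches.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def search_ticker(ticker: str) -> List[str]:
    """Stand-in for a real per-company search; returns fake chunks."""
    return [f"{ticker}: chunk"]


def search_all(keys: List[str], search_fn: Callable[[str], List[str]],
               max_workers: int = 6) -> Dict[str, List[str]]:
    """Run one search per key concurrently and collect results by key."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each key correctly.
        results = list(pool.map(search_fn, keys))
    return dict(zip(keys, results))
```

The same helper shape would apply whether the keys are tickers, quarters, or follow-up keyword phrases.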
Strategic Caching
- Embedding cache for frequent queries
- Quarter availability cache (30 min TTL)
- LLM response caching for identical questions
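A time-to-live cache like the 30-minute quarter availability cache could look roughly like this. The class `TTLCache` is a hypothetical in-memory sketch; the actual caching layer may be backed by a different store. The clock is injectable so expiry can be exercised without waiting.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 30 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[Hashable, Tuple[Any, float]] = {}

    def set(self, key: Hashable, value: Any) -> None:
        self._store[key] = (value, self._clock() + self._ttl)

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # drop the stale entry on read
            return None
        return value
```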
Early Termination
- Stop iteration when confidence ≥ threshold
- 10-K agent stops at 90% quality (avg 2.4 iterations vs max 5)
- Avoid unnecessary searches when answer is complete
Smart Deduplication
- Deduplicate chunks by citation marker
- Avoid retrieving same content multiple times
- Merge overlapping context windows
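Deduplicating by citation marker can be illustrated with an order-preserving filter. The chunk representation (a dict with a `citation` key) and the function name `dedupe_chunks` are assumptions; merging overlapping context windows is not covered by this sketch.

```python
from typing import Dict, List


def dedupe_chunks(chunks: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first chunk seen for each citation marker, preserving order."""
    seen = set()
    unique: List[Dict[str, str]] = []
    for chunk in chunks:
        marker = chunk["citation"]
        if marker in seen:
            continue
        seen.add(marker)
        unique.append(chunk)
    return unique
```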
Next Steps
Iterative Improvement
Deep dive into self-reflection and evaluation
SEC Agent
Learn about the specialized 10-K retrieval agent