Directory Layout
Finance Agent is organized into distinct modules for AI agent logic, API services, and frontend presentation.
Core Modules
Agent System (agent/)
The heart of Finance Agent. This module implements Retrieval-Augmented Generation (RAG) with semantic data source routing, research planning, and iterative self-improvement.
Key Components:
- __init__.py - Public API exports (Agent, RAGAgent, create_agent())
- agent_config.py - Agent configuration and iteration settings
- prompts.py - Centralized LLM prompt templates (including planning)
- llm/ - Unified LLM client supporting OpenAI and Cerebras providers
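The iterative self-improvement mentioned above can be pictured as a generate, evaluate, retry loop. The following is only a minimal sketch under stated assumptions: `improve_iteratively`, `Attempt`, the `quality_threshold` and `max_iterations` defaults, and the two stand-in functions are hypothetical names chosen for illustration, not the actual contents of agent_config.py or the agent's API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Attempt:
    answer: str
    score: float  # quality score in [0, 1] returned by the evaluator


def improve_iteratively(
    question: str,
    generate: Callable[[str, Optional[str]], str],
    evaluate: Callable[[str, str], float],
    quality_threshold: float = 0.8,  # hypothetical default
    max_iterations: int = 3,         # hypothetical default
) -> List[Attempt]:
    """Generate an answer, score it, and retry until it is good enough."""
    attempts: List[Attempt] = []
    previous: Optional[str] = None
    for _ in range(max_iterations):
        answer = generate(question, previous)
        score = evaluate(question, answer)
        attempts.append(Attempt(answer, score))
        if score >= quality_threshold:
            break  # quality bar reached, stop iterating
        previous = answer  # feed the weak draft into the next round
    return attempts


# Stand-ins for the LLM calls; a real agent would call a model here.
def fake_generate(question: str, previous: Optional[str]) -> str:
    return "draft" if previous is None else previous + "+"


def fake_evaluate(question: str, answer: str) -> float:
    return min(1.0, 0.5 + 0.2 * answer.count("+"))


history = improve_iteratively("What was revenue growth?", fake_generate, fake_evaluate)
for attempt in history:
    print(attempt.answer, attempt.score)
```

Keeping every attempt, rather than only the last one, makes it easy to report progress per iteration or to fall back to the best-scoring draft if the threshold is never reached.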
RAG Implementation (agent/rag/)
The RAG pipeline orchestrates retrieval from multiple data sources and iterative answer improvement.
Core Files:
- rag_agent.py - Orchestration engine with 6-stage pipeline
- question_analyzer.py - LLM-based semantic routing to data sources
- response_generator.py - LLM response generation, evaluation, planning
- search_planner.py - Search plan generation and temporal reference resolution
- rag_flow_context.py - Flow context dataclass for pipeline state
- search_engine.py - Hybrid vector + keyword search for earnings transcripts
- sec_filings_service_smart_parallel.py - Planning-driven parallel retrieval for SEC 10-K filings
- tavily_service.py - Real-time news via Tavily API
- earnings_transcript_service.py - Dedicated earnings transcript retrieval agent
- database_manager.py - PostgreSQL/pgvector operations
- conversation_memory.py - Multi-turn conversation state
- config.py - RAG configuration (models, weights, thresholds)
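search_engine.py is described as a hybrid vector + keyword search. As a rough illustration of that idea, the sketch below blends cosine similarity with a simple keyword-overlap score using two weights. It runs in memory on toy data; the function names, the 0.7/0.3 weights, and the chunk layout are assumptions for illustration, and the real module presumably delegates the vector part to PostgreSQL/pgvector.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms that appear in the text."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def hybrid_search(
    query: str,
    query_vec: Sequence[float],
    chunks: List[Dict],
    vector_weight: float = 0.7,   # hypothetical weight
    keyword_weight: float = 0.3,  # hypothetical weight
    top_k: int = 2,
) -> List[Tuple[float, str]]:
    """Rank chunks by a weighted sum of vector and keyword scores."""
    scored = []
    for chunk in chunks:
        score = (
            vector_weight * cosine_similarity(query_vec, chunk["embedding"])
            + keyword_weight * keyword_score(query, chunk["text"])
        )
        scored.append((score, chunk["text"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy two-dimensional embeddings standing in for real model output.
chunks = [
    {"text": "revenue grew 12 percent this quarter", "embedding": [1.0, 0.0]},
    {"text": "we opened a new office", "embedding": [0.0, 1.0]},
    {"text": "gross margin and revenue guidance", "embedding": [0.6, 0.8]},
]
results = hybrid_search("revenue growth", [1.0, 0.0], chunks)
for score, text in results:
    print(round(score, 2), text)
```

The weighted sum lets exact term matches (tickers, metric names) lift a chunk that is only moderately close in embedding space, which is the usual motivation for combining the two signals.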
Data Ingestion (agent/rag/data_ingestion/)
Pipelines for downloading and processing earnings transcripts and SEC 10-K filings into the vector database.
See agent/rag/data_ingestion/README.md for detailed ingestion instructions.
API Application (app/)
FastAPI server providing HTTP endpoints for chat, company search, transcript retrieval, and financial screening.
Structure:
- routers/ - API endpoint definitions
- schemas/ - Pydantic models for request/response validation
Endpoints:
- POST /message/stream-v2 - Chat with streaming RAG responses
- GET /companies/search - Search companies by ticker/name
- GET /transcript/{ticker}/{year}/{quarter} - Get specific earnings transcript
- POST /screener/query/stream - Natural language financial queries
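To show how the path parameters in GET /transcript/{ticker}/{year}/{quarter} fit together, here is a small helper that builds the request URL. It is a sketch only: `transcript_url` and `BASE_URL` are hypothetical, the localhost address is an assumed development default, and the expected format of `quarter` (for example 1 versus Q1) is not stated in this page, so the helper passes values through unchanged.

```python
from urllib.parse import quote

# Assumed local development address; replace with the real deployment URL.
BASE_URL = "http://localhost:8000"


def transcript_url(ticker, year, quarter, base_url: str = BASE_URL) -> str:
    """Fill the /transcript/{ticker}/{year}/{quarter} path template."""
    # Percent-encode each segment so unusual characters cannot break the path.
    segments = [quote(str(part), safe="") for part in (ticker, year, quarter)]
    return f"{base_url.rstrip('/')}/transcript/" + "/".join(segments)


print(transcript_url("AAPL", 2024, 1))
```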
Frontend (frontend/)
React + TypeScript single-page application with Tailwind CSS styling.
Technologies:
- React 19.2 with React Router for navigation
- TypeScript 5.9 for type safety
- Vite 7.2 for build tooling
- Tailwind CSS 4.1 for styling
- Clerk for authentication
- Framer Motion for animations
- React Markdown for formatted responses
Financial Screener (agent/screener/)
Natural language query interface for company fundamentals (in development).
Documentation
| Document | Description |
|---|---|
| agent/README.md | Complete agent architecture, pipeline stages, semantic routing, iterative self-improvement |
| docs/SEC_AGENT.md | SEC 10-K agent: planning-driven retrieval, 91% accuracy on FinanceBench |
| agent/rag/data_ingestion/README.md | Data ingestion pipelines for transcripts and SEC filings |
Configuration Files
- .env - Environment variables (API keys, database URLs, auth settings)
- requirements.txt - Python dependencies
- frontend/package.json - Frontend dependencies
- agent/agent_config.py - Agent iteration and quality threshold settings
- agent/rag/config.py - RAG configuration (models, weights, search parameters)
Key Design Principles
- Modular Architecture - Clear separation between agent logic, API layer, and frontend
- Semantic Routing - Intent-based data source selection rather than keyword matching
- Iterative Improvement - Self-reflection loop with quality evaluation
- Multi-Source RAG - Combines earnings transcripts, SEC 10-K filings, and real-time news
- Parallel Processing - Multi-ticker queries execute in parallel for performance
- Streaming Events - Real-time progress updates for transparency
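The parallel processing principle can be illustrated with `asyncio.gather`, which runs one retrieval coroutine per ticker concurrently and returns results in input order. This is a sketch under assumptions: the page does not say which concurrency mechanism Finance Agent uses, and `retrieve_for_ticker` and `retrieve_all` are hypothetical stand-ins with a short sleep in place of real database or API calls.

```python
import asyncio
from typing import Dict, List


async def retrieve_for_ticker(ticker: str) -> List[str]:
    """Hypothetical per-ticker retrieval; the sleep stands in for I/O."""
    await asyncio.sleep(0.01)
    return [f"{ticker}: transcript chunk", f"{ticker}: 10-K chunk"]


async def retrieve_all(tickers: List[str]) -> Dict[str, List[str]]:
    """Run all per-ticker retrievals concurrently and key results by ticker."""
    # gather() preserves the order of its arguments in the returned list.
    results = await asyncio.gather(*(retrieve_for_ticker(t) for t in tickers))
    return dict(zip(tickers, results))


results = asyncio.run(retrieve_all(["AAPL", "MSFT", "NVDA"]))
for ticker, chunks in results.items():
    print(ticker, chunks)
```

Because the waits overlap, total latency for an I/O-bound multi-ticker query is close to that of the slowest single ticker rather than the sum of all of them.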
Next Steps
- Tech Stack - Learn about the technologies powering Finance Agent
- Database Schema - Explore the PostgreSQL schema and vector embeddings