# System Architecture
The platform consists of three primary layers:

- **Frontend Layer**: React 18 with TypeScript and Material-UI v7
- **Backend Layer**: FastAPI-powered research engine with async processing
- **Data Sources**: Multi-source aggregation (Google, News, Jobs)
## High-Level Architecture Diagram

## Core Design Principles

### 1. Asynchronous Processing
The entire backend is built on Python’s `asyncio` framework, enabling:
- Concurrent API requests to multiple data sources
- Non-blocking I/O operations for database and cache access
- Parallel research execution for multiple companies simultaneously
- Real-time streaming of results via Server-Sent Events (SSE)
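This bounded concurrency can be sketched with an `asyncio.Semaphore`. The sketch below is illustrative, not the actual implementation; `search_source` and the source names are hypothetical stand-ins:

```python
import asyncio

# Illustrative sketch: a semaphore bounds how many searches run at once,
# mirroring the max_parallel_searches setting described below.
MAX_PARALLEL_SEARCHES = 20

async def search_source(source: str, company: str, sem: asyncio.Semaphore) -> dict:
    async with sem:  # at most MAX_PARALLEL_SEARCHES coroutines enter here
        await asyncio.sleep(0)  # stand-in for a non-blocking HTTP call
        return {"source": source, "company": company}

async def run_searches(companies: list[str]) -> list[dict]:
    sem = asyncio.Semaphore(MAX_PARALLEL_SEARCHES)
    tasks = [
        search_source(src, company, sem)
        for company in companies
        for src in ("google", "news", "jobs")
    ]
    # gather() runs every search concurrently, subject to the semaphore cap
    return await asyncio.gather(*tasks)

results = asyncio.run(run_searches(["Acme Corp", "Globex"]))
```

Because all I/O is awaited rather than blocked on, a single worker can interleave many in-flight searches across companies.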
The system can execute up to 20 searches in parallel, configurable per request via `max_parallel_searches`.

### 2. Separation of Concerns
The architecture follows clean separation patterns:

| Layer | Responsibility | Technology |
|---|---|---|
| Presentation | User interface and interaction | React, Material-UI |
| API | Request routing and validation | FastAPI |
| Business Logic | Research pipeline orchestration | Python async/await |
| Data Access | Multi-source data retrieval | HTTP clients, APIs |
| Infrastructure | Caching, metrics, monitoring | Redis, circuit breakers |
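The split can be sketched in miniature. The names below are hypothetical, and plain dataclasses stand in for the FastAPI/Pydantic models the real API layer uses:

```python
from dataclasses import dataclass

# Illustrative sketch of the layer split; not the actual service code.

@dataclass
class ResearchRequest:  # API layer: validated request model
    companies: list[str]
    max_parallel_searches: int = 20

def fetch_evidence(company: str) -> dict:  # data-access layer
    return {"company": company, "evidence": []}

def run_pipeline(req: ResearchRequest) -> list[dict]:  # business-logic layer
    if not req.companies:
        raise ValueError("companies must be non-empty")
    return [fetch_evidence(c) for c in req.companies]

reports = run_pipeline(ResearchRequest(companies=["Acme Corp", "Globex"]))
```

The route handler only validates and delegates; orchestration and retrieval never leak into the presentation or API layers.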
### 3. AI-Powered Intelligence

The system leverages AI at multiple stages of the pipeline, from query generation through to result analysis.

### 4. Resilience & Reliability

Built-in patterns for production reliability include rate limiting, caching, and circuit breakers, detailed under Scalability Considerations below.

## Technology Stack
### Backend Stack
- Framework: FastAPI 0.100+ (Python 3.11+)
- Async Runtime: asyncio with uvicorn
- Serialization: ORJSON for high-performance JSON
- Caching: Redis for result caching and session management
- AI: Gemini 2.5 Flash for query generation and analysis
- Validation: Pydantic v2 for request/response models
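Result caching typically follows a cache-aside pattern. A minimal sketch, with a plain dict (plus TTL) standing in for Redis and a hypothetical `research` callable:

```python
import hashlib
import time

# Hypothetical cache-aside sketch; a real deployment would use a Redis
# client here. TTL and key scheme are illustrative assumptions.
_cache: dict[str, tuple[float, dict]] = {}
TTL_SECONDS = 300.0

def _key(company: str) -> str:
    return hashlib.sha256(company.lower().encode()).hexdigest()

def get_or_research(company: str, research) -> dict:
    key = _key(company)
    hit = _cache.get(key)
    if hit is not None and time.monotonic() - hit[0] < TTL_SECONDS:
        return hit[1]            # fresh cache hit: skip upstream API calls
    result = research(company)   # cache miss: do the work, then store it
    _cache[key] = (time.monotonic(), result)
    return result
```

Repeated research on the same company within the TTL is served from cache, reducing redundant calls against rate-limited sources.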
### Frontend Stack
- Framework: React 18 with TypeScript
- UI Library: Material-UI (MUI) v7
- Build Tool: Vite 5.x
- HTTP Client: Fetch API with streaming support
- State Management: React hooks (useState, custom hooks)
### Infrastructure
- API Protocol: REST with SSE for streaming
- CORS: Configured for cross-origin requests
- Monitoring: Performance metrics and error tracking
- Deployment: Docker-ready with health checks
## Request Flow
Here’s how a typical research request flows through the system:

For real-time updates, use the `/research/batch/stream` endpoint, which sends incremental results via Server-Sent Events.

## Performance Characteristics
### Typical Research Performance
- Query Generation: 0.5-1.5 seconds
- Evidence Collection: 5-15 seconds (depends on parallelism)
- AI Analysis: 2-5 seconds per company
- Total Processing Time: 10-25 seconds for 3-5 companies
## Scalability Considerations

- **Horizontal Scaling**: FastAPI workers can scale independently behind a load balancer
- **Rate Limiting**: Per-source semaphores prevent API quota exhaustion
- **Caching**: Redis caching reduces redundant API calls
- **Circuit Breakers**: Automatic failover when sources become unavailable
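A circuit breaker can be sketched in a few lines. This is an illustrative implementation, not the service's actual one; the threshold and cooldown values are assumptions:

```python
import time

# Sketch: after `threshold` consecutive failures the breaker opens and the
# source is skipped until `cooldown` elapses, then one probe is allowed.
class CircuitBreaker:
    def __init__(self, threshold: int = 3, cooldown: float = 30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at: float | None = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown:
            self.opened_at = None  # half-open: let one request probe
            self.failures = 0
            return True
        return False  # open: fail over to other sources

    def record(self, ok: bool) -> None:
        if ok:
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
```

Each data source gets its own breaker, so one failing API degrades gracefully without stalling the whole research pipeline.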
## Directory Structure

### Backend Structure

### Frontend Structure

## Configuration
Key configuration settings are defined in `backend/app/core/config.py`:
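A minimal sketch of what such a settings module might look like. Every setting name and default below is an illustrative assumption, not the module's actual contents, and a plain dataclass stands in for the Pydantic settings model the real code likely uses:

```python
import os
from dataclasses import dataclass, field

# Illustrative sketch only — names and defaults are assumptions, not the
# actual contents of backend/app/core/config.py.
@dataclass
class Settings:
    redis_url: str = os.environ.get("REDIS_URL", "redis://localhost:6379/0")
    max_parallel_searches: int = int(os.environ.get("MAX_PARALLEL_SEARCHES", "20"))
    gemini_model: str = os.environ.get("GEMINI_MODEL", "gemini-2.5-flash")
    cors_origins: list[str] = field(
        default_factory=lambda: os.environ.get("CORS_ORIGINS", "*").split(",")
    )

settings = Settings()  # read once at import time, shared across the app
```

Reading values from the environment keeps the Docker deployment configurable without code changes.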
## Next Steps

- **Data Flow**: Learn how data flows through the system
- **Backend Architecture**: Deep dive into backend components
- **Frontend Architecture**: Explore frontend structure and patterns
- **API Reference**: View complete API documentation