Architecture Overview
The engine uses three core performance components:

- Token Bucket Rate Limiter: smooths API request distribution with burst capacity
- Redis Deduplication: prevents duplicate evidence across sources
- Circuit Breaker: automatic protection when a source fails
Parallel Processing Configuration
Source Pool Concurrency
The pipeline creates a separate semaphore pool for each data source to maximize parallelization (see backend/app/services/pipeline.py).
Performance Impact: Each source pool runs up to max_parallel_searches queries simultaneously. With 3 sources and max_parallel_searches=20, the engine can process 60 concurrent requests.
Optimizing Concurrency Levels
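The per-source pool pattern can be sketched with asyncio semaphores; the SOURCES list, max_parallel_searches value, and run_query helper below are illustrative stand-ins, not the engine's actual code:

```python
import asyncio

SOURCES = ["tavily", "gemini", "newsapi"]   # illustrative source names
max_parallel_searches = 20

async def run_query(source: str, query: str, sem: asyncio.Semaphore) -> str:
    async with sem:                  # caps in-flight requests per source
        await asyncio.sleep(0)       # stand-in for the real API call
        return f"{source}:{query}"

async def run_all(queries: list[str]) -> list[str]:
    # One semaphore per source, so a slow or throttled source
    # never starves the other pools.
    pools = {s: asyncio.Semaphore(max_parallel_searches) for s in SOURCES}
    tasks = [run_query(s, q, pools[s]) for s in SOURCES for q in queries]
    return await asyncio.gather(*tasks)

results = asyncio.run(run_all(["q1", "q2"]))
```

Because each source owns its own semaphore, raising one source's limit scales that pool independently of the others.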
Monitor API Rate Limits
Check your API quotas:
- Tavily: 500 RPM (requests per minute)
- Gemini: 2000 RPM
- NewsAPI: 300 RPM (free tier: 1,000/day)
Rate Limiting Strategy
Token Bucket Algorithm
The engine uses a token bucket rate limiter for smooth request distribution (see backend/app/decorators/api_rate_limiter.py).
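A minimal token-bucket sketch is shown below; the real limiter in backend/app/decorators/api_rate_limiter.py may differ in naming and in how it waits for tokens:

```python
import time

class TokenBucket:
    """Refills tokens continuously at rpm/60 per second, up to a burst cap."""

    def __init__(self, rpm: int, burst: int):
        self.rate = rpm / 60.0        # tokens refilled per second
        self.capacity = burst         # burst headroom
        self.tokens = float(burst)
        self.last = time.monotonic()

    def try_acquire(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rpm=500, burst=10)   # e.g. Tavily's 500 RPM quota
allowed = [bucket.try_acquire() for _ in range(12)]
```

With a burst of 10, the first ten acquisitions succeed immediately; the rest are throttled until tokens refill at the steady 500-per-minute rate.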
Configuring API-Specific Rate Limits
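One bucket per provider can be driven from settings; the holder below is a hypothetical sketch, using the tavily_rpm, gemini_rpm, and newsapi_rpm setting names mentioned later in this guide with the quotas listed above:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RateLimitSettings:
    # Illustrative defaults matching the quotas listed above;
    # the real values live in the engine's settings.
    tavily_rpm: int = 500
    gemini_rpm: int = 2000
    newsapi_rpm: int = 300

settings = RateLimitSettings()
limits = {
    "tavily": settings.tavily_rpm,
    "gemini": settings.gemini_rpm,
    "newsapi": settings.newsapi_rpm,
}
```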
Redis Caching Strategy
Evidence Deduplication
Redis prevents duplicate evidence from being processed across multiple sources (see backend/app/db/redis_cache.py).
Cache Performance Benefits
Cache Hit Scenarios
- URL Deduplication: Same URL found by multiple sources
- Title Matching: Articles with identical titles from different outlets
- Content Similarity: Duplicate snippets across sources
Performance Gain: Redis set membership checks are O(1), making deduplication extremely fast even with thousands of evidence items.
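The O(1) dedup check maps onto Redis's SADD, which returns 1 for a new member and 0 for one already in the set. The sketch below uses an in-memory stand-in with the same SADD semantics so it runs without a server; the key pattern is an assumption, not the engine's actual schema:

```python
import hashlib

class FakeRedisSet:
    """In-memory stand-in mirroring Redis SADD semantics; swap in a real
    redis.Redis client in practice."""
    def __init__(self):
        self._sets: dict = {}

    def sadd(self, key: str, member: str) -> int:
        s = self._sets.setdefault(key, set())
        if member in s:
            return 0        # already present
        s.add(member)
        return 1            # newly added

r = FakeRedisSet()

def is_duplicate(domain: str, url: str) -> bool:
    # Hypothetical key layout: one set of URL hashes per domain.
    key = f"evidence:{domain}:urls"
    digest = hashlib.sha256(url.encode()).hexdigest()
    return r.sadd(key, digest) == 0

first = is_duplicate("example.com", "https://news.example/a")
second = is_duplicate("example.com", "https://news.example/a")
```

Hashing the URL keeps set members a fixed size, and the membership check stays O(1) regardless of how many evidence items accumulate.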
Cache Cleanup Strategy
The pipeline clears the cache per domain before analysis (see backend/app/services/pipeline.py). This ensures fresh data for each research run.
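A per-domain cleanup might look like the helper below; the `evidence:<domain>:*` key pattern is an assumption, and the in-memory stand-in exists only so the sketch runs without a Redis server:

```python
import fnmatch

class FakeRedis:
    """Minimal stand-in; swap in redis.Redis(host="localhost", port=6379)
    in practice."""
    def __init__(self, keys):
        self.store = set(keys)

    def scan_iter(self, match):
        # Like Redis SCAN, yields keys matching a glob pattern.
        return [k for k in self.store if fnmatch.fnmatch(k, match)]

    def delete(self, key):
        self.store.discard(key)
        return 1

def clear_domain_cache(r, domain: str) -> int:
    """Delete every cache key scoped to one domain, return the count."""
    deleted = 0
    # SCAN-based iteration avoids blocking Redis the way KEYS would.
    for key in r.scan_iter(match=f"evidence:{domain}:*"):
        deleted += r.delete(key)
    return deleted

r = FakeRedis({"evidence:acme.com:urls", "evidence:acme.com:titles",
               "evidence:other.com:urls"})
removed = clear_domain_cache(r, "acme.com")   # → 2
```

Scoping keys by domain is what makes this cleanup cheap: only the current research target is invalidated, other domains' entries survive.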
Circuit Breaker Protection
Automatic Failure Handling
Circuit breakers protect against cascading failures when API sources become unreliable (see backend/app/core/circuit_breaker.py).
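The classic closed → open → half-open pattern can be sketched as follows; thresholds and naming are assumptions, not the engine's actual implementation:

```python
import time

class CircuitBreaker:
    """Opens after N consecutive failures, then allows a trial request
    once reset_seconds have elapsed."""

    def __init__(self, failure_threshold: int = 5, reset_seconds: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_seconds = reset_seconds
        self.failures = 0
        self.opened_at = None

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if time.monotonic() - self.opened_at >= self.reset_seconds:
            return "half_open"    # let one trial request through
        return "open"

    def allow_request(self) -> bool:
        return self.state != "open"

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic()

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

breaker = CircuitBreaker(failure_threshold=3, reset_seconds=0.1)
for _ in range(3):
    breaker.record_failure()
# breaker is now open and rejects requests until reset_seconds elapse
```

While open, the breaker fails fast instead of queueing doomed requests against a broken source; a successful trial request in the half-open state closes it again.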
Circuit Breaker States
Configuring Circuit Breakers
backend/app/core/config.py
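A hypothetical shape for the breaker settings, using the circuit_breaker_failures and circuit_breaker_reset_seconds names referenced in the checklist below; the default values shown are assumptions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CircuitBreakerSettings:
    # Illustrative defaults, not the engine's real config values.
    circuit_breaker_failures: int = 5        # failures before the circuit opens
    circuit_breaker_reset_seconds: int = 30  # cool-down before a trial request

settings = CircuitBreakerSettings()
```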
Metrics and Monitoring
Pipeline Metrics
Track performance with the RunMetrics class (see backend/app/core/metrics.py).
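A hypothetical shape for RunMetrics is sketched below; the real class in backend/app/core/metrics.py may track different fields:

```python
from dataclasses import dataclass

@dataclass
class RunMetrics:
    # Illustrative counters for one research run.
    queries_issued: int = 0
    cache_hits: int = 0
    breaker_trips: int = 0
    elapsed_seconds: float = 0.0

    @property
    def qps(self) -> float:
        """Queries per second over the whole run."""
        if not self.elapsed_seconds:
            return 0.0
        return self.queries_issued / self.elapsed_seconds

m = RunMetrics(queries_issued=150, elapsed_seconds=3.0)
# m.qps == 50.0
```

Deriving QPS from raw counters like this is what makes the benchmark figures below comparable across configurations.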
Performance Benchmarks
Standard Configuration
- Domains: 10
- Parallel Searches: 20
- Search Depth: standard
- Processing Time: ~3-5 seconds
- Queries/Second: ~40-60 QPS
High-Performance Configuration
- Domains: 50
- Parallel Searches: 30
- Search Depth: comprehensive
- Processing Time: ~8-12 seconds
- Queries/Second: ~100-150 QPS
Optimization Checklist
API Configuration
- Set an appropriate max_parallel_searches for your API limits
- Configure tavily_rpm, gemini_rpm, and newsapi_rpm in settings
- Enable retry logic with exponential backoff
- Monitor rate limit errors in logs
Caching
- Redis running on localhost:6379
- Monitor cache hit rates
- Clear cache between research runs if needed
- Consider Redis persistence for production
Circuit Breakers
- Set circuit_breaker_failures based on source reliability
- Configure circuit_breaker_reset_seconds for quick recovery
- Monitor breaker state changes in metrics
- Implement alerting for open circuits
Parallel Processing
- Use separate source pools for independent scaling
- Leverage async/await for I/O-bound operations
- Run CPU-intensive tasks (TF-IDF) in thread pools
- Monitor asyncio task completion rates
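The thread-pool item above can be sketched with asyncio.to_thread, which pushes a CPU-bound step onto a worker thread while I/O keeps flowing on the event loop; score_evidence is an illustrative stand-in, not the engine's code:

```python
import asyncio

def score_evidence(snippets: list[str]) -> int:
    # Stand-in for CPU-heavy work such as TF-IDF scoring.
    return sum(len(s.split()) for s in snippets)

async def main() -> int:
    io_task = asyncio.sleep(0)   # I/O continues on the event loop...
    # ...while the CPU-bound scoring runs in the default thread pool.
    cpu_task = asyncio.to_thread(score_evidence, ["a b c", "d e"])
    _, score = await asyncio.gather(io_task, cpu_task)
    return score

score = asyncio.run(main())   # → 5
```

Without the thread hop, a long scoring pass would block every pending API callback until it finished.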
Advanced Performance Tuning
TF-IDF Job Matching Optimization
The jobs search source uses scikit-learn for semantic matching (see backend/app/sources/jobs_search.py).
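A minimal sketch of the scikit-learn matching step: vectorize the profile alongside the postings, then rank postings by cosine similarity. The profile text and postings are illustrative:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

profile = "senior python backend engineer asyncio redis"
postings = [
    "Backend engineer: Python, asyncio, Redis pipelines",
    "Frontend developer: React and TypeScript",
]

vectorizer = TfidfVectorizer(stop_words="english")
# Fit profile and postings together so they share one vocabulary.
matrix = vectorizer.fit_transform([profile] + postings)

# Row 0 is the profile; compare it against every posting.
scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()
best = postings[scores.argmax()]
```

Fitting one shared vocabulary is the key detail: vectorizing the profile and postings separately would produce incomparable feature spaces.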
Search Depth Optimization
Balance result quality vs. API usage:

| Search Depth | Results per Source | Total API Calls (10 domains, 5 strategies) |
|---|---|---|
| quick | 2 | 100 calls |
| standard | 3 | 150 calls |
| comprehensive | 5 | 250 calls |
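One consistent reading of these figures is that total API calls scale as domains × strategies × results per source, which reproduces the table:

```python
# Back-of-the-envelope check of the search-depth table.
DOMAINS, STRATEGIES = 10, 5
results_per_source = {"quick": 2, "standard": 3, "comprehensive": 5}

total_calls = {depth: DOMAINS * STRATEGIES * n
               for depth, n in results_per_source.items()}
# {'quick': 100, 'standard': 150, 'comprehensive': 250}
```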
The search_depth parameter controls both Tavily's search quality and the number of results returned per query.
Troubleshooting Performance Issues
Slow Research Runs
High API Costs
Next Steps
Custom Sources
Build custom data sources for proprietary APIs
Troubleshooting
Debug common issues and errors