Overview
The routing system analyzes incoming requests and assigns a complexity score (0-100+) based on multiple factors. This score determines which tier of model to use:
- Low complexity (0-10): Route to fast tier (Haiku, GPT-4o mini)
- Medium complexity (11-40): Route to smart tier (Sonnet, GPT-4o)
- High complexity (41+): Route to frontier tier (Opus, o3)
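The tier boundaries above can be sketched as a small lookup. This is a minimal illustration, not the router's actual code: the names `Tier`, `TIER_MODELS`, and `selectTier` are assumptions introduced here.

```typescript
// Tier names follow the overview; the type and function names are illustrative.
type Tier = "fast" | "smart" | "frontier";

// Example models per tier, as listed in the overview.
const TIER_MODELS: Record<Tier, string[]> = {
  fast: ["Haiku", "GPT-4o mini"],
  smart: ["Sonnet", "GPT-4o"],
  frontier: ["Opus", "o3"],
};

// Map a complexity score to a tier using the documented boundaries:
// 0-10 -> fast, 11-40 -> smart, 41+ -> frontier.
function selectTier(score: number): Tier {
  if (score <= 10) return "fast";
  if (score <= 40) return "smart";
  return "frontier";
}
```

Because the score range is open-ended (0-100+), the frontier branch is the fallthrough rather than an explicit upper bound.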
Architecture
Routing is implemented in two layers:
- TypeScript Router (src/llm-router.ts)
- Rust Router (crates/llm-router/src/main.rs)
Complexity Scoring
The system analyzes multiple dimensions to calculate complexity:
- Message Length
- Code Detection
- Keyword Analysis
- Tool Count
- Conversation History
- Final Normalization
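To make the dimensions concrete, here is a rough sketch of how such a score could be assembled. Every weight, cap, keyword, and field name below is an assumption chosen for illustration; none of them are taken from src/llm-router.ts.

```typescript
// All weights, keywords, and field names are illustrative assumptions.
interface RoutingRequest {
  message: string;
  toolCount: number;
  historyLength: number; // number of prior messages in the conversation
}

const COMPLEX_KEYWORDS = ["refactor", "architecture", "optimize", "debug"];

function scoreComplexity(req: RoutingRequest): number {
  let score = 0;

  // Message length: one point per 200 characters, capped at 20.
  score += Math.min(20, Math.floor(req.message.length / 200));

  // Code detection: common code tokens add a fixed bonus.
  if (/\b(function|class|def|import)\b|=>/.test(req.message)) score += 15;

  // Keyword analysis: each matching keyword adds points.
  const lower = req.message.toLowerCase();
  for (const kw of COMPLEX_KEYWORDS) {
    if (lower.includes(kw)) score += 10;
  }

  // Tool count: more available tools suggests a more involved task.
  score += req.toolCount * 3;

  // Conversation history: long threads carry more context, capped at 10.
  score += Math.min(10, Math.floor(req.historyLength / 5));

  // Final normalization: round and clamp at zero; the top end stays open ("100+").
  return Math.max(0, Math.round(score));
}
```

Capping the length and history contributions keeps a single long input from pushing an otherwise simple request into the frontier tier; whether the real router does this is not stated here.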
Scoring Examples
Low Complexity (0-10) → Haiku/Mini
Medium Complexity (11-40) → Sonnet/GPT-4o
High Complexity (41+) → Opus/o3
Manual Override
You can bypass routing and specify a model directly.
Cost Optimization
The routing system optimizes costs by:
- Avoiding over-provisioning: Simple tasks use fast/cheap models
- Automatic scaling: Complex tasks get frontier models
- Usage tracking: All calls tracked with cost attribution
Cost Comparison Example
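One way to reason about the savings is to price a traffic mix under routing against the same traffic sent entirely to the frontier tier. The sketch below uses placeholder prices and a hypothetical traffic split; neither reflects real provider pricing or measured usage, and all names are assumptions.

```typescript
// Placeholder prices in USD per million tokens. These are NOT real provider
// prices; substitute the current rates for the models you use.
type Tier = "fast" | "smart" | "frontier";

const PRICE_PER_MTOK: Record<Tier, number> = {
  fast: 1,
  smart: 10,
  frontier: 50,
};

interface Workload {
  tier: Tier;
  requests: number;
  tokensPerRequest: number;
}

// Total cost of a workload mix under the placeholder prices.
function totalCost(workloads: Workload[]): number {
  return workloads.reduce(
    (sum, w) => sum + (w.requests * w.tokensPerRequest * PRICE_PER_MTOK[w.tier]) / 1e6,
    0
  );
}

// Hypothetical traffic: 1,000 requests of 1,000 tokens each.
const routed: Workload[] = [
  { tier: "fast", requests: 700, tokensPerRequest: 1000 },
  { tier: "smart", requests: 250, tokensPerRequest: 1000 },
  { tier: "frontier", requests: 50, tokensPerRequest: 1000 },
];
const allFrontier: Workload[] = [
  { tier: "frontier", requests: 1000, tokensPerRequest: 1000 },
];

const routedCost = totalCost(routed);
const frontierOnlyCost = totalCost(allFrontier);
```

With these placeholder numbers the routed mix costs about 5.70 against 50.00 for frontier-only traffic. The real ratio depends entirely on actual prices and on how your requests distribute across tiers.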
Fallback & Retry
The completion system includes automatic retry logic:
- Attempt 1: Immediate
- Attempt 2: Wait 2s (if 429 rate limit)
- Attempt 3: Wait 4s (if 429 rate limit)
- Failure: Throw last error
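The retry schedule above can be sketched as follows. This is a minimal illustration under two assumptions: that a failed call throws an error carrying a numeric `status` field, and that errors other than 429 are rethrown without further attempts. The names `withRetry`, `backoffMs`, and `HttpError` are hypothetical.

```typescript
// Minimal error shape: assumes the thrown error exposes an HTTP status code.
interface HttpError extends Error {
  status?: number;
}

const MAX_ATTEMPTS = 3;

// Delay before the given attempt (1-based): none, then 2s, then 4s.
function backoffMs(attempt: number): number {
  return attempt <= 1 ? 0 : 2000 * 2 ** (attempt - 2);
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Run `call`, retrying only after 429 responses, and rethrow the last error
// once attempts are exhausted. `sleep` is injectable so tests can skip real waits.
async function withRetry<T>(
  call: () => Promise<T>,
  sleep: (ms: number) => Promise<void> = defaultSleep
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    if (attempt > 1) await sleep(backoffMs(attempt));
    try {
      return await call();
    } catch (err) {
      lastError = err;
      // Assumption: only rate-limit errors are worth another attempt.
      if ((err as HttpError).status !== 429) break;
    }
  }
  throw lastError;
}
```

Passing the sleep function as a parameter keeps the backoff schedule testable without waiting six seconds of wall-clock time.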
Provider Drivers
The Rust router supports multiple provider drivers. Each driver handles:
- Authentication (API keys, AWS credentials)
- Request formatting (messages, tools, system prompts)
- Response parsing (content, tool calls, usage)
- Error handling (rate limits, timeouts)
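The four responsibilities above suggest a common driver interface. The actual drivers live in the Rust crate and are not shown here, so the TypeScript sketch below only mirrors the listed responsibilities; every type, method name, and status code in it is hypothetical.

```typescript
// Hypothetical shapes; the real Rust driver trait may look quite different.
interface ChatMessage { role: "system" | "user" | "assistant"; content: string; }
interface Usage { inputTokens: number; outputTokens: number; }
interface DriverResponse { content: string; toolCalls: string[]; usage: Usage; }

interface ProviderDriver {
  // Authentication: produce request headers (API key, signed credentials, ...).
  authHeaders(): Record<string, string>;
  // Request formatting: turn messages and tools into a provider payload.
  formatRequest(messages: ChatMessage[], tools: string[]): unknown;
  // Response parsing: extract content, tool calls, and usage from the raw reply.
  parseResponse(raw: unknown): DriverResponse;
  // Error handling: decide whether a status code is retryable.
  isRetryable(status: number): boolean;
}

// A toy in-memory driver used only to show the interface in action.
const echoDriver: ProviderDriver = {
  authHeaders: () => ({ Authorization: "Bearer <api-key>" }),
  formatRequest: (messages, tools) => ({ messages, tools }),
  parseResponse: (raw) => {
    const body = raw as { text: string };
    return {
      content: body.text,
      toolCalls: [],
      usage: { inputTokens: 0, outputTokens: body.text.length },
    };
  },
  isRetryable: (status) => status === 429 || status === 408 || status === 504,
};
```

Keeping provider differences behind one interface means the routing layer only has to pick a tier and a driver, not know each provider's wire format.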
Usage Statistics
Best Practices
Trust the Router
Let complexity scoring choose the model for most use cases
Monitor Costs
Review usage stats regularly to identify optimization opportunities
Use Overrides Sparingly
Only override routing for specialized tasks (e.g., always use Sonar for search)
Tune for Your Domain
Adjust complexity thresholds based on your specific workload
Customizing Routing
You can customize the routing logic by:
1. Adjusting Thresholds
2. Adding Custom Scoring
3. Provider-Specific Routing
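The three customization points above could be combined in a single configuration object. The sketch below is an assumption about what such a surface might look like; `RouterConfig`, `route`, the scorer, and the provider strings are all hypothetical placeholders, not the router's real options.

```typescript
// Every name here is hypothetical; the router's actual configuration
// surface is not shown in this document.
type Tier = "fast" | "smart" | "frontier";
type Scorer = (message: string) => number;

interface RouterConfig {
  fastMax: number; // scores up to this value go to the fast tier
  smartMax: number; // scores up to this value go to the smart tier
  customScorers: Scorer[]; // extra points added on top of a base score
  providerByTier: Record<Tier, string>; // pin a provider per tier
}

// Defaults mirror the documented boundaries (0-10, 11-40, 41+).
const defaultConfig: RouterConfig = {
  fastMax: 10,
  smartMax: 40,
  customScorers: [],
  providerByTier: { fast: "anthropic", smart: "anthropic", frontier: "openai" },
};

function route(baseScore: number, message: string, cfg: RouterConfig = defaultConfig) {
  const extra = cfg.customScorers.reduce((sum, s) => sum + s(message), 0);
  const score = baseScore + extra;
  const tier: Tier =
    score <= cfg.fastMax ? "fast" : score <= cfg.smartMax ? "smart" : "frontier";
  return { score, tier, provider: cfg.providerByTier[tier] };
}

// Example: a domain-specific scorer plus a stricter fast-tier threshold.
const sqlScorer: Scorer = (m) => (/\bselect\b.+\bfrom\b/i.test(m) ? 20 : 0);
const tuned: RouterConfig = { ...defaultConfig, fastMax: 5, customScorers: [sqlScorer] };
```

For example, `route(8, "hello")` stays on the fast tier with the defaults, while the same call with `tuned` lands on the smart tier because the fast threshold was lowered to 5.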
Performance Metrics
Typical routing overhead:
- Complexity scoring: Less than 1ms
- Model selection: Less than 1ms
- Total routing latency: Less than 5ms
Next Steps
Provider Setup
Configure API keys for all providers
Model Catalog
Browse all 47 available models