Key Features
25 LLM Providers
Connect to major providers like Anthropic, OpenAI, Google, AWS Bedrock, DeepSeek, and 20 more
47 Models
Access frontier, smart, balanced, fast, and local tier models
Intelligent Routing
Automatic model selection based on complexity scoring
Cost Optimization
Built-in usage tracking and cost management
Architecture
The LLM system is implemented across two layers:
TypeScript Router (src/llm-router.ts)
Handles OpenAI-compatible providers through a unified interface:
- Route selection via llm::route
- Completion handling via llm::complete
- Automatic retry with exponential backoff
- Cost tracking integration
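The unified interface can be pictured roughly as follows. This is a sketch, not the code in src/llm-router.ts: the `ProviderConfig`, `ChatMessage`, and `HttpRequest` shapes and the `buildChatRequest` helper are hypothetical names introduced here. It assumes only that an OpenAI-compatible provider accepts a `POST` to `<baseUrl>/chat/completions` with a bearer token and a JSON body containing `model` and `messages`, so a single request builder can serve every such provider.

```typescript
// Hypothetical shapes for illustration; not the actual types in src/llm-router.ts.
interface ProviderConfig {
  name: string;
  baseUrl: string; // e.g. "https://api.openai.com/v1"
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build one request shape that works for any OpenAI-compatible provider.
// Only the base URL and API key differ between providers.
function buildChatRequest(
  provider: ProviderConfig,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): HttpRequest {
  const base = provider.baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model, messages }),
  };
}
```

The returned object can be handed to `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Keeping request construction separate from the network call makes the provider-specific part (base URL and key) easy to test in isolation.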
Rust Router (crates/llm-router/src/main.rs)
High-performance routing engine:
- Complexity-based model selection
- Provider health monitoring
- Usage statistics aggregation
- Multi-driver support (Anthropic, OpenAI, Gemini, Bedrock)
Model Tiers
Models are organized into 5 tiers based on capability and cost:

| Tier | Use Case | Example Models |
|---|---|---|
| Frontier | Most complex reasoning, research | Claude Opus 4.6, OpenAI o3, Gemini 2.5 Pro |
| Smart | Advanced tasks, coding | Claude Sonnet 4.6, GPT-4o, Grok-2 |
| Balanced | General purpose, cost-effective | DeepSeek Chat, Llama 3.3 70B, Command R |
| Fast | Quick responses, simple tasks | Claude Haiku 4.5, GPT-4o mini, Gemini 2.5 Flash |
| Local | Self-hosted, privacy-first | Ollama, vLLM, LM Studio |
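This page does not say how the complexity score is computed or where the tier cut-offs sit, so the following is only a sketch under invented assumptions. `scoreComplexity` is a toy heuristic (prompt length plus a few keywords), and the thresholds in `selectTier` are made up for illustration. The Local tier is treated here as an explicit opt-in rather than a complexity level, which is also an assumption.

```typescript
type Tier = "frontier" | "smart" | "balanced" | "fast" | "local";

// Toy heuristic (assumption): longer prompts and reasoning/coding keywords
// score higher. Returns a value in [0, 1].
function scoreComplexity(prompt: string): number {
  const lengthScore = Math.min(prompt.length / 2000, 1) * 0.5;
  const keywords = ["prove", "analyze", "refactor", "design", "debug"];
  const lower = prompt.toLowerCase();
  const hits = keywords.filter((k) => lower.includes(k)).length;
  const keywordScore = Math.min(hits / 3, 1) * 0.5;
  return lengthScore + keywordScore;
}

// Hypothetical thresholds mapping a score to one of the five tiers.
function selectTier(score: number, preferLocal = false): Tier {
  if (preferLocal) return "local"; // privacy-first callers opt in explicitly
  if (score >= 0.75) return "frontier";
  if (score >= 0.5) return "smart";
  if (score >= 0.25) return "balanced";
  return "fast";
}
```

For example, `selectTier(scoreComplexity("hi"))` falls into the Fast tier under these thresholds, while a prompt that mentions several of the keywords moves up to Smart or Frontier. A real scorer would likely weigh more signals than this.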
Quick Start
Using Complexity-Based Routing
Direct Model Selection
Provider Configuration
Providers are configured via environment variables.
CLI Usage
Cost Tracking
All completions automatically track usage and costs.
Response Format
All LLM completions return a standardized format.
Fallback & Retry
The router includes automatic retry logic:
- 429 Rate Limit: Exponential backoff (1s, 2s, 4s)
- Network Errors: 3 attempts before failing
- Provider Outages: Manual fallback to alternate provider
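The retry rules above could be sketched along these lines. This is an illustration, not the router's actual implementation: `classify`, `backoffDelayMs`, and `withRetry` are hypothetical names. It assumes that a rate-limit error exposes `status === 429`, that network failures surface as a `TypeError` (as `fetch` does), and that "1s, 2s, 4s" means up to three waits before giving up. Provider outages are not handled, since fallback is described as manual.

```typescript
// Classify an error into the categories the retry rules distinguish.
function classify(err: unknown): "rate_limit" | "network" | "fatal" {
  if (typeof err === "object" && err !== null &&
      (err as { status?: unknown }).status === 429) {
    return "rate_limit";
  }
  if (err instanceof TypeError) return "network"; // fetch rejects with TypeError on network failure
  return "fatal";
}

// Delay before retry number `retry` (0-based): 1000, 2000, 4000 ms.
function backoffDelayMs(retry: number, baseMs = 1000): number {
  return baseMs * 2 ** retry;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `call`, retrying per the rules above. `wait` is injectable for tests.
async function withRetry<T>(
  call: () => Promise<T>,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<T> {
  const maxRateLimitRetries = 3; // waits of 1s, 2s, 4s
  const maxNetworkAttempts = 3;
  let rateLimitRetries = 0;
  let networkAttempts = 0;

  for (;;) {
    try {
      return await call();
    } catch (err) {
      const kind = classify(err);
      if (kind === "rate_limit") {
        if (rateLimitRetries >= maxRateLimitRetries) throw err;
        await wait(backoffDelayMs(rateLimitRetries));
        rateLimitRetries += 1;
      } else if (kind === "network") {
        networkAttempts += 1;
        if (networkAttempts >= maxNetworkAttempts) throw err;
      } else {
        throw err; // other errors are not retried in this sketch
      }
    }
  }
}
```

Passing a custom `wait` function lets tests record the delays instead of sleeping. Because the loop rethrows the last error once its budget is spent, a caller can catch it and switch providers by hand.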
Next Steps
Browse Providers
View all 25 supported providers
Explore Models
See all 47 available models with pricing
Routing Logic
Learn about complexity-based model selection