The llm-router crate provides dynamic model selection based on task complexity, multi-provider LLM integration, and usage tracking across 10+ providers.
llm::route
Route to the optimal model based on message complexity and tool requirements.

Arguments:
- Array of message objects (the last message is analyzed for complexity)
- Array of available tool definitions
- Optional preferred model name (e.g., "opus", "sonnet", "haiku", "gpt-4o", "gemini")

Returns:
- Selected provider name (e.g., "anthropic", "openai", "google")
- Selected model identifier
- Computed complexity score
Complexity Scoring
The router computes complexity based on:

- Content Length: +1 per 100 characters
- Code Indicators: +20 if the message contains `` ``` ``, `function`, or `class`
- Analysis Keywords: +15 if the message contains `analyze`, `compare`, or `design`
- Tool Count: +5 per tool
- Conversation Length: +10 if the conversation exceeds 10 messages
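The heuristic above can be sketched as a single scoring function. This is an illustrative std-only version, not the crate's internal implementation; the function name and parameters are assumptions.

```rust
/// Sketch of the complexity heuristic described above (illustrative only).
fn complexity(last_message: &str, tool_count: usize, message_count: usize) -> u32 {
    // +1 per 100 characters of content
    let mut score = (last_message.len() / 100) as u32;
    // +20 if the message contains code indicators
    if ["```", "function", "class"].iter().any(|k| last_message.contains(*k)) {
        score += 20;
    }
    // +15 if the message contains analysis keywords
    if ["analyze", "compare", "design"].iter().any(|k| last_message.contains(*k)) {
        score += 15;
    }
    // +5 per available tool
    score += 5 * tool_count as u32;
    // +10 for conversations longer than 10 messages
    if message_count > 10 {
        score += 10;
    }
    score
}
```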
Model Selection by Complexity
| Complexity Score | Selected Model |
|---|---|
| 0-10 | claude-haiku-4-5-20251001 |
| 11-40 | claude-sonnet-4-20250514 |
| 41+ | claude-opus-4-20250514 |
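The thresholds in the table map directly onto a range match; a minimal sketch (function name is illustrative):

```rust
/// Map a computed complexity score to a model, per the table above.
fn select_model(complexity: u32) -> &'static str {
    match complexity {
        0..=10 => "claude-haiku-4-5-20251001",   // simple requests
        11..=40 => "claude-sonnet-4-20250514",   // moderate complexity
        _ => "claude-opus-4-20250514",           // 41+: hardest tasks
    }
}
```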
Preferred Model Overrides
Specify the `model` parameter to override auto-selection:

- "opus" or "claude-opus" → claude-opus-4-20250514
- "sonnet" or "claude-sonnet" → claude-sonnet-4-20250514
- "haiku" or "claude-haiku" → claude-haiku-4-5-20251001
- "gpt-4o" → gpt-4o (OpenAI)
- "gemini" → gemini-2.0-flash (Google)
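The alias table above can be sketched as a resolver; matching aliases by substring is an assumption about how the crate resolves them, not its documented behavior.

```rust
/// Resolve a preferred-model alias to a concrete model id (sketch).
fn resolve_alias(name: &str) -> Option<&'static str> {
    let n = name.to_lowercase();
    if n.contains("opus") {
        Some("claude-opus-4-20250514")
    } else if n.contains("sonnet") {
        Some("claude-sonnet-4-20250514")
    } else if n.contains("haiku") {
        Some("claude-haiku-4-5-20251001")
    } else if n.contains("gpt-4o") {
        Some("gpt-4o")
    } else if n.contains("gemini") {
        Some("gemini-2.0-flash")
    } else {
        None // fall back to complexity-based auto-selection
    }
}
```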
llm::complete
Send a completion request to the routed provider with tool support.

Arguments:
- Provider name (from llm::route or manual selection)
- Model identifier to use
- Conversation history with role/content pairs
- Available tools for function calling
- Maximum tokens to generate

Returns:
- Generated text content
- Model used for completion
- Array of tool calls requested by the model
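The return fields above can be pictured as a small result type. These struct and field names are illustrative only; the crate's actual types may differ.

```rust
/// Illustrative shape of a completion result (not the crate's actual API).
struct ToolCall {
    name: String,      // tool the model wants to invoke
    arguments: String, // serialized arguments (assumed JSON-encoded)
}

struct Completion {
    content: String,           // generated text content
    model: String,             // model used for completion
    tool_calls: Vec<ToolCall>, // tool calls requested by the model
}
```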
Usage Tracking
The router automatically tracks token usage per provider:model combination:

- Increments the `input_tokens`, `output_tokens`, and `requests` counters
- Stored in memory using DashMap for thread-safe access
- Query via the `llm::usage` function
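The bookkeeping above can be sketched as follows. The crate stores counters in a DashMap; this std-only sketch uses a `Mutex<HashMap>` instead so it runs without dependencies, and the type and method names are illustrative.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Per provider:model counters, as described above.
#[derive(Default, Clone, Copy)]
struct Counters {
    input_tokens: u64,
    output_tokens: u64,
    requests: u64,
}

/// Std-only stand-in for the crate's DashMap-backed tracker (sketch).
struct UsageTracker {
    counts: Mutex<HashMap<String, Counters>>,
}

impl UsageTracker {
    fn new() -> Self {
        Self { counts: Mutex::new(HashMap::new()) }
    }

    /// Record one request's token usage under the provider:model key.
    fn record(&self, provider: &str, model: &str, input: u64, output: u64) {
        let key = format!("{provider}:{model}");
        let mut map = self.counts.lock().unwrap();
        let c = map.entry(key).or_default();
        c.input_tokens += input;
        c.output_tokens += output;
        c.requests += 1;
    }

    /// Read back the aggregated counters for one provider:model pair.
    fn get(&self, provider: &str, model: &str) -> Counters {
        let key = format!("{provider}:{model}");
        self.counts.lock().unwrap().get(&key).copied().unwrap_or_default()
    }
}
```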
Provider Drivers
Supports multiple driver types:

- Anthropic: Native Anthropic Messages API
- OpenAiCompat: OpenAI-compatible endpoints (OpenAI, Groq, DeepSeek, Mistral, etc.)
- Gemini: Google Generative Language API
- Bedrock: AWS Bedrock (future support)
llm::usage
Get aggregated usage statistics across all providers and models.
llm::providers
List all available providers with configuration status.
Supported Providers
The router supports 10 providers out of the box:

Anthropic
Base URL: https://api.anthropic.com
Env Key: ANTHROPIC_API_KEY
Models: claude-opus-4, claude-sonnet-4, claude-haiku-4-5
OpenAI
Base URL: https://api.openai.com/v1
Env Key: OPENAI_API_KEY
Models: gpt-4o, gpt-4o-mini, o1, o3-mini
Google
Base URL: https://generativelanguage.googleapis.com/v1beta
Env Key: GOOGLE_API_KEY
Models: gemini-2.0-flash, gemini-2.0-pro
Groq
Base URL: https://api.groq.com/openai/v1
Env Key: GROQ_API_KEY
Models: llama-3.3-70b-versatile, mixtral-8x7b-32768
DeepSeek
Base URL: https://api.deepseek.com/v1
Env Key: DEEPSEEK_API_KEY
Models: deepseek-chat, deepseek-reasoner
Together
Base URL: https://api.together.xyz/v1
Env Key: TOGETHER_API_KEY
Models: Llama-3.3-70B, Mixtral-8x22B
Mistral
Base URL: https://api.mistral.ai/v1
Env Key: MISTRAL_API_KEY
Models: mistral-large-latest, mistral-small-latest
Fireworks
Base URL: https://api.fireworks.ai/inference/v1
Env Key: FIREWORKS_API_KEY
Models: llama-v3p3-70b-instruct
OpenRouter
Base URL: https://openrouter.ai/api/v1
Env Key: OPENROUTER_API_KEY
Models: anthropic/claude-opus-4, google/gemini-2.0-flash, etc.
Ollama
Base URL: http://localhost:11434/v1
Env Key: (none - local)
Models: llama3.3, qwen2.5, deepseek-r1
Provider Configuration
Environment Variables
Set API keys via environment variables (e.g., ANTHROPIC_API_KEY, OPENAI_API_KEY).

Local Models (Ollama)
Ollama requires no API key and runs locally.

Driver Implementation
Anthropic Driver
Uses the native Anthropic Messages API:

- Header: `x-api-key: {api_key}`
- Header: `anthropic-version: 2023-06-01`
- POST to `/v1/messages`
- Supports tool use blocks in the response
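The header list above can be sketched as a small builder. The two Anthropic headers come from the list; the `content-type` header and the function shape are assumptions.

```rust
use std::collections::HashMap;

/// Sketch: request headers for the native Anthropic Messages API driver.
fn anthropic_headers(api_key: &str) -> HashMap<&'static str, String> {
    let mut h = HashMap::new();
    h.insert("x-api-key", api_key.to_string());
    h.insert("anthropic-version", "2023-06-01".to_string());
    h.insert("content-type", "application/json".to_string()); // assumed
    h
}
```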
OpenAI-Compatible Driver
Works with OpenAI-compatible endpoints:

- Header: `authorization: Bearer {api_key}` (if a key is provided)
- POST to `/chat/completions`
- Supports tool_calls in the response
- Used by: OpenAI, Groq, DeepSeek, Together, Mistral, Fireworks, OpenRouter, Ollama
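The conditional bearer header can be sketched like this; the function shape and the `content-type` header are assumptions, while the optional-key behavior follows the list above (Ollama, for example, runs without a key).

```rust
/// Sketch: request headers for the OpenAI-compatible driver.
fn openai_compat_headers(api_key: Option<&str>) -> Vec<(&'static str, String)> {
    let mut headers = vec![("content-type", "application/json".to_string())]; // assumed
    // Attach the bearer token only when a key is configured
    if let Some(key) = api_key {
        headers.push(("authorization", format!("Bearer {key}")));
    }
    headers
}
```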
Response Normalization
The router normalizes responses from different providers:

- Content extraction:
  - Anthropic: `content[0].text`
  - OpenAI: `choices[0].message.content`
- Tool calls extraction:
  - Anthropic: filters `content` blocks with `type: "tool_use"`
  - OpenAI: `choices[0].message.tool_calls`
- Token usage:
  - Anthropic: `usage.{input_tokens, output_tokens}`
  - OpenAI: `usage.{prompt_tokens, completion_tokens}`
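The token-usage normalization can be sketched with the field names above; the enum and struct here are illustrative, not the crate's actual types.

```rust
/// Normalized usage, matching the counters the router tracks.
struct Usage {
    input_tokens: u64,
    output_tokens: u64,
}

/// Provider-specific usage fields, per the lists above (illustrative).
enum RawUsage {
    Anthropic { input_tokens: u64, output_tokens: u64 },
    OpenAi { prompt_tokens: u64, completion_tokens: u64 },
}

fn normalize_usage(raw: RawUsage) -> Usage {
    match raw {
        // Anthropic already uses the normalized field names
        RawUsage::Anthropic { input_tokens, output_tokens } => {
            Usage { input_tokens, output_tokens }
        }
        // OpenAI-compatible responses report prompt/completion token counts
        RawUsage::OpenAi { prompt_tokens, completion_tokens } => Usage {
            input_tokens: prompt_tokens,
            output_tokens: completion_tokens,
        },
    }
}
```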