Model routing lets a single model entry act as a smart router that automatically selects the best underlying model for each request. This is useful for cost optimization, specialized handling, and load distribution.
docker-agent uses NLP-based text similarity (via Bleve full-text search) to match user messages against example phrases you define. The route with the best-matching examples handles the request.
## Configuration
Add routing rules to any model definition. The model’s provider/model fields become the fallback when no route matches:
```yaml
models:
  smart_router:
    # Fallback model: used when no routing rule matches
    provider: openai
    model: gpt-4o-mini
    routing:
      - model: anthropic/claude-sonnet-4-0
        examples:
          - "Write a detailed technical document"
          - "Help me architect this system"
          - "Review this code for security issues"
          - "Explain this complex algorithm"
      - model: openai/gpt-4o
        examples:
          - "Generate some creative ideas"
          - "Write a story about"
          - "Help me brainstorm"
          - "Come up with names for"
      - model: openai/gpt-4o-mini
        examples:
          - "What time is it"
          - "Convert this to JSON"
          - "Simple math calculation"
          - "Translate this word"

agents:
  root:
    model: smart_router
    description: Assistant with intelligent model routing
    instruction: You are a helpful assistant.
```
## Routing rule fields

- `model`: Target model for this route. Can be an inline `provider/model` string or the name of a model defined in the `models` section.
- `examples`: Example phrases that should route to this model. The router uses full-text similarity to match incoming messages against these examples.
## Matching behavior
The router:
- Extracts the last user message from the conversation
- Searches all examples using full-text search
- Aggregates match scores per route (best score per route wins)
- Selects the route with the highest overall score
- Falls back to the base `provider`/`model` if no good match is found
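The steps above can be sketched in Python. This is a simplified stand-in, not docker-agent's implementation: a toy token-overlap score replaces Bleve's full-text scoring, and the message shape, route shape, and `threshold` value are illustrative assumptions.

```python
# Sketch of the routing decision described above. NOT docker-agent's code:
# a token-overlap score stands in for Bleve full-text scoring.

def score(message: str, example: str) -> float:
    """Toy similarity: fraction of the example's tokens present in the message."""
    msg = set(message.lower().split())
    ex = set(example.lower().split())
    return len(msg & ex) / len(ex) if ex else 0.0

def select_route(messages, routes, fallback, threshold=0.5):
    # 1. Extract the last user message from the conversation
    last_user = next(
        (m["content"] for m in reversed(messages) if m["role"] == "user"),
        None,
    )
    if last_user is None:
        return fallback
    # 2-3. Score every example; the best score per route represents that route
    best_model, best_score = fallback, 0.0
    for route in routes:
        route_best = max(score(last_user, ex) for ex in route["examples"])
        # 4. Keep the route with the highest overall score
        if route_best > best_score:
            best_model, best_score = route["model"], route_best
    # 5. Fall back to the base model when nothing matched well enough
    return best_model if best_score >= threshold else fallback

routes = [
    {"model": "anthropic/claude-sonnet-4-0",
     "examples": ["Review this code for security issues"]},
    {"model": "openai/gpt-4o",
     "examples": ["What time is it", "Convert this to JSON"]},
]
msgs = [{"role": "user", "content": "review this code for security issues please"}]
print(select_route(msgs, routes, "openai/gpt-4o-mini"))
# -> anthropic/claude-sonnet-4-0
```

A message that shares no keywords with any example scores below the threshold and lands on the fallback model, which is why the fallback choice matters.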
Write good examples by:

- Using diverse phrasing that captures the intent
- Including keywords users actually use
- Adding 5–10 examples per route for best results

Examples don't need to match incoming messages exactly; the router scores text similarity, so close phrasing is enough.
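A toy demonstration of why diverse phrasing matters. The scorer below is a simplified token overlap, not Bleve's actual full-text scoring, and the example message is invented: a paraphrased request can miss a route that has only one example, while an added phrasing recovers it.

```python
# Toy demonstration (not docker-agent code): routes with more diverse
# examples cover more phrasings, because the best-matching example wins.

def best_score(message: str, examples: list[str]) -> float:
    msg = set(message.lower().split())
    def overlap(ex: str) -> float:
        toks = set(ex.lower().split())
        return len(msg & toks) / len(toks) if toks else 0.0
    # Per-route aggregation: the best-matching example represents the route
    return max(overlap(ex) for ex in examples)

msg = "can you come up with names for my startup"

one_example = ["Generate some creative ideas"]
diverse = ["Generate some creative ideas",
           "Help me brainstorm",
           "Come up with names for"]

print(best_score(msg, one_example))  # 0.0 -- no shared keywords, route missed
print(best_score(msg, diverse))      # 1.0 -- "Come up with names for" matches
```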
## Named model references
Routes can reference named models from the models section, allowing full parameter control per route:
```yaml
models:
  router:
    provider: anthropic
    model: claude-haiku-4-5 # fallback
    routing:
      - model: fast # references named model below
        examples:
          - "hello"
          - "hi there"
          - "thanks"
          - "bye"
      - model: capable # references named model below
        examples:
          - "explain the algorithm"
          - "implement a function"
          - "debug this code"
          - "write a program"
  capable:
    provider: anthropic
    model: claude-sonnet-4-5
  fast:
    provider: anthropic
    model: claude-haiku-4-5

agents:
  root:
    model: router
    description: Intelligent assistant with automatic model selection
    instruction: You are a helpful AI assistant.
```
## Use cases

### Cost optimization
Route simple queries to cheaper models:
```yaml
models:
  cost_optimizer:
    provider: openai
    model: gpt-4o-mini # cheap fallback for unmatched queries
    routing:
      - model: anthropic/claude-sonnet-4-0
        examples:
          - "Complex analysis needed"
          - "Multi-step reasoning"
          - "Detailed research"
          - "Architecture review"
```
### Specialized models
Route different task types to the most capable model for that domain:
```yaml
models:
  task_router:
    provider: openai
    model: gpt-4o # general fallback
    routing:
      - model: anthropic/claude-sonnet-4-0
        examples:
          - "Write code"
          - "Debug this function"
          - "Review my implementation"
          - "Fix this bug"
      - model: openai/gpt-4o
        examples:
          - "Write a blog post"
          - "Help me with writing"
          - "Summarize this document"
```
### Load balancing
Distribute load across equivalent models from different providers:
```yaml
models:
  load_balancer:
    provider: openai
    model: gpt-4o
    routing:
      - model: anthropic/claude-sonnet-4-0
        examples:
          - "First request pattern"
          - "Another request type"
      - model: google/gemini-2.5-flash
        examples:
          - "Different request pattern"
          - "Alternative query style"
```
## Debugging routing decisions
Enable debug logging to see which model was selected and why:
```shell
docker agent run config.yaml --debug
```
Look for log entries like:
```
Rule-based router selected model router=smart_router selected_model=anthropic/claude-sonnet-4-0
Route matched model=anthropic/claude-sonnet-4-0 score=2.45
```
## Limitations
- Routing only considers the last user message, not the full conversation context
- Very short messages may not match well — consider your fallback model carefully
- Each routed model creates a separate provider connection