Model routing lets a single model entry act as a smart router that automatically selects the best underlying model for each request. This is useful for cost optimization, specialized handling, and load distribution.
docker-agent uses NLP-based text similarity (via Bleve full-text search) to match user messages against example phrases you define. The route with the best-matching examples handles the request.

Configuration

Add routing rules to any model definition. The model’s provider/model fields become the fallback when no route matches:
models:
  smart_router:
    # Fallback model — used when no routing rule matches
    provider: openai
    model: gpt-4o-mini

    routing:
      - model: anthropic/claude-sonnet-4-0
        examples:
          - "Write a detailed technical document"
          - "Help me architect this system"
          - "Review this code for security issues"
          - "Explain this complex algorithm"

      - model: openai/gpt-4o
        examples:
          - "Generate some creative ideas"
          - "Write a story about"
          - "Help me brainstorm"
          - "Come up with names for"

      - model: openai/gpt-4o-mini
        examples:
          - "What time is it"
          - "Convert this to JSON"
          - "Simple math calculation"
          - "Translate this word"

agents:
  root:
    model: smart_router
    description: Assistant with intelligent model routing
    instruction: You are a helpful assistant.

Routing rule fields

`model` (string, required)
Target model for this route. Can be an inline provider/model string or the name of a model defined in the models section.

`examples` (string[], required)
Example phrases that should route to this model. The router uses full-text similarity to match incoming messages against these examples.

Matching behavior

The router:
  1. Extracts the last user message from the conversation
  2. Searches all examples using full-text search
  3. Aggregates match scores per route (best score per route wins)
  4. Selects the route with the highest overall score
  5. Falls back to the base provider/model if no good match is found
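The steps above can be sketched in a few lines. This is an illustrative stand-in, not the actual implementation: docker-agent scores matches with Bleve full-text search, while this sketch uses simple token overlap, and the message and route shapes shown are assumptions.

```python
def route(messages, routes, fallback):
    """Pick a target model for the last user message.

    messages: list of {"role": ..., "content": ...} dicts
    routes:   list of {"model": ..., "examples": [...]} dicts
    fallback: model string used when nothing matches
    """
    # 1. Extract the last user message
    user_msgs = [m["content"] for m in messages if m["role"] == "user"]
    if not user_msgs:
        return fallback
    query = set(user_msgs[-1].lower().split())

    # 2-4. Score every example (token overlap stands in for full-text
    #      scoring), keep the best score per route, then pick the route
    #      with the highest overall score
    best_model, best_score = fallback, 0
    for r in routes:
        score = max(len(query & set(ex.lower().split())) for ex in r["examples"])
        if score > best_score:
            best_model, best_score = r["model"], score

    # 5. A zero score means no example shared any token: fall back
    return best_model
```

With a routes list mirroring the smart_router config above, a message like "convert this to JSON" would score highest against the gpt-4o-mini route, while a message sharing no tokens with any example would fall through to the fallback model.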
Write good examples by:
  • Using diverse phrasing that captures the intent
  • Including keywords users actually use
  • Adding 5–10 examples per route for best results
Examples don't need to match incoming messages exactly; the router scores them by text similarity.
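Applying these guidelines, a well-specified route might look like the following (the model name and phrases are illustrative):

```yaml
routing:
  - model: openai/gpt-4o-mini
    examples:
      # Diverse phrasings of the same "quick lookup" intent,
      # using the words users actually type
      - "What's the weather like"
      - "Define this word"
      - "What does this acronym stand for"
      - "Quick question about"
      - "How do you spell"
      - "What's the capital of"
```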

Named model references

Routes can reference named models from the models section, allowing full parameter control per route:
models:
  router:
    provider: anthropic
    model: claude-haiku-4-5    # fallback
    routing:
      - model: fast             # references named model below
        examples:
          - "hello"
          - "hi there"
          - "thanks"
          - "bye"

      - model: capable          # references named model below
        examples:
          - "explain the algorithm"
          - "implement a function"
          - "debug this code"
          - "write a program"

  capable:
    provider: anthropic
    model: claude-sonnet-4-5

  fast:
    provider: anthropic
    model: claude-haiku-4-5

agents:
  root:
    model: router
    description: Intelligent assistant with automatic model selection
    instruction: You are a helpful AI assistant.

Use cases

Cost optimization

Route simple queries to cheaper models:
models:
  cost_optimizer:
    provider: openai
    model: gpt-4o-mini  # cheap fallback for unmatched queries
    routing:
      - model: anthropic/claude-sonnet-4-0
        examples:
          - "Complex analysis needed"
          - "Multi-step reasoning"
          - "Detailed research"
          - "Architecture review"

Specialized models

Route different task types to the most capable model for that domain:
models:
  task_router:
    provider: openai
    model: gpt-4o  # general fallback
    routing:
      - model: anthropic/claude-sonnet-4-0
        examples:
          - "Write code"
          - "Debug this function"
          - "Review my implementation"
          - "Fix this bug"
      - model: openai/gpt-4o
        examples:
          - "Write a blog post"
          - "Help me with writing"
          - "Summarize this document"

Load balancing

Distribute load across equivalent models from different providers:
models:
  load_balancer:
    provider: openai
    model: gpt-4o
    routing:
      - model: anthropic/claude-sonnet-4-0
        examples:
          - "First request pattern"
          - "Another request type"
      - model: google/gemini-2.5-flash
        examples:
          - "Different request pattern"
          - "Alternative query style"

Debugging routing decisions

Enable debug logging to see which model was selected and why:
docker agent run config.yaml --debug
Look for log entries like:
Rule-based router selected model  router=smart_router  selected_model=anthropic/claude-sonnet-4-0
Route matched  model=anthropic/claude-sonnet-4-0  score=2.45
Limitations:
  • Routing only considers the last user message, not the full conversation context
  • Very short messages may not match well — consider your fallback model carefully
  • Each routed model creates a separate provider connection
