Model routing lets a single model entry act as a smart router that automatically selects the best underlying model for each request. This is useful for cost optimization, specialized handling, and load distribution.
docker-agent uses NLP-based text similarity (via Bleve full-text search) to match user messages against example phrases you define. The route with the best-matching examples handles the request.

Configuration

Add routing rules to any model definition. The model’s provider/model fields become the fallback when no route matches:
models:
  smart_router:
    # Fallback model — used when no routing rule matches
    provider: openai
    model: gpt-4o-mini

    routing:
      - model: anthropic/claude-sonnet-4-0
        examples:
          - "Write a detailed technical document"
          - "Help me architect this system"
          - "Review this code for security issues"
          - "Explain this complex algorithm"

      - model: openai/gpt-4o
        examples:
          - "Generate some creative ideas"
          - "Write a story about"
          - "Help me brainstorm"
          - "Come up with names for"

      - model: openai/gpt-4o-mini
        examples:
          - "What time is it"
          - "Convert this to JSON"
          - "Simple math calculation"
          - "Translate this word"

agents:
  root:
    model: smart_router
    description: Assistant with intelligent model routing
    instruction: You are a helpful assistant.

Routing rule fields

`model` (string, required)
Target model for this route. Can be an inline provider/model string or the name of a model defined in the models section.

`examples` (string[], required)
Example phrases that should route to this model. The router uses full-text similarity to match incoming messages against these examples.

Matching behavior

The router:
  1. Extracts the last user message from the conversation
  2. Searches all examples using full-text search
  3. Aggregates match scores per route (best score per route wins)
  4. Selects the route with the highest overall score
  5. Falls back to the base provider/model if no good match is found
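The steps above can be sketched in a few lines. This is an illustrative stand-in, not the actual implementation: docker-agent scores matches with Bleve full-text search, while this sketch uses simple token overlap, and the message and route shapes shown are assumptions.

```python
def route(messages, routes, fallback):
    """Pick a target model for the last user message.

    messages: list of {"role": ..., "content": ...} dicts
    routes:   list of {"model": ..., "examples": [...]} dicts
    fallback: model string used when nothing matches
    """
    # 1. Extract the last user message
    user_msgs = [m["content"] for m in messages if m["role"] == "user"]
    if not user_msgs:
        return fallback
    query = set(user_msgs[-1].lower().split())

    # 2-4. Score every example (token overlap stands in for full-text
    #      scoring), keep the best score per route, then pick the route
    #      with the highest overall score
    best_model, best_score = fallback, 0
    for r in routes:
        score = max(len(query & set(ex.lower().split())) for ex in r["examples"])
        if score > best_score:
            best_model, best_score = r["model"], score

    # 5. A zero score means no example shared any token: fall back
    return best_model
```

With a routes list mirroring the smart_router config above, a message like "convert this to JSON" would score highest against the gpt-4o-mini route, while a message sharing no tokens with any example would fall through to the fallback model.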
Write good examples by:
  • Using diverse phrasing that captures the intent
  • Including keywords users actually use
  • Adding 5–10 examples per route for best results
Examples don't need to match incoming messages exactly; the router scores them by text similarity.
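Applying these guidelines, a well-specified route might look like the following (the model name and phrases are illustrative):

```yaml
routing:
  - model: openai/gpt-4o-mini
    examples:
      # Diverse phrasings of the same "quick lookup" intent,
      # using the words users actually type
      - "What's the weather like"
      - "Define this word"
      - "What does this acronym stand for"
      - "Quick question about"
      - "How do you spell"
      - "What's the capital of"
```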

Named model references

Routes can reference named models from the models section, allowing full parameter control per route:
models:
  router:
    provider: anthropic
    model: claude-haiku-4-5    # fallback
    routing:
      - model: fast             # references named model below
        examples:
          - "hello"
          - "hi there"
          - "thanks"
          - "bye"

      - model: capable          # references named model below
        examples:
          - "explain the algorithm"
          - "implement a function"
          - "debug this code"
          - "write a program"

  capable:
    provider: anthropic
    model: claude-sonnet-4-5

  fast:
    provider: anthropic
    model: claude-haiku-4-5

agents:
  root:
    model: router
    description: Intelligent assistant with automatic model selection
    instruction: You are a helpful AI assistant.

Use cases

Cost optimization

Route simple queries to cheaper models:
models:
  cost_optimizer:
    provider: openai
    model: gpt-4o-mini  # cheap fallback for unmatched queries
    routing:
      - model: anthropic/claude-sonnet-4-0
        examples:
          - "Complex analysis needed"
          - "Multi-step reasoning"
          - "Detailed research"
          - "Architecture review"

Specialized models

Route different task types to the most capable model for that domain:
models:
  task_router:
    provider: openai
    model: gpt-4o  # general fallback
    routing:
      - model: anthropic/claude-sonnet-4-0
        examples:
          - "Write code"
          - "Debug this function"
          - "Review my implementation"
          - "Fix this bug"
      - model: openai/gpt-4o
        examples:
          - "Write a blog post"
          - "Help me with writing"
          - "Summarize this document"

Load balancing

Distribute load across equivalent models from different providers:
models:
  load_balancer:
    provider: openai
    model: gpt-4o
    routing:
      - model: anthropic/claude-sonnet-4-0
        examples:
          - "First request pattern"
          - "Another request type"
      - model: google/gemini-2.5-flash
        examples:
          - "Different request pattern"
          - "Alternative query style"

Debugging routing decisions

Enable debug logging to see which model was selected and why:
docker agent run config.yaml --debug
Look for log entries like:
Rule-based router selected model  router=smart_router  selected_model=anthropic/claude-sonnet-4-0
Route matched  model=anthropic/claude-sonnet-4-0  score=2.45
Limitations:
  • Routing only considers the last user message, not the full conversation context
  • Very short messages may not match well — consider your fallback model carefully
  • Each routed model creates a separate provider connection
