
Orchestrator

The Orchestrator is a specialized agent that provides centralized coordination, intelligent routing, and execution planning for complex multi-agent workflows. It acts as a traffic controller, analyzing user requests and delegating work to the most appropriate agents.
The Orchestrator complements Workflows by providing dynamic, LLM-driven routing as opposed to static DAG-based orchestration.

When to Use the Orchestrator

Choose the Orchestrator for:
  • Dynamic Routing: when the sequence of agents needed depends on request content and cannot be predetermined
  • Adaptive Planning: when the LLM needs to decide which agents to call based on intermediate results
  • Conversational Flow: when users might change direction mid-conversation, requiring flexible orchestration
  • Agent Discovery: when you want automatic selection from available agents based on capabilities

Choose Workflows for:
  • Deterministic, repeatable processes
  • Complex control flow (loops, conditionals, maps)
  • Guaranteed execution order
  • Compliance and audit requirements

Architecture

The Orchestrator is implemented as a standard agent with specialized tools:

Core Components

1. Agent Registry Access: the Orchestrator subscribes to agent discovery and maintains an up-to-date registry of available agents and their capabilities.
2. Capability Matching: uses agent cards (schemas, skills, descriptions) to match user requests to appropriate agents.
3. Delegation Tools: built-in MCP tools let the LLM invoke other agents via the A2A protocol.
4. Result Synthesis: aggregates responses from multiple agents into a coherent final answer.
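Put together, these components form a loop the LLM drives: discover, match, delegate, synthesize. A minimal Python sketch of that loop follows; the class and function names here are illustrative, not the actual framework API:

```python
class AgentRegistry:
    """Holds agent cards received via discovery (component 1)."""
    def __init__(self):
        self._agents = {}

    def add_or_update_agent(self, card):
        self._agents[card["name"]] = card

    def match(self, keywords):
        """Capability matching (component 2): naive keyword search
        over each card's description and skill descriptions."""
        hits = []
        for card in self._agents.values():
            text = " ".join(
                [card["description"]]
                + [s["description"] for s in card.get("skills", [])]
            ).lower()
            if any(k.lower() in text for k in keywords):
                hits.append(card["name"])
        return hits


def orchestrate(request_keywords, registry, call_agent):
    """Delegation (3) and result synthesis (4): invoke each matched
    agent and aggregate the responses into one answer."""
    results = [call_agent(name, request_keywords)
               for name in registry.match(request_keywords)]
    return " | ".join(results)
```

In the real system the matching and synthesis steps are performed by the LLM itself, not by keyword overlap; the sketch only shows how the four components relate.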

Configuration

Basic Orchestrator Setup

orchestrator.yaml
name: orchestrator
namespace: acme/ai

log_level: info

model:
  model: anthropic/claude-4.5-sonnet
  max_output_tokens: 16384
  temperature: 0.3  # Lower temperature for more deterministic routing

instruction: |
  You are an intelligent orchestrator that coordinates multiple specialized agents.
  
  Your role:
  1. Analyze user requests to understand intent and requirements
  2. Identify which agents are needed based on their capabilities
  3. Plan the sequence of agent calls to fulfill the request
  4. Delegate subtasks to appropriate agents via the call_agent tool
  5. Synthesize responses from multiple agents into coherent answers
  6. Handle errors and retry with alternative agents if needed
  
  Available agents are discovered automatically via the agent registry.
  Use the list_agents tool to see current capabilities.
  
  Always explain your routing decisions to the user.

tools:
  # Agent delegation tool
  - type: mcp
    name: agent_mesh_tools
    command: python
    args:
      - -m
      - solace_agent_mesh.tools.orchestrator_mcp
    env:
      AGENT_REGISTRY_CONFIG: /config/agent_registry.json

session_db:
  type: postgres
  connection_string: ${DATABASE_URL}

artifact_service:
  type: s3
  bucket: orchestrator-artifacts

agent_card:
  description: Intelligent orchestrator for multi-agent coordination
  version: 1.0.0
  skills:
    - name: orchestrate
      description: Coordinate multiple agents to fulfill complex requests
      input_schema:
        type: object
        properties:
          request:
            type: string
            description: User request to orchestrate

Orchestrator Tools

The Orchestrator uses specialized MCP tools for agent coordination:

list_agents

Query the agent registry to discover available agents:
{
  "name": "list_agents",
  "description": "List available agents and their capabilities",
  "inputSchema": {
    "type": "object",
    "properties": {
      "filter": {
        "type": "string",
        "description": "Filter agents by capability keywords"
      }
    }
  }
}
Example usage:
// Tool call
{
  "name": "list_agents",
  "arguments": {
    "filter": "web search"
  }
}

// Tool result
{
  "agents": [
    {
      "name": "research_agent",
      "description": "Research assistant with web search",
      "skills": [
        {
          "name": "web_search",
          "description": "Search the web for information"
        }
      ]
    }
  ]
}

call_agent

Delegate a subtask to another agent:
{
  "name": "call_agent",
  "description": "Call another agent to perform a subtask",
  "inputSchema": {
    "type": "object",
    "properties": {
      "agent_name": {
        "type": "string",
        "description": "Name of the agent to call"
      },
      "message": {
        "type": "string",
        "description": "Task description for the agent"
      },
      "context": {
        "type": "object",
        "description": "Additional context or artifacts to pass"
      }
    },
    "required": ["agent_name", "message"]
  }
}
Example orchestration:
// Step 1: Research
{
  "name": "call_agent",
  "arguments": {
    "agent_name": "research_agent",
    "message": "Find the latest information about AI agent frameworks"
  }
}

// Step 2: Analysis (using research results)
{
  "name": "call_agent",
  "arguments": {
    "agent_name": "analysis_agent",
    "message": "Analyze the following research findings and identify trends",
    "context": {
      "research_results": "{{previous_result}}"
    }
  }
}

// Step 3: Visualization
{
  "name": "call_agent",
  "arguments": {
    "agent_name": "visualization_agent",
    "message": "Create a comparison chart of the agent frameworks",
    "context": {
      "data": "{{analysis_result}}"
    }
  }
}

Execution Flow

Here’s how the Orchestrator handles a complex request:

1. Analyze the user request to understand intent and requirements
2. Query the agent registry to find agents with matching capabilities
3. Plan the sequence of agent calls needed to fulfill the request
4. Delegate each subtask via call_agent, passing context and artifacts forward
5. Synthesize the agents’ responses into a coherent final answer

Orchestration Patterns

Sequential Pipeline

Call agents one after another, passing results forward:
User Request

Research Agent → results

Analysis Agent → insights (using results)

Writing Agent → report (using insights)

Final Response
Use when: Tasks have clear dependencies and must run in order

Parallel Fan-Out

Call multiple agents concurrently for independent tasks:
User Request

┌─────────┬─────────┬─────────┐
│ Agent 1 │ Agent 2 │ Agent 3 │ (parallel)
└─────────┴─────────┴─────────┘
  ↓         ↓         ↓
Synthesis Agent

Final Response
Use when: Subtasks are independent and can run concurrently
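A minimal sketch of this fan-out step using `asyncio.gather`; `call_agent` here is a local stand-in for the real delegation tool, where each call would be an A2A request:

```python
import asyncio

async def call_agent(name, message):
    await asyncio.sleep(0)  # stand-in for the network round trip
    return f"{name} handled: {message}"

async def fan_out(message, agent_names):
    # Launch all independent subtasks concurrently and collect the
    # results in order, ready for a synthesis step.
    return await asyncio.gather(
        *(call_agent(n, message) for n in agent_names))

results = asyncio.run(
    fan_out("summarize Q4 sales", ["agent_1", "agent_2", "agent_3"]))
```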

Conditional Routing

Choose agents dynamically based on intermediate results:
User Request

Classifier Agent

┌─────────────┬─────────────┐
│ Category A  │ Category B  │
├─────────────┼─────────────┤
│ Specialist  │ Specialist  │
│ Agent A     │ Agent B     │
└─────────────┴─────────────┘

Final Response
Use when: Route depends on request analysis or classification

Iterative Refinement

Loop with an agent until a quality threshold is met:
User Request

Generator Agent → draft

Critic Agent → feedback

┌─────────────────────┐
│ If not acceptable   │
│   └→ Generator (retry)
└─────────────────────┘

Final Response
Use when: Output requires iterative improvement
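The generator/critic loop can be sketched as follows; `generate` and `critique` stand in for agent calls, and `max_rounds` is an assumed safety cap to avoid looping forever:

```python
def refine(generate, critique, max_rounds=3):
    """Iterative refinement: generate a draft, have a critic review
    it, and retry with the feedback until it is acceptable."""
    draft = generate(None)
    for _ in range(max_rounds):
        feedback = critique(draft)
        if feedback is None:        # acceptable: stop looping
            return draft
        draft = generate(feedback)  # retry, steered by the feedback
    return draft                    # give up after max_rounds
```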

Agent Registry Integration

The Orchestrator maintains an active agent registry through discovery:
# Orchestrator subscribes to agent discovery
subscriptions = [
    {"topic": a2a.get_discovery_subscription_topic(namespace)}
]

# When agent cards are received:
def process_agent_card(agent_card: AgentCard):
    """
    Store agent metadata for routing decisions:
    - Agent name and description
    - Skills and capabilities (input/output schemas)
    - Required scopes for access control
    - Version and availability status
    """
    agent_registry.add_or_update_agent(agent_card)

Agent Card Schema

Agents advertise capabilities via agent cards:
{
  "name": "data_analysis_agent",
  "description": "Performs statistical analysis on datasets",
  "version": "2.1.0",
  "skills": [
    {
      "name": "analyze_csv",
      "description": "Analyze CSV data and generate insights",
      "input_schema": {
        "type": "object",
        "properties": {
          "csv_file": {
            "type": "string",
            "description": "Artifact reference to CSV file"
          },
          "analysis_type": {
            "type": "string",
            "enum": ["descriptive", "correlation", "regression"]
          }
        },
        "required": ["csv_file"]
      },
      "output_schema": {
        "type": "object",
        "properties": {
          "insights": {"type": "string"},
          "visualizations": {
            "type": "array",
            "items": {"type": "string"}
          }
        }
      }
    }
  ],
  "extensions": {
    "required_scopes": ["data_access"],
    "max_concurrent_tasks": 5
  }
}
The Orchestrator uses this metadata to:
  • Match user intent to agent capabilities
  • Validate input/output compatibility between agents
  • Check permission requirements
  • Provide intelligent suggestions to users
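For instance, input/output compatibility between two skills can be checked by comparing their JSON schemas. A deliberately simplified sketch (the real matching is richer than this required-keys check):

```python
def compatible(upstream_output_schema, downstream_input_schema):
    """True if every required input of the downstream skill is
    produced as an output property of the upstream skill."""
    produced = set(upstream_output_schema.get("properties", {}))
    required = set(downstream_input_schema.get("required", []))
    return required <= produced
```

With the `analyze_csv` card above, a downstream skill requiring `insights` would be compatible, while one requiring `csv_file` would not.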

Error Handling & Recovery

The Orchestrator implements robust error handling:

Agent Failure Recovery

instruction: |
  Error handling guidelines:
  
  1. If an agent call fails:
     - Check the error message for actionable issues
     - Consider calling an alternative agent with similar capabilities
     - Break down the task into smaller pieces if too complex
     
  2. If an agent is unavailable:
     - Use list_agents to find alternatives
     - Explain the situation to the user
     - Offer to proceed with available agents or wait
     
  3. If agent output is incomplete or malformed:
     - Retry the same agent with more specific instructions
     - Use a validation agent to check quality
     - Fall back to a simpler approach if needed

Timeout Handling

The Orchestrator can set timeouts for agent calls:
{
  "name": "call_agent",
  "arguments": {
    "agent_name": "slow_research_agent",
    "message": "Deep research on topic",
    "timeout_seconds": 120
  }
}
If the timeout is exceeded:
  • The Orchestrator receives a timeout error
  • It can retry, use partial results, or switch agents
  • It notifies the user about the delay
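A sketch of how such a timeout could be enforced around an agent call using `asyncio.wait_for`; `call_with_timeout`, `slow_agent`, and the returned status dict are illustrative, not the framework API:

```python
import asyncio

async def call_with_timeout(call_agent, name, message, timeout_seconds):
    """Wrap a delegated agent call with a deadline; on expiry the
    orchestrator can retry, switch agents, or surface the delay."""
    try:
        return await asyncio.wait_for(
            call_agent(name, message), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        return {"status": "timeout", "agent": name}

async def slow_agent(name, message):
    await asyncio.sleep(1)  # simulate a long-running task
    return {"status": "success", "agent": name}

result = asyncio.run(call_with_timeout(
    slow_agent, "slow_research_agent", "Deep research", timeout_seconds=0.01))
```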

Advanced Features

Context Propagation

Pass artifacts and context between agents:
// First agent creates artifact
{
  "name": "call_agent",
  "arguments": {
    "agent_name": "data_collector",
    "message": "Collect sales data for Q4"
  }
}
// Returns: {"artifact_ref": "sales_data_q4.csv"}

// Second agent uses artifact
{
  "name": "call_agent",
  "arguments": {
    "agent_name": "data_analysis_agent",
    "message": "Analyze the sales data",
    "context": {
      "input_file": "{{ARTIFACT:sales_data_q4.csv}}"
    }
  }
}
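One way to resolve `{{ARTIFACT:...}}` placeholders before delegation, as a sketch; in practice the framework resolves references against the configured artifact service, and `resolve_context` is a hypothetical helper:

```python
import re

# Matches placeholder strings of the form "{{ARTIFACT:<reference>}}".
_ARTIFACT_RE = re.compile(r"\{\{ARTIFACT:([^}]+)\}\}")

def resolve_context(context, load_artifact):
    """Replace artifact placeholders in a context dict with the
    loaded artifact content; pass other values through unchanged."""
    resolved = {}
    for key, value in context.items():
        match = _ARTIFACT_RE.fullmatch(value) if isinstance(value, str) else None
        resolved[key] = load_artifact(match.group(1)) if match else value
    return resolved
```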

User Interaction

The Orchestrator can ask the user for input during execution:
I've found three potential approaches:

1. Use the research agent for web search (faster, less comprehensive)
2. Use the academic agent for scholarly sources (slower, more authoritative)
3. Use both in parallel and combine results

Which would you prefer?
The user’s choice influences subsequent routing decisions.

Cost Optimization

The Orchestrator can be instructed to optimize for cost:
instruction: |
  Cost-aware routing:
  - Prefer smaller/faster models for simple tasks
  - Use expensive models only when quality is critical
  - Consider caching results for similar requests
  - Batch requests when possible to reduce overhead
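A toy illustration of the first two guidelines; the model names and the word-count threshold are assumptions, not framework defaults:

```python
def pick_model(task, critical=False):
    """Route to a cheaper model for short, simple tasks and reserve
    the expensive model for quality-critical work."""
    if critical:
        return "anthropic/claude-4.5-sonnet"  # expensive, highest quality
    if len(task.split()) < 20:
        return "small-fast-model"             # cheap for simple tasks
    return "mid-tier-model"
```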

Monitoring & Observability

Track Orchestrator performance:

Agent Call Metrics

  • Number of agent invocations per request
  • Success/failure rates by agent
  • Average latency per agent
  • Cost per orchestration session

Routing Analytics

  • Most frequently called agent combinations
  • Common routing patterns
  • Failed routing attempts
  • User satisfaction by routing strategy

Logging Best Practices

Structured logging for orchestration decisions:
log.info(
    "Orchestration plan",
    extra={
        "user_request": user_message,
        "planned_agents": ["research_agent", "analysis_agent"],
        "estimated_steps": 3,
        "session_id": session_id,
    }
)

log.info(
    "Agent call result",
    extra={
        "agent_name": "research_agent",
        "status": "success",
        "latency_ms": 2341,
        "artifact_count": 2,
    }
)

Orchestrator vs Workflow Comparison

| Feature        | Orchestrator                     | Workflow                    |
| -------------- | -------------------------------- | --------------------------- |
| Routing Logic  | Dynamic (LLM-driven)             | Static (predefined DAG)     |
| Flexibility    | Adapts to request content        | Fixed structure             |
| Predictability | Variable execution path          | Deterministic               |
| Complexity     | Simpler config, complex behavior | Complex config, predictable |
| Use Case       | Conversational, exploratory      | Production, compliance      |
| Error Recovery | LLM decides strategy             | Predefined retry policies   |
| Cost           | Variable (LLM calls)             | Fixed (DAG execution)       |
| Control Flow   | Inferred by LLM                  | Explicit (loops, branches)  |

Best Practices

Design clear agent cards. Ensure all agents have:
  • Precise capability descriptions
  • Well-defined input/output schemas
  • Example use cases in the agent card
  • Clear scope boundaries
This helps the Orchestrator make accurate routing decisions.

Write effective instructions:
  • Provide explicit routing guidelines
  • Include examples of successful orchestrations
  • Define fallback strategies for failures
  • Set clear quality criteria for results

Test orchestration with:
  • Simple single-agent requests
  • Complex multi-agent workflows
  • Edge cases (missing agents, timeouts)
  • User clarification scenarios
  • Performance under load

Optimize performance:
  • Use agent capability caching
  • Implement request deduplication
  • Set appropriate timeouts
  • Monitor and optimize slow paths
  • Consider parallel execution where safe

Limitations & Considerations

Key Limitations:
  • Non-determinism: Same request may use different agents on retry
  • LLM Costs: Each routing decision consumes LLM tokens
  • Latency: Sequential LLM calls add overhead
  • Debugging: Dynamic routing harder to debug than static workflows
When NOT to use Orchestrator:
  • Regulatory compliance requiring audit trails
  • Real-time systems needing predictable latency
  • High-volume batch processing
  • Simple linear workflows with known steps

Next Steps

Workflows

Learn about static DAG-based orchestration

A2A Protocol

Deep dive into agent communication

Agents

Build specialized agents for orchestration

Agent Cards

Design effective capability advertisements
