Skip to main content

Agents

GAIA uses a sophisticated multi-agent architecture where specialized AI agents collaborate to handle different aspects of your tasks. Think of it as having a team of experts, each with deep knowledge of specific tools and domains.

Agent Architecture Overview

The Executor Agent is your primary interface. It decides when to handle tasks directly vs. delegating to specialized subagents.

Agent Types

1. Executor Agent (Main Agent)

The orchestrator that manages conversations and delegates to specialists.
# From: apps/api/app/agents/core/agent.py:117-161

async def call_agent(
    request: MessageRequestWithHistory,
    conversation_id: str,
    user: dict,
    user_time: datetime,
    user_model_config: Optional[ModelConfig] = None,
) -> AsyncGenerator[str, None]:
    """
    Execute agent in streaming mode for interactive chat.
    
    The executor agent:
    - Maintains conversation context
    - Analyzes user intent
    - Decides on tool usage
    - Delegates to subagents when needed
    - Coordinates multi-step workflows
    """
Responsibilities:
  • Conversation management
  • Intent understanding
  • Task coordination
  • Result synthesis
  • User communication
Available Tools:
  • Search & web browsing
  • Document generation
  • Todo management
  • Reminder creation
  • Workflow operations
  • Memory search
  • Support tickets
  • Weather info
  • Code execution

2. Provider Subagents

Specialized agents for deep integration expertise.
# Specialized for email operations

GMAIL_AGENT_CAPABILITIES = {
    "expertise": "Email composition, search, organization",
    "tools": [
        "GMAIL_SEND_MESSAGE",
        "GMAIL_SEARCH_MESSAGES", 
        "GMAIL_CREATE_LABEL",
        "GMAIL_MODIFY_MESSAGE",
        "GMAIL_CREATE_DRAFT",
        # ... 50+ more
    ],
    "special_knowledge": [
        "Email threading and conversation history",
        "Label hierarchy and organization",
        "Advanced search syntax",
        "Attachment handling",
        "HTML email composition"
    ]
}
When to Use:
  • Sending/receiving emails
  • Complex email searches
  • Email organization
  • Draft management

Agent Communication

Handoff Pattern

# From: apps/api/app/agents/core/subagents/handoff_tools.py

async def handoff(
    subagent_id: str,     # Which subagent to use
    task: str,            # What they should do
    config: RunnableConfig
) -> str:
    """
    Delegate a task to a specialized subagent.
    
    Example:
    handoff(
        subagent_id="gmail",
        task="Search for emails from [email protected] in last 7 days"
    )
    
    The subagent:
    1. Receives full context
    2. Uses its specialized tools
    3. Returns results to executor
    """

Communication Flow

1

User Request

“Find my client’s email and add their meeting to my calendar”
2

Executor Analysis

# Executor agent understands this requires:
# 1. Email search (Gmail subagent)
# 2. Calendar creation (Calendar subagent)

# Plan execution steps
steps = [
    {"action": "handoff", "subagent": "gmail", "task": "search"},
    {"action": "handoff", "subagent": "calendar", "task": "create_event"}
]
3

Gmail Subagent Execution

# Gmail agent receives task
await gmail_agent.execute(
    task="Search for emails from [email protected]"
)

# Uses specialized Gmail knowledge
# Returns: Email thread with meeting details
4

Calendar Subagent Execution

# Calendar agent receives task with context
await calendar_agent.execute(
    task="Create calendar event",
    context={
        "meeting_details": "from gmail agent results",
        "attendees": ["[email protected]"],
        "duration": "30 minutes"
    }
)

# Returns: Event created successfully
5

Executor Synthesis

# Executor combines results
response = """
I found the email from your client and created a calendar event:

📧 Email: "Proposal Discussion" from [email protected]
📅 Event: Tuesday, Dec 17 at 2:00 PM (30 min)
✅ Calendar invite sent to [email protected]
"""

Agent Graph Architecture

LangGraph Implementation

# From: apps/api/app/agents/core/graph_builder/build_graph.py

def create_agent(
    llm: LanguageModelLike,
    tool_registry: dict,
    agent_name: str,
    pre_model_hooks: List[Callable],
    end_graph_hooks: List[Callable]
):
    """
    Build an agent graph with:
    - Pre-processing nodes (message filtering, prompt management)
    - LLM execution node
    - Tool execution nodes
    - Post-processing nodes (memory learning, follow-up actions)
    
    Returns compiled StateGraph ready for execution.
    """

Node Pipeline

# From: apps/api/app/agents/core/subagents/base_subagent.py:88-94

# Every agent has this node pipeline:
pre_model_hooks = [
    filter_messages_node,         # Remove irrelevant messages
    manage_system_prompts_node,   # Inject memories & skills
    trim_messages_node,           # Keep context within limits
]

end_graph_hooks = [
    memory_learning_node,         # Learn from execution
    follow_up_actions_node,       # Suggest next steps (main agent only)
]

State Management

Agent State

# From: apps/api/app/agents/core/state.py

class State(BaseModel):
    query: str                        # User's current request
    messages: List[AnyMessage]        # Conversation history
    current_datetime: str             # For time-aware operations
    mem0_user_id: str                 # Memory namespace
    memories: List[str]               # Retrieved user memories
    memories_stored: bool             # Tracking flag
    conversation_id: str              # Thread identifier

Checkpointer System

# From: apps/api/app/agents/core/graph_builder/checkpointer_manager.py

class CheckpointerManager:
    """
    Manages PostgreSQL-backed state persistence.
    
    Enables:
    - Resume interrupted conversations
    - Time-travel debugging
    - Branching conversations
    - State recovery after errors
    """
    
    async def get_checkpointer(self):
        """Returns PostgreSQL checkpointer for state storage."""
Each agent execution creates checkpoints at every node transition. This allows precise state recovery and conversation branching.

Agent Configuration

Graph Manager

# From: apps/api/app/agents/core/graph_manager.py

class GraphManager:
    """Singleton manager for agent graphs."""
    
    _graphs: Dict[str, CompiledGraph] = {}
    
    @classmethod
    async def get_graph(cls, agent_name: str) -> CompiledGraph:
        """
        Get or create agent graph.
        
        Graphs are compiled once and reused for efficiency.
        Each user execution uses same graph with different config.
        """

Agent Config

# From: apps/api/app/helpers/agent_helpers.py

def build_agent_config(
    conversation_id: str,
    user: dict,
    user_time: datetime,
    user_model_config: Optional[ModelConfig],
    agent_name: str,
) -> dict:
    """
    Build configuration for agent execution.
    
    Config includes:
    - user_id: For memory and permissions
    - thread_id: For conversation tracking
    - user_time: For time-aware operations
    - user_timezone: For scheduling
    - model_config: LLM preferences
    - agent_name: Which agent is executing
    """

Tool Access Patterns

Tool Registry

# From: apps/api/app/agents/tools/core/registry.py:159-192

class ToolRegistry:
    """Central registry managing all agent tools."""
    
    def _add_category(
        self,
        name: str,              # "gmail"
        tools: List[BaseTool],
        space: str = "general", # Namespace for isolation
        require_integration: bool = False,
        is_delegated: bool = False  # Has subagent?
    ):
        """
        Register a category of tools.
        
        Delegated categories route to subagents.
        Non-delegated categories use direct tool execution.
        """

Tool Retrieval

# Non-delegated tools (todos, reminders, search)
# Executor agent has direct access

tools = [
    create_todo,
    list_todos,
    create_reminder,
    web_search_tool,
    generate_document
]

# Agent can call these directly without handoff

Agent Prompts

Specialized System Prompts

# From: apps/api/app/agents/prompts/

# Each subagent has custom prompts
PROMPTS = {
    "gmail_agent": {
        "system": "You are a Gmail expert. You understand email etiquette, ...",
        "skills": "Email composition patterns, threading, ...",
        "examples": "When user says 'email client', search for..."
    },
    
    "calendar_agent": {
        "system": "You are a scheduling expert. You understand availability, ...",
        "skills": "Time conflict resolution, recurring events, ...",
        "examples": "When user says 'next Tuesday', calculate..."
    },
    
    # ... prompts for each subagent
}

Dynamic Prompt Enhancement

# From: apps/api/app/agents/core/nodes/manage_system_prompts.py

async def manage_system_prompts_node(state, config):
    """
    Enhance base prompt with:
    
    1. User Context:
       - Retrieved memories
       - Current datetime
       - User timezone
    
    2. Agent Skills:
       - Learned procedures for this agent
       - Optimal approaches
       - Known gotchas
    
    3. Tool Availability:
       - Connected integrations
       - Available tool list
    """

Agent Execution Modes

Streaming Execution

# From: apps/api/app/agents/core/agent.py:117-161

async def call_agent(...) -> AsyncGenerator[str, None]:
    """
    Streaming mode for real-time user interactions.
    
    Yields:
    - Intermediate thinking
    - Tool call events
    - Tool results
    - Final response
    - Follow-up suggestions
    
    Used for: Chat interface, voice agents
    """

Silent Execution

# From: apps/api/app/agents/core/agent.py:163-193

async def call_agent_silent(...) -> tuple[str, dict]:
    """
    Silent mode for background processing.
    
    Returns:
    - Complete message
    - Tool usage data
    
    Used for: Workflows, scheduled tasks, webhooks
    """

Performance & Optimization

Parallel Tool Execution

# Agents can execute independent tools in parallel

# Sequential (slow)
gmail_result = await handoff("gmail", "search emails")
calendar_result = await handoff("calendar", "list events")

# Parallel (fast)
results = await asyncio.gather(
    handoff("gmail", "search emails"),
    handoff("calendar", "list events")
)

Caching

# Tool registry is cached
@cache
async def get_tool_registry() -> ToolRegistry:
    """Singleton registry - built once, used many times."""

# User capabilities are cached
@cache(ttl=300)  # 5 minutes
async def get_user_integration_capabilities(user_id: str):
    """Avoid repeated DB queries for same user."""

Token Optimization

# From: apps/api/app/agents/core/nodes/trim_messages_node.py

async def trim_messages_node(state, config):
    """
    Smart message trimming:
    - Keep system messages (always needed)
    - Keep recent messages (context)
    - Keep tool results (for coherence)
    - Summarize old history (reduce tokens)
    
    Target: Stay under model's context window
    """

Agent Evaluation

Testing Framework

# From: apps/api/app/agents/evals/

class AgentEvaluator:
    """
    Test agents against benchmark datasets.
    
    For each integration (Gmail, Slack, GitHub):
    - Load test scenarios
    - Execute agent
    - Compare output to expected results
    - Score: correctness, efficiency, safety
    """
    
    async def evaluate_agent(
        agent_name: str,
        test_dataset: str
    ) -> EvaluationReport:
        """Run full evaluation suite."""

Datasets

// From: apps/api/app/agents/evals/datasets/gmail.json

[
  {
    "scenario": "Send email to team",
    "user_input": "Email the team about tomorrow's meeting",
    "expected_tools": ["GMAIL_SEND_MESSAGE"],
    "expected_behavior": "Sends to team distribution list",
    "constraints": ["Must include meeting time", "Professional tone"]
  },
  // ... 50+ test scenarios per integration
]

Best Practices

Use Specific Subagents

Let executor delegate to subagents for provider operations. Don’t try to access integration tools directly.

Provide Context

When using handoff, include relevant context in the task description:
# Good
handoff("gmail", "Reply to email from boss about Q4 planning meeting")

# Bad
handoff("gmail", "Reply to email")

Trust Agent Intelligence

Agents are smart—don’t over-specify. Let them choose the best approach based on learned skills.

Monitor Learning

Check agent skill dashboards to see what procedures are being learned and validated.

Debugging Agents

Conversation Inspection

// From: apps/web/src/features/chat/

// Every conversation has full execution trace:
interface ConversationTrace {
  messages: Message[];
  tool_calls: ToolCall[];
  subagent_handoffs: Handoff[];
  state_checkpoints: Checkpoint[];
  memory_updates: MemoryUpdate[];
}

// Inspect exactly what happened at each step

Error Handling

# Agents have robust error handling

try:
    result = await handoff("gmail", task)
except ToolExecutionError as e:
    # Log error, inform user, suggest alternatives
    logger.error(f"Tool execution failed: {e}")
    return "I had trouble with Gmail. Would you like me to try another approach?"

except RateLimitError as e:
    # Automatic retry with backoff
    await asyncio.sleep(e.retry_after)
    result = await handoff("gmail", task)

Next Steps:

Build docs developers (and LLMs) love