The agent system is the core of Esprit’s autonomous security testing capabilities. It combines LLM-powered decision-making with structured tool execution to perform comprehensive security assessments.

Agent Hierarchy

BaseAgent Class

All agents inherit from BaseAgent (located in esprit/agents/base_agent.py), which provides:

Core Capabilities

  • Agent Loop: Main execution loop that processes tasks until completion
  • LLM Integration: Communicates with language models for decision-making
  • Tool Execution: Routes tool calls to appropriate executors
  • State Management: Maintains conversation history and execution state
  • Error Handling: Automatic retry logic for transient failures
  • Sandbox Coordination: Manages sandbox lifecycle and communication

Configuration

config = {
    "llm_config": LLMConfig(...),
    "max_iterations": 300,
    "non_interactive": False,
    "local_sources": [...],
    "llm_auto_resume_max_attempts": 2,
    "llm_auto_resume_cooldown_seconds": 12.0,
    "non_interactive_wait_timeout_seconds": 90.0,
}
  • llm_config: LLM provider settings (model, API key, skills)
  • max_iterations: Maximum agent loop iterations before forced stop (default: 300)
  • non_interactive: If true, agent stops on errors instead of waiting for user input
  • local_sources: List of local directories to copy into sandbox
  • llm_auto_resume_max_attempts: Number of automatic retries for transient LLM failures
  • llm_auto_resume_cooldown_seconds: Delay between retry attempts
  • non_interactive_wait_timeout_seconds: How long to wait before auto-resuming in non-interactive mode
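
The defaulting behavior implied by the list above can be sketched as a small merge helper. `DEFAULTS` and `with_defaults` are illustrative names, not part of the esprit API, and values not stated explicitly in the docs are assumptions:

```python
# Defaults drawn from the configuration list above (illustrative sketch).
DEFAULTS = {
    "max_iterations": 300,
    "non_interactive": False,
    "local_sources": [],
    "llm_auto_resume_max_attempts": 2,
    "llm_auto_resume_cooldown_seconds": 12.0,
    "non_interactive_wait_timeout_seconds": 90.0,
}

def with_defaults(config: dict) -> dict:
    """Return a config dict with any missing keys filled from DEFAULTS."""
    return {**DEFAULTS, **config}
```

Caller-supplied keys always win, so passing only the keys you want to change is enough.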

The Agent Loop

The agent loop (agent_loop method in base_agent.py:215) is the heart of agent execution:

Loop Phases

1. Initialization

await self._initialize_sandbox_and_state(task)

  • Creates the sandbox environment (Docker or Cloud)
  • Copies local sources to /workspace
  • Adds the initial task message to the conversation
  • Handles SandboxInitializationError gracefully

2. Message Check

self._check_agent_messages(self.state)

  • Checks for messages from other agents or the user
  • Resumes from the waiting state when new messages arrive
  • Supports inter-agent communication

3. LLM Generation

async for response in self.llm.generate(
    self.state.get_conversation_history(),
    tools=tools
):
    # Stream LLM response
    # Extract tool invocations
    # Execute tools via tool executor

Key behaviors:
  • Rejects empty responses (lines 585-597)
  • Supports native tool calling (OpenAI format)
  • Adds messages with thinking_blocks (extended thinking)
  • Updates the telemetry tracer with streaming content

4. Tool Execution

should_agent_finish = await process_tool_invocations(
    actions, conversation_history, self.state
)

  • Executes tools in the sandbox via the HTTP API
  • Returns True if finish_scan or agent_finish was called
  • Handles cancellation gracefully
  • Updates the conversation with tool results

5. Iteration Limits

Warnings are emitted at iteration thresholds:
  • 80% (240/300): “Approaching maximum iteration limit”
  • 297/300: “CRITICAL: You have only 3 iterations left!”

This forces explicit use of a finish tool and prevents infinite loops.

EspritAgent

The main security testing agent (esprit/agents/EspritAgent/esprit_agent.py):
class EspritAgent(BaseAgent):
    max_iterations = 300

    def __init__(self, config: dict[str, Any]):
        # Root agents (agents with no parent) get the "root_agent" skill automatically
        default_skills = []
        if config.get("parent_id") is None:
            default_skills = ["root_agent"]

        self.default_llm_config = LLMConfig(skills=default_skills)
        super().__init__(config)

Root Agent Skill

Root agents receive special instructions via the root_agent skill (system prompt injection) that includes:
  • Security testing methodology
  • Target analysis guidelines
  • Sub-agent creation strategies
  • Reporting requirements
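
System prompt injection along these lines can be sketched as follows. `SKILL_PROMPTS` and `build_system_prompt` are hypothetical names for illustration, and the prompt text is a stand-in, not the real root_agent skill:

```python
# Illustrative skill-to-prompt mapping; the real skill text lives in esprit.
SKILL_PROMPTS = {
    "root_agent": (
        "You are the root security agent. Analyze the target, delegate "
        "subtasks to sub-agents, and compile the final report."
    ),
}

def build_system_prompt(base_prompt: str, skills: list[str]) -> str:
    """Append each known skill's instructions to the base system prompt."""
    sections = [base_prompt]
    for skill in skills:
        if skill in SKILL_PROMPTS:
            sections.append(SKILL_PROMPTS[skill])
    return "\n\n".join(sections)
```

Unknown skill names are silently ignored, so adding a skill to an agent's LLMConfig is a no-op until a prompt is registered for it.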

Multi-Agent Orchestration

Agent Graph

Agents are tracked in a global graph structure (esprit/tools/agents_graph/agents_graph_actions.py):
_agent_graph = {
    "nodes": {  # agent_id -> agent metadata
        "agent-123": {
            "id": "agent-123",
            "name": "Root Agent",
            "task": "Security scan of example.com",
            "status": "running",
            "parent_id": None,
            "llm_config": "default",
            "agent_type": "EspritAgent",
        }
    },
    "edges": [  # parent -> child relationships
        {"from": "agent-123", "to": "agent-456", "type": "delegation"}
    ]
}

Agent Communication

Agents communicate via message passing (base_agent.py:721-793):
# Incoming inter-agent messages are wrapped in an XML envelope before
# being appended to the conversation
formatted_message = f"""<inter_agent_message>
    <sender>
        <agent_name>{sender_name}</agent_name>
        <agent_id>{sender_id}</agent_id>
    </sender>
    <message_metadata>
        <type>{message_type}</type>
        <priority>{priority}</priority>
    </message_metadata>
    <content>
{message_content}
    </content>
</inter_agent_message>"""
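
On the receiving side, the envelope fields could be recovered with a sketch like this. `parse_inter_agent_message` is a hypothetical helper; a real implementation would use a proper XML parser rather than regexes:

```python
import re

def parse_inter_agent_message(raw: str) -> dict[str, str]:
    """Naively extract fields from the <inter_agent_message> envelope above."""
    def grab(tag: str) -> str:
        m = re.search(rf"<{tag}>(.*?)</{tag}>", raw, re.DOTALL)
        return m.group(1).strip() if m else ""

    return {
        "sender_name": grab("agent_name"),
        "sender_id": grab("agent_id"),
        "type": grab("type"),
        "priority": grab("priority"),
        "content": grab("content"),
    }
```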

Agent Lifecycle

  1. Creation: Agent added to graph with “running” status
  2. Execution: Processes iterations, executes tools
  3. Waiting: Pauses when waiting for input or sub-agent completion
  4. Stopping: User cancellation or error
  5. Completed: Finish tool called successfully
  6. Failed: Unrecoverable error occurred
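
The lifecycle above suggests a small state machine. The enum values match the statuses named in this page; the transition table is an assumption for illustration, not the actual esprit logic:

```python
from enum import Enum

class AgentStatus(str, Enum):
    RUNNING = "running"
    WAITING = "waiting"
    STOPPING = "stopping"
    COMPLETED = "completed"
    FAILED = "failed"

# Hypothetical allowed transitions matching the lifecycle described above.
TRANSITIONS = {
    AgentStatus.RUNNING: {AgentStatus.WAITING, AgentStatus.STOPPING,
                          AgentStatus.COMPLETED, AgentStatus.FAILED},
    AgentStatus.WAITING: {AgentStatus.RUNNING, AgentStatus.STOPPING,
                          AgentStatus.FAILED},
    AgentStatus.STOPPING: {AgentStatus.COMPLETED, AgentStatus.FAILED},
    AgentStatus.COMPLETED: set(),  # terminal
    AgentStatus.FAILED: set(),     # terminal
}

def can_transition(src: AgentStatus, dst: AgentStatus) -> bool:
    """Check whether a status change is allowed by the table above."""
    return dst in TRANSITIONS[src]
```

Completed and failed are terminal: once an agent reaches either, no further transitions are permitted.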

State Management

Agent state (esprit/agents/state.py) tracks:
  • Conversation history: All messages between user, assistant, and tools
  • Actions history: All tool invocations
  • Sandbox info: Workspace ID, auth token, API URL
  • Status flags: is_waiting, llm_failed, should_stop
  • Iteration count: Current iteration number
  • Final result: Scan results when completed

Error Handling

LLM Errors
  • Retryable errors (HTTP 429, 500, 502, 503, 504): auto-retry with cooldown
  • Non-retryable errors: enter waiting state and require user intervention
  • HTTP status codes are extracted from error messages for intelligent retry logic

Sandbox Errors
  • Enters waiting state with “sandbox_failed” status
  • Logs detailed error information to telemetry
  • In non-interactive mode, returns an error result immediately

Tool Execution Errors
  • Catches RuntimeError, ValueError, TypeError
  • Logs the full stack trace
  • Enters waiting state for user troubleshooting

User Cancellation
  • Supports graceful cancellation via cancel_current_execution()
  • Partial LLM responses are marked “[ABORTED BY USER]”
  • Tool execution tasks are cancelled cleanly
Non-Interactive Mode

When non_interactive=True:
  • Agents terminate immediately on errors instead of waiting
  • Root agents auto-resume if no messages after 90 seconds (configurable)
  • Maximum 4 auto-resume attempts before scan marked as stalled
  • Prevents deadlocks in CI/CD pipelines
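
The auto-resume decision can be sketched as a pure function. `should_auto_resume` is a hypothetical name, with the 90-second timeout and 4-attempt cap taken from the bullets above:

```python
def should_auto_resume(seconds_idle: float, attempts_used: int,
                       wait_timeout: float = 90.0,
                       max_attempts: int = 4) -> str:
    """Decide the next action for an idle non-interactive root agent.

    Returns "wait" while inside the timeout window, "resume" to kick the
    agent back into its loop, or "stalled" once the attempt budget is spent.
    """
    if seconds_idle < wait_timeout:
        return "wait"
    if attempts_used >= max_attempts:
        return "stalled"
    return "resume"
```

A watchdog would call this periodically and mark the scan as stalled once it returns "stalled", rather than blocking a CI/CD pipeline forever.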

Next Steps

Tools

Explore available tools for agents

Docker Sandbox

Learn about sandbox environments
