Overview

PentAGI employs a sophisticated multi-agent architecture where specialized AI agents collaborate to conduct penetration tests. Each agent has distinct capabilities, tool access, and reasoning patterns optimized for specific phases of security testing.
The multi-agent system is optional and can be disabled per assistant. When disabled, a single agent handles all operations directly.

Agent Roles

The system features specialized agents that work together in a coordinated workflow:

Researcher Agent

Purpose: Information gathering, reconnaissance, and vulnerability analysis
Capabilities:
  • Web intelligence gathering through integrated browser and search APIs
  • Target enumeration and service discovery
  • Vulnerability database queries
  • OSINT (Open Source Intelligence) collection
  • Security advisory research
Tools:
  • Search engines (Tavily, Traversaal, Perplexity, DuckDuckGo, Google, Searxng)
  • Web scraper with isolated browser
  • Memory search for historical reconnaissance data
  • Knowledge graph queries for similar targets
Reasoning Pattern: Broad exploratory analysis focusing on gathering comprehensive information about targets before exploitation attempts.

Developer Agent

Purpose: Attack planning, payload development, and exploit adaptation
Capabilities:
  • Exploit development and customization
  • Attack chain planning
  • Tool selection and configuration
  • Payload crafting for specific vulnerabilities
  • Technique adaptation based on target environment
Tools:
  • Memory search for successful exploit patterns
  • Knowledge graph queries for attack relationships
  • Access to exploit databases and tool documentation
  • Code generation capabilities for custom exploits
Reasoning Pattern: Strategic planning with emphasis on creating targeted, effective attack approaches based on researcher findings.

Executor Agent

Purpose: Command execution, tool operation, and result validation
Capabilities:
  • Security tool execution (nmap, metasploit, sqlmap, etc.)
  • Command-line operations in sandboxed environment
  • Output analysis and validation
  • Result documentation
  • Error handling and retry logic
Tools:
  • 20+ professional pentesting tools in sandboxed containers
  • Shell access for custom commands
  • Memory storage for execution results
  • Knowledge graph updates with findings
Reasoning Pattern: Precise execution with focus on command accuracy, output interpretation, and failure recovery.

Agent Coordination

Agents communicate through a structured delegation system:

Delegation Process

1. Task Analysis: The orchestrator analyzes the user request and current context to determine which agent is most appropriate.
2. Context Preparation: Relevant information from memory and the knowledge graph is assembled for the specialized agent.
3. Agent Invocation: The selected agent receives:
  • Specific task description
  • Available tools and their schemas
  • Historical context from similar operations
  • Constraints and safety parameters
4. Execution: The agent performs its specialized function using available tools and reasoning capabilities.
5. Result Integration: Outputs are stored in memory/knowledge graph and returned to the orchestrator.
6. Continuation: The orchestrator decides whether to delegate further tasks or synthesize results.
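The delegation cycle above can be sketched as a simple orchestrator loop. The types and function names here are illustrative stand-ins, not PentAGI's actual API; a real cycle passes tool schemas, memory context, and safety constraints to the agent as well:

```go
package main

import "fmt"

// Agent is a minimal stand-in for a specialized agent.
type Agent interface {
	Name() string
	Execute(task string) string
}

type researcher struct{}

func (researcher) Name() string { return "researcher" }
func (researcher) Execute(task string) string {
	return "recon results for: " + task
}

// selectAgent models step 1 (task analysis): pick the agent best suited
// to the request. A trivial rule stands in for the orchestrator's LLM call,
// which chooses among researcher, developer, and executor.
func selectAgent(task string) Agent {
	return researcher{}
}

// delegate models steps 2-5: prepare context, invoke the selected agent,
// and return its output for integration into memory and the knowledge graph.
func delegate(task string) string {
	agent := selectAgent(task)
	result := agent.Execute(task)
	fmt.Printf("[%s] %s\n", agent.Name(), result)
	return result
}

func main() {
	delegate("enumerate services on target host")
}
```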

Agent Configuration

Each agent can be configured with different LLM models optimized for their specific roles:
# Example agent configuration from provider YAML
researcher:
  model: "gpt-4.1"
  temperature: 0.7
  max_tokens: 4096
  reasoning: "extended"  # For complex research analysis

developer:
  model: "claude-sonnet-4"
  temperature: 0.3
  max_tokens: 8192
  reasoning: "standard"  # For strategic planning

executor:
  model: "gpt-4.1-mini"
  temperature: 0.1
  max_tokens: 2048
  reasoning: "none"  # Fast, deterministic execution

Model Selection Considerations

Researcher: Benefits from models with strong reasoning capabilities and broad knowledge (e.g., Claude Sonnet, GPT-4.1, Gemini Pro).
Developer: Requires models with excellent code generation and strategic thinking (e.g., Claude Sonnet, DeepSeek Coder, GPT-4.1).
Executor: Optimized for speed and accuracy; smaller models are often sufficient (e.g., GPT-4.1-mini, Claude Haiku, Gemini Flash).

Tool Access Control

Agents have different tool permissions based on their roles. For example, the Researcher's tool schema exposes functions like:
{
  "search_web": "Search the internet for information",
  "search_memory": "Query historical research findings",
  "search_knowledge_graph": "Find related vulnerabilities and targets",
  "browse_url": "Visit and analyze web pages",
  "analyze_service": "Examine service configurations"
}
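One way to express role-based tool permissions is an allow-list per agent type. The researcher entry below reuses the tool names from the example above; the developer and executor entries, and the registry shape itself, are illustrative assumptions rather than PentAGI's actual tool registry:

```go
package main

import "fmt"

// toolAllowList maps each agent role to the tools it may call.
var toolAllowList = map[string][]string{
	"researcher": {"search_web", "search_memory", "search_knowledge_graph", "browse_url", "analyze_service"},
	"developer":  {"search_memory", "search_knowledge_graph"},
	"executor":   {"run_command", "search_memory"},
}

// canUse reports whether the given agent role is permitted to call a tool.
func canUse(role, tool string) bool {
	for _, t := range toolAllowList[role] {
		if t == tool {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canUse("researcher", "browse_url")) // true
	fmt.Println(canUse("executor", "browse_url"))   // false
}
```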

Agent Communication Patterns

Sequential Delegation

Most common pattern where agents work in sequence:
  1. Researcher gathers target information
  2. Developer creates attack plan based on findings
  3. Executor runs the planned attacks
  4. Orchestrator synthesizes results
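The sequential pattern can be viewed as a pipeline in which each stage consumes the previous stage's output. The stage functions below are placeholders; in practice every step is a full agent invocation with its own tools and context:

```go
package main

import "fmt"

// Each stage stands in for a full agent invocation.
func research(target string) string { return "open ports on " + target }
func plan(findings string) string   { return "attack plan for: " + findings }
func execute(p string) string       { return "executed: " + p }

// runSequential chains the three agents in the order described above,
// then returns the combined result for the orchestrator to synthesize.
func runSequential(target string) string {
	findings := research(target)
	attack := plan(findings)
	return execute(attack)
}

func main() {
	fmt.Println(runSequential("10.0.0.5"))
}
```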

Iterative Refinement

Agents may be called multiple times with updated context. For example, if an exploit attempt fails, the Developer can be re-invoked with the Executor's error output to revise the attack plan before another run.

Parallel Investigation

For complex targets, multiple researchers may investigate different aspects simultaneously (planned feature).

Memory and Context Management

Agent-Specific Memory

Each agent type can filter memory searches to retrieve relevant past experiences:
// From memory.go implementation
filters := map[string]any{
    "flow_id":  flowID,
    "doc_type": "memory",
    "task_id":  taskID,      // Optional: specific task context
    "subtask_id": subtaskID, // Optional: specific subtask context
}

docs, err := store.SimilaritySearch(
    ctx,
    query,
    maxResults,
    vectorstores.WithScoreThreshold(0.2),
    vectorstores.WithFilters(filters),
)

Context Windows

Agents have different context window requirements:
  • Researcher: Large context (64K-100K tokens) for comprehensive analysis
  • Developer: Medium context (32K-64K tokens) for code and plans
  • Executor: Smaller context (8K-32K tokens) for focused operations

Chain Summarization

To manage growing conversation histories, PentAGI implements intelligent summarization:
  • Preserves recent messages in full detail
  • Summarizes older messages while maintaining critical information
  • Keeps tool calls and responses intact for debugging
  • Configurable thresholds per agent type
See Memory System for more details.
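A minimal version of threshold-based summarization keeps the newest messages verbatim, collapses the rest into a single summary entry, and never drops tool-call records. This is a sketch of the idea only, not PentAGI's implementation:

```go
package main

import "fmt"

type message struct {
	role       string
	content    string
	isToolCall bool // tool calls/responses are preserved for debugging
}

// summarize collapses messages older than the last keepRecent into one
// summary entry, while keeping all tool-call messages intact.
func summarize(history []message, keepRecent int) []message {
	if len(history) <= keepRecent {
		return history
	}
	cut := len(history) - keepRecent
	out := []message{{role: "system", content: fmt.Sprintf("summary of %d earlier messages", cut)}}
	for _, m := range history[:cut] {
		if m.isToolCall {
			out = append(out, m) // keep tool calls verbatim
		}
	}
	return append(out, history[cut:]...)
}

func main() {
	h := []message{
		{role: "user", content: "scan target"},
		{role: "assistant", content: "nmap output", isToolCall: true},
		{role: "user", content: "what next?"},
		{role: "assistant", content: "try sqlmap"},
	}
	for _, m := range summarize(h, 2) {
		fmt.Println(m.role+":", m.content)
	}
}
```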

Agent Performance Optimization

Tool Call ID Detection

The system automatically detects LLM provider-specific tool call ID patterns:
// From agents.go implementation
template, err := DetermineToolCallIDTemplate(
    ctx, provider, agentType, prompter,
)
// Examples:
// OpenAI: "call_{r:24:x}"
// Anthropic: "toolu_{r:24:b}"
// Gemini: "{f}:{r:1:d}"
This enables proper function call tracking and response correlation.

Retry Logic

Agents implement sophisticated retry mechanisms:
  • Function call failures trigger retries with error context
  • Model timeouts result in fallback models
  • Invalid responses prompt self-correction
  • Configurable max retries per operation

Barrier Functions

Certain tools act as “barrier functions” that require human approval:
  • Destructive operations (data deletion, service shutdown)
  • Potentially illegal actions (without explicit authorization)
  • Operations affecting production systems
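A barrier check can be modeled as a gate that refuses to run flagged tools without an explicit approval callback. This is a sketch of the pattern, not PentAGI's actual gating code; the tool names are hypothetical:

```go
package main

import (
	"errors"
	"fmt"
)

// barrierTools marks operations that must not run without human approval.
var barrierTools = map[string]bool{
	"delete_data":      true,
	"shutdown_service": true,
}

// runTool executes a tool, pausing for approval when the tool is a barrier
// function. approve is a stand-in for the human-in-the-loop prompt.
func runTool(name string, approve func(tool string) bool) (string, error) {
	if barrierTools[name] && !approve(name) {
		return "", errors.New("blocked by barrier: " + name)
	}
	return "ran " + name, nil
}

func main() {
	deny := func(string) bool { return false }
	if _, err := runTool("shutdown_service", deny); err != nil {
		fmt.Println(err) // blocked by barrier: shutdown_service
	}
	out, _ := runTool("port_scan", deny)
	fmt.Println(out) // ran port_scan
}
```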

Enabling/Disabling Agent Delegation

Agent delegation can be controlled at multiple levels:

Global Default

# In .env file
ASSISTANT_USE_AGENTS=false  # Disable by default
ASSISTANT_USE_AGENTS=true   # Enable by default

Per-Assistant Configuration

Users can toggle agent delegation in the UI when creating or editing assistants, overriding the global default.

When to Use Single vs Multi-Agent

Single Agent Mode (Faster):
  • Simple, straightforward tasks
  • Speed is critical
  • Minimal context switching needed
  • Direct command execution
Multi-Agent Mode (More Capable):
  • Complex penetration testing scenarios
  • Research-heavy operations
  • Multi-phase attack planning
  • Learning from diverse past experiences
