
Overview

RAPTOR uses multi-agent orchestration to coordinate specialized AI agents that work together on complex security analysis tasks. Each agent has a specific role, skills, and tools.

17 Specialized Agents

Purpose-built agents for evidence collection, analysis, verification, and reporting

Sequential & Parallel Execution

Run agents in parallel for efficiency or sequentially when dependencies exist

Agent Communication

Agents communicate through shared files and structured data formats

Workflow Orchestration

Main orchestrator manages agent lifecycle and coordinates complex workflows

Architecture

Orchestrator Responsibilities

1

Agent Spawning

The orchestrator is the ONLY component that spawns agents. Agents never spawn other agents.
2

Workflow Coordination

Manages phase transitions and ensures agents run in correct order with proper dependencies.
3

Data Passing

Provides working directory path to all agents for shared file access.
4

Error Handling

Catches agent failures, implements retry logic, and handles graceful degradation.

Agent Types

Evidence Collection Agents

These agents collect forensic evidence from different sources and write to the shared evidence store.
Purpose: Query GitHub Archive via BigQuery for tamper-proof event data
Skills: github-archive, github-evidence-kit
Collects:
  • PushEvent (commits pushed)
  • PullRequestEvent (PRs opened/closed/merged)
  • IssuesEvent (issues opened/closed)
  • CreateEvent/DeleteEvent (branches/tags created/deleted)
  • WorkflowRunEvent (GitHub Actions runs)
Output: Writes events to evidence.json
Invocation:
Task: oss-investigator-gh-archive-agent
  Prompt: "Collect evidence from GH Archive for <research question>.
           Working directory: <workdir>
           Targets: repos=<repos>, actors=<actors>, dates=<dates>"
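The GH Archive agent's BigQuery calls run against the public `githubarchive` dataset, whose day tables are named `githubarchive.day.YYYYMMDD`. A minimal sketch of building such a query; the helper name is illustrative, not part of RAPTOR:

```python
# Sketch of the kind of SQL the GH Archive agent might issue against the
# public githubarchive BigQuery dataset. build_gharchive_query is a
# hypothetical helper for illustration.
def build_gharchive_query(event_type: str, actor: str, date: str) -> str:
    """Build a query for one actor's events on one UTC day (date: YYYYMMDD)."""
    return (
        "SELECT type, actor.login, repo.name, created_at, payload\n"
        f"FROM `githubarchive.day.{date}`\n"
        f"WHERE type = '{event_type}' AND actor.login = '{actor}'\n"
        "ORDER BY created_at"
    )

query = build_gharchive_query("PushEvent", "lkmanka58", "20250713")
print(query)
```

Because GH Archive tables are append-only snapshots of public events, re-running the same query later should return the same rows, which is what makes this evidence source tamper-proof.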
Purpose: Query live GitHub API for current repository state
Skills: github-evidence-kit
Collects:
  • Commit content and metadata
  • File contents at specific refs
  • Branch and tag information
  • PR and issue details (if not deleted)
  • Fork relationships
Output: Writes observations to evidence.json
Use Case: Retrieve commit content after getting SHA from GH Archive
Purpose: Recover deleted content from Wayback Machine
Skills: github-wayback-recovery, github-evidence-kit
Collects:
  • Archived README and documentation
  • Deleted issue/PR content
  • Repository metadata snapshots
  • Wiki pages
Output: Writes snapshots to evidence.json
Limitation: Cannot recover private content or full git history
Purpose: Analyze cloned repositories for dangling commits
Skills: github-evidence-kit
Collects:
  • Dangling commits (not reachable from any ref)
  • Reflog entries
  • Force-pushed commit content
Output: Writes commit observations to evidence.json
Value: Reveals force-pushed or deleted commits that attackers tried to hide
Purpose: Extract IOCs (Indicators of Compromise) from vendor security reports
Skills: github-evidence-kit
Collects:
  • Commit SHAs
  • File hashes
  • Usernames
  • Repository URLs
  • IP addresses and domains
Output: Writes IOC observations to evidence.json
Invocation: Only spawned if vendor report URL is in research question

Analysis Agents

These agents analyze collected evidence and form hypotheses about security incidents.
Purpose: Analyze evidence and form hypotheses OR request additional evidence
Skills: github-evidence-kit
Input: Reads evidence.json, previous rebuttal (if retry)
Output: Writes either:
  • evidence-request-{N}.md if more evidence needed
  • hypothesis-{N}.md if evidence sufficient
Key Behavior: Can request specific follow-up investigations:
# Evidence Request 001

## Missing Evidence
- **Need**: PushEvents for actor 'user' on 2025-07-13
- **Agent**: oss-investigator-gh-archive-agent
- **Query**: "Query PushEvents where actor.login='user'"

## Reason
Cannot determine timeline without push events.
Hypothesis Format:
  • Research question restatement
  • Executive summary
  • Timeline table with evidence citations
  • Attribution with confidence levels
  • Intent analysis (evidence-based)
  • Impact assessment
  • Evidence citations table
Purpose: Validate hypothesis claims against verified evidence
Skills: github-evidence-kit
Input: Reads hypothesis-{N}.md and evidence.json
Output: Writes either:
  • hypothesis-{N}-confirmed.md if all claims validated
  • hypothesis-{N}-rebuttal.md if claims fail verification
Validation Rules:
  • Every claim must cite evidence by ID
  • Evidence IDs must exist in store with verified status
  • Attribution confidence must match evidence quality
  • Timeline must use exact UTC timestamps from evidence
  • No speculation or unsupported claims
Rebuttal Format:
# Hypothesis Rebuttal

## Issues Found
1. Claim "attacker created tag at 19:41" has no evidence citation
2. Evidence [EVD-003] cited but not in evidence store
3. Attribution confidence "HIGH" but only one source

## Required Fixes
- Add evidence citation for tag creation claim
- Remove or verify [EVD-003]
- Reduce confidence to MEDIUM or add corroborating source
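The citation checks behind rebuttals like the one above can be sketched in a few lines. This is an illustrative reading of the validation rules, not RAPTOR's actual checker API; the `[EVD-xxx]` citation pattern and the store shape are assumptions:

```python
import re

# Hypothetical sketch of the checker's citation rule: every [EVD-xxx] style
# citation in a hypothesis must resolve to a *verified* entry in the
# evidence store. Names and patterns are assumptions for illustration.
def find_citation_issues(hypothesis_text: str, evidence_store: dict) -> list:
    issues = []
    for evid in re.findall(r"\[([A-Za-z]+-\d+)\]", hypothesis_text):
        entry = evidence_store.get(evid)
        if entry is None:
            issues.append(f"{evid} cited but not in evidence store")
        elif not entry.get("verified"):
            issues.append(f"{evid} exists but is not verified")
    return issues

store = {"EVD-001": {"verified": True}, "EVD-002": {"verified": False}}
text = "Tag created at 19:41 [EVD-001]; pushed by attacker [EVD-003]."
issues = find_citation_issues(text, store)
print(issues)  # ['EVD-003 cited but not in evidence store']
```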
Purpose: Deep root-cause analysis of C/C++ crashes using rr debugger
Skills: rr-debugger, function-tracing, gcov-coverage
Input: Bug tracker URL, git repo URL
Output: Root cause analysis with:
  • Crash location and stack trace
  • Function call sequence leading to crash
  • Code coverage data
  • Memory state at crash point
Workflow: Deterministic replay debugging with rr
Purpose: Validate crash analysis accuracy
Input: Crash analysis report
Output: Verification report with:
  • Claim validation results
  • Code path verification
  • Alternative explanations (if any)
Method: Re-executes traces and checks coverage data
Purpose: Determine if vulnerabilities are exploitable
Skills: exploitability-validation
Input: SARIF findings, target binary path
Output: Validation report with:
  • Reachability analysis
  • Exploit feasibility assessment
  • Mitigation effectiveness check
Stages: 0 (Inventory) → A (One-Shot) → B (Process) → C (Sanity) → D (Ruling) → E (Feasibility)
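The staged flow above can be sketched as a short pipeline where a failing stage ends the run early. Stage names follow the list; the stage functions and result shape are assumptions for illustration, not RAPTOR's API:

```python
# Illustrative staged pipeline: stages run in order and the first stage
# that rejects the finding stops the run. Stage functions are stand-ins.
STAGES = ["0-inventory", "A-one-shot", "B-process", "C-sanity", "D-ruling", "E-feasibility"]

def run_stages(finding: dict, stage_fns: dict) -> dict:
    """Run stages in order; stop at the first stage that rejects the finding."""
    completed = []
    for name in STAGES:
        ok = stage_fns[name](finding)
        completed.append(name)
        if not ok:
            return {"exploitable": False, "stopped_at": name, "completed": completed}
    return {"exploitable": True, "stopped_at": None, "completed": completed}

# Example: the sanity stage rejects the finding, so D and E never run.
fns = {name: (lambda f: True) for name in STAGES}
fns["C-sanity"] = lambda f: False
result = run_stages({"id": "F-1"}, fns)
print(result)
```

Short-circuiting this way keeps the expensive later stages (ruling, feasibility) from running on findings that already failed a cheap sanity check.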

Verification Agents

Purpose: Re-verify all collected evidence against original sources
Skills: github-evidence-kit
Input: Reads evidence.json
Output: Writes evidence-verification-report.md
Verification Process:
from src import EvidenceStore
from src.verifiers import ConsistencyVerifier

store = EvidenceStore.load(f"{workdir}/evidence.json")
verifier = ConsistencyVerifier()

results = verifier.verify_all(store.get_all())

# Report includes:
# - Total evidence count
# - Verified count
# - Failed verifications with reasons
# - Unverifiable evidence (source unavailable)
Verification Methods:
  • GH Archive: Re-query BigQuery with same parameters
  • GitHub API: Re-fetch from API endpoints
  • Wayback: Re-check snapshot availability
  • Local Git: Re-validate commit existence

Reporting Agents

Purpose: Generate final forensic report from confirmed hypothesis
Skills: github-evidence-kit
Input: Reads:
  • hypothesis-{N}-confirmed.md
  • evidence.json
  • evidence-verification-report.md
Output: Writes forensic-report.md
Report Sections:
  1. Executive Summary
  2. Timeline (chronological with evidence)
  3. Attribution (actors, confidence, evidence)
  4. Intent Analysis
  5. Impact Assessment
  6. IOCs (Indicators of Compromise)
  7. Evidence Appendix (full details)

Execution Modes

Parallel Execution

Use When: Agents have no dependencies on each other’s outputs
Pattern: Spawn multiple agents in a single message
# CORRECT: Single message with multiple Task calls
Task: oss-investigator-gh-archive-agent
  Prompt: "Collect from GH Archive..."
  
Task: oss-investigator-github-agent
  Prompt: "Collect from GitHub API..."
  
Task: oss-investigator-wayback-agent
  Prompt: "Collect from Wayback..."
  
Task: oss-investigator-local-git-agent
  Prompt: "Analyze local repo..."
Benefit: 4x faster than sequential execution for evidence collection phase
Common Mistake: Spawning agents in separate messages
# WRONG: Separate messages (runs sequentially)
message 1: spawn oss-investigator-gh-archive-agent
message 2: spawn oss-investigator-github-agent
message 3: spawn oss-investigator-wayback-agent
This defeats the purpose of parallelization!

Sequential Execution

Use When: Agent B depends on Agent A’s output
Pattern: Wait for completion before next spawn
# Phase 3: Hypothesis Formation
result1 = spawn_agent("oss-hypothesis-former-agent")
# Wait for completion and check output

if evidence_request_exists:
    # Phase 3b: Follow-up Evidence Collection
    result2 = spawn_agent("oss-investigator-gh-archive-agent", query=request)
    # Wait for completion
    
    # Phase 3c: Retry Hypothesis Formation
    result3 = spawn_agent("oss-hypothesis-former-agent")

Agent Communication

Shared File System

Agents communicate through files in the working directory:
.out/oss-forensics-20250713-143022/
├── evidence.json              # Shared evidence store
├── evidence-request-001.md    # Hypothesis former → Orchestrator
├── hypothesis-001.md          # Hypothesis former → Checker
├── hypothesis-001-rebuttal.md # Checker → Hypothesis former
├── hypothesis-002.md          # Revised hypothesis
├── hypothesis-002-confirmed.md # Checker → Report generator
├── evidence-verification-report.md # Verifier → Report generator
└── forensic-report.md         # Final output

Data Formats

Evidence Store (JSON):
{
  "evidence": [
    {
      "evidence_id": "evt-001",
      "type": "PushEvent",
      "observed_when": "2025-07-13T20:30:24Z",
      "observed_by": "gharchive",
      "observed_what": "Commit pushed to main branch",
      "verification": {
        "source": "gharchive",
        "url": "bigquery://githubarchive.day.20250713",
        "verified_at": "2025-07-14T10:15:33Z"
      },
      "payload": { /* full event data */ }
    }
  ]
}
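A minimal sketch of reading and sanity-checking this format. The field names follow the JSON example above; the `load_evidence` helper and its required-field set are illustrative, not RAPTOR's `EvidenceStore` API:

```python
import json

# Hypothetical loader for the evidence store format shown above: parse the
# JSON and reject entries missing required fields.
raw = """
{
  "evidence": [
    {
      "evidence_id": "evt-001",
      "type": "PushEvent",
      "observed_when": "2025-07-13T20:30:24Z",
      "observed_by": "gharchive",
      "observed_what": "Commit pushed to main branch",
      "verification": {
        "source": "gharchive",
        "url": "bigquery://githubarchive.day.20250713",
        "verified_at": "2025-07-14T10:15:33Z"
      },
      "payload": {}
    }
  ]
}
"""

REQUIRED = {"evidence_id", "type", "observed_when",
            "observed_by", "observed_what", "verification"}

def load_evidence(text: str) -> list:
    """Parse the store, raising if any entry is missing a required field."""
    items = json.loads(text)["evidence"]
    for item in items:
        missing = REQUIRED - item.keys()
        if missing:
            raise ValueError(f"{item.get('evidence_id')}: missing {sorted(missing)}")
    return items

items = load_evidence(raw)
print(items[0]["evidence_id"])  # evt-001
```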
Agent Communication Pattern:
1

Write

Agent A writes structured data to shared file
2

Signal

Agent A completes and returns to orchestrator
3

Orchestrator Checks

Orchestrator checks for expected output files
4

Read

Agent B reads shared file created by Agent A
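Step 3 of the pattern above, the orchestrator checking for expected output files before spawning the next agent, can be sketched as follows. The file names follow the working-directory layout; the `outputs_ready` helper is illustrative:

```python
import pathlib
import tempfile

# Sketch of the orchestrator's post-agent check: which expected output
# files has the agent not yet written? Helper name is hypothetical.
def outputs_ready(workdir: str, expected: list) -> list:
    """Return the expected files that are missing from the working directory."""
    root = pathlib.Path(workdir)
    return [name for name in expected if not (root / name).exists()]

with tempfile.TemporaryDirectory() as wd:
    # Simulate an agent that wrote the evidence store but no hypothesis yet.
    (pathlib.Path(wd) / "evidence.json").write_text("{}")
    missing = outputs_ready(wd, ["evidence.json", "hypothesis-001.md"])
    print(missing)  # ['hypothesis-001.md']
```

If the list is non-empty, the orchestrator can retry the agent or fall back to partial results instead of letting a downstream agent fail on a missing file.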

Error Handling

Agent Failure Strategies

When: Transient errors (network, API rate limits)
import time
import logging

logger = logging.getLogger(__name__)

max_retries = 3
for attempt in range(max_retries):
    try:
        result = spawn_agent("oss-investigator-github-agent", ...)
        break
    except Exception as e:
        if attempt < max_retries - 1:
            time.sleep(2 ** attempt)  # Exponential backoff: 1s, then 2s
        else:
            logger.error(f"Agent failed after {max_retries} attempts: {e}")
            raise

Error Recovery Workflow

def orchestrate_with_recovery():
    try:
        # Phase 2: Parallel evidence collection
        results = spawn_parallel_agents([
            "oss-investigator-gh-archive-agent",
            "oss-investigator-github-agent",
            "oss-investigator-wayback-agent",
        ])
        
        # Check which agents succeeded
        successful = [r for r in results if r.success]
        failed = [r for r in results if not r.success]
        
        if len(successful) == 0:
            raise RuntimeError("All evidence collection failed")
        
        if failed:
            logger.warning(f"{len(failed)} agents failed, continuing with available evidence")
        
        # Phase 3: Hypothesis formation (proceeds with available evidence)
        spawn_agent("oss-hypothesis-former-agent", ...)
        
    except Exception as e:
        logger.error(f"Orchestration failed: {e}")
        generate_partial_report()

Best Practices

Spawn Parallel When Possible

Evidence collectors have no dependencies on each other, so always spawn them in parallel for a 4-5x speedup

Single Responsibility

Each agent does ONE thing well. Don’t ask evidence collectors to form hypotheses.

Pass Working Directory

Every agent needs the working directory path to read/write shared files

Verify Outputs

Orchestrator should check that expected output files exist before proceeding

Agent Design Principles

Only the orchestrator spawns agents. This prevents infinite loops and makes workflows debuggable.
Agents declare their skills (e.g., github-archive, github-evidence-kit). Skills are documentation loaded into agent context.
Agents declare their tools (e.g., Bash, Read, Write). Tools are actual capabilities the agent can use.
Agents communicate through files, not return values. This makes workflows resumable and debuggable.

Performance Optimization

Parallelization Impact

Evidence Collection Phase:
| Execution Mode        | Time | Speedup |
|-----------------------|------|---------|
| Sequential (4 agents) | 120s | 1x      |
| Parallel (4 agents)   | 30s  | 4x      |
Why Parallel Works:
  • GH Archive agent: Waits for BigQuery
  • GitHub API agent: Waits for API responses
  • Wayback agent: Waits for Archive.org
  • Local Git agent: Waits for git commands
All are I/O-bound, not CPU-bound!
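Because the collectors are I/O-bound, a plain thread pool captures the speedup described above. This is a generic concurrency sketch, not RAPTOR's spawning mechanism; `spawn_agent` here is a stand-in that just sleeps to simulate waiting on BigQuery, the GitHub API, and so on:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Stand-in for an I/O-bound agent: the "work" is waiting on a remote service.
def spawn_agent(name: str) -> str:
    time.sleep(0.1)  # simulated network wait
    return f"{name}: done"

agents = ["gh-archive", "github-api", "wayback", "local-git"]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(agents)) as pool:
    # All four waits overlap, so wall time is ~one agent's wait, not four.
    results = list(pool.map(spawn_agent, agents))
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Threads suffice here precisely because the agents spend their time blocked on I/O; CPU-bound work would need processes instead.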

Orchestrator Optimization

# ❌ BAD: Spawn agents one at a time
for agent in ["agent1", "agent2", "agent3"]:
    spawn_agent(agent)
    # Total time: 30s + 30s + 30s = 90s

# ✅ GOOD: Spawn all agents in parallel
spawn_parallel_agents(["agent1", "agent2", "agent3"])
# Total time: max(30s, 30s, 30s) = 30s

Debugging

Workflow Tracing

Enable detailed orchestrator logging:
export RAPTOR_LOG_LEVEL=DEBUG
/oss-forensics "your question"
Log Output:
[ORCHESTRATOR] Phase 0: Initialize → workdir created
[ORCHESTRATOR] Phase 1: Parse prompt → repo=aws/aws-toolkit-vscode, actor=lkmanka58
[ORCHESTRATOR] Phase 2: Spawn 4 investigators in parallel
[AGENT] oss-investigator-gh-archive-agent → started
[AGENT] oss-investigator-github-agent → started
[AGENT] oss-investigator-wayback-agent → started
[AGENT] oss-investigator-local-git-agent → started
[AGENT] oss-investigator-gh-archive-agent → completed (42 events)
[AGENT] oss-investigator-github-agent → completed (5 commits)
[AGENT] oss-investigator-wayback-agent → failed (no snapshots)
[AGENT] oss-investigator-local-git-agent → completed (2 dangling commits)
[ORCHESTRATOR] Phase 3: Spawn hypothesis former

Common Issues

Agent spawns but produces no output

Cause: Agent completed but didn’t write expected output file
Debug:
ls -la .out/oss-forensics-*/
# Check if evidence.json, hypothesis-*.md, etc. exist
Fix: Check agent logs for errors during file write
Parallel agents run sequentially

Cause: Agents spawned in separate messages instead of a single message
Fix: Use a single message with multiple Task calls
Evidence store grows unbounded

Expected behavior: evidence accumulates across the investigation
If concerned: evidence is deduplicated by ID, so duplicates don’t inflate size
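The dedup-by-ID behavior can be illustrated with a store keyed by `evidence_id`, so re-adding an existing ID replaces the entry rather than growing the store. The `add_evidence` helper is hypothetical, not RAPTOR's `EvidenceStore` API:

```python
# Illustrative dedup-by-ID: the store is a dict keyed by evidence_id, so
# inserting a duplicate ID overwrites rather than appends.
def add_evidence(store: dict, item: dict) -> None:
    store[item["evidence_id"]] = item  # keyed by ID; duplicates collapse

store = {}
add_evidence(store, {"evidence_id": "evt-001", "type": "PushEvent"})
add_evidence(store, {"evidence_id": "evt-001", "type": "PushEvent"})  # duplicate
add_evidence(store, {"evidence_id": "evt-002", "type": "IssuesEvent"})
print(len(store))  # 2
```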

Further Reading

Agent Definitions

Full agent specifications with skills and tools

Orchestrator Code

Source code for workflow orchestration logic

Evidence Kit API

Python API for evidence collection and storage

Creating Custom Agents

Guide to building your own specialized agents
