Overview

Every Shannon pentest follows a consistent five-phase workflow designed to mirror how a professional penetration tester approaches security assessment. Each phase builds on the previous phase’s deliverables, creating a progressive analysis pipeline.

Phase Diagram

Phase 1: Pre-Reconnaissance

Agent: pre-recon
Model Tier: Large (Claude Opus)
Deliverable: code_analysis_deliverable.md

Purpose

Build a comprehensive technical foundation by analyzing the application’s source code and running external reconnaissance tools.

Activities

The pre-recon agent performs deep static analysis:
  • Technology Stack Detection: Identifies frameworks, libraries, languages, and build tools
  • Architecture Mapping: Understands application structure, entry points, and data flow
  • Endpoint Discovery: Catalogs all API routes, controllers, and handlers
  • Security Control Analysis: Identifies authentication, authorization, and validation mechanisms
  • Dependency Analysis: Examines third-party libraries for known vulnerabilities
  • Configuration Review: Analyzes environment variables, config files, and security settings
// Example agent configuration from src/session-manager.ts:15-22
'pre-recon': {
  name: 'pre-recon',
  displayName: 'Pre-recon agent',
  prerequisites: [],
  promptTemplate: 'pre-recon-code',
  deliverableFilename: 'code_analysis_deliverable.md',
  modelTier: 'large',
}

Duration

Typically 10-15 minutes depending on codebase size and complexity.

Phase 2: Reconnaissance

Agent: recon
Prerequisites: pre-recon
Deliverable: recon_deliverable.md

Purpose

Perform live application exploration using the pre-recon intelligence as a guide. Map the actual attack surface by interacting with the running application.

Activities

Using Playwright MCP, the recon agent:
  • Navigates the application like a real user
  • Tests authentication flows (login, registration, password reset)
  • Maps all accessible pages and forms
  • Identifies client-side validation logic
  • Captures screenshots for documentation
  • Records network requests and responses
Validates findings from source code analysis:
  • Confirms which API endpoints are actually exposed
  • Tests endpoint authentication requirements
  • Identifies undocumented endpoints
  • Maps parameter requirements
  • Tests rate limiting and access controls
If authentication config is provided:
  • Executes login flow (form, SSO, API, basic auth)
  • Handles 2FA/TOTP if configured
  • Tests session management
  • Identifies privilege levels (user, admin, etc.)
  • Maps authenticated vs. unauthenticated access
Config Example:
authentication:
  login_type: form
  login_url: "https://app.example.com/login"
  credentials:
    username: "[email protected]"
    password: "P@ssw0rd123"
    totp_secret: "LB2E2RX7XFHSTGCK"
  login_flow:
    - "Type $username into the email field"
    - "Type $password into the password field"
    - "Click the 'Sign In' button"
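The `$username` and `$password` tokens in `login_flow` suggest that credential values are substituted into each step before the agent follows it. Here is a minimal sketch of that substitution, assuming plain `$name` tokens. The `renderLoginFlow` helper and `Credentials` type are hypothetical and are not taken from Shannon's source.

```typescript
// Hypothetical helper: illustrates placeholder substitution only.
type Credentials = Record<string, string>;

// Replace each $name token with the matching credential value.
// Unknown tokens are left untouched so a typo stays visible in the output.
function renderLoginFlow(steps: string[], credentials: Credentials): string[] {
  return steps.map((step) =>
    step.replace(/\$([A-Za-z_][A-Za-z0-9_]*)/g, (token: string, name: string) => {
      const value = credentials[name];
      return value !== undefined ? value : token;
    })
  );
}

const rendered = renderLoginFlow(
  [
    "Type $username into the email field",
    "Type $password into the password field",
    "Click the 'Sign In' button",
  ],
  { username: "tester", password: "s3cret" }
);
console.log(rendered);
```

Because the replacement is a callback, its return value is inserted literally, so a password containing characters such as `$&` is not reinterpreted as a replacement pattern.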
Creates a comprehensive attack surface map:
  • All discovered endpoints with HTTP methods
  • Form inputs and their validation
  • File upload endpoints
  • API parameter structures
  • Cookies, headers, and tokens
  • Error messages and stack traces

Duration

Typically 15-20 minutes depending on application complexity and authentication requirements.

Phase 3: Vulnerability Analysis

Agents: 5 parallel specialized agents
Prerequisites: recon
Deliverables: 5 analysis files + exploitation queues

Purpose

Identify potential vulnerabilities through structured analysis. Each agent specializes in a specific OWASP vulnerability class.

Parallel Execution

All 5 vulnerability analysis agents run concurrently by default to reduce total run time. Each has its own isolated Playwright browser instance to prevent conflicts.
// From src/temporal/workflows.ts:380-387
state.currentPhase = 'vulnerability-exploitation';
state.currentAgent = 'pipelines';
await a.logPhaseTransition(activityInput, 'vulnerability-exploitation', 'start');

const maxConcurrent = input.pipelineConfig?.max_concurrent_pipelines ?? 5;
const pipelineResults = await runWithConcurrencyLimit(pipelineThunks, maxConcurrent);

Vulnerability Agents

Focus: SQL Injection, Command Injection, NoSQL Injection
Analysis Method:
  • Data flow analysis from user inputs to database queries
  • Command execution sink identification
  • ORM/query builder usage analysis
  • Input sanitization effectiveness
Deliverable: injection_analysis_deliverable.md
Exploitation Queue: JSON file with:
{
  "vulnerabilities": [
    {
      "id": "INJ-001",
      "type": "SQL Injection",
      "endpoint": "/api/search",
      "parameter": "query",
      "description": "User input concatenated directly into SQL query",
      "code_location": "src/api/search.js:45",
      "payload_suggestion": "' OR '1'='1",
      "confidence": "high"
    }
  ]
}
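A consumer of this queue would likely want to validate it before acting on it. The sketch below types the entry using the field names from the JSON example and rejects malformed entries. The validation rules, `parseExploitationQueue`, and `hasActionableFindings` are assumptions for illustration, not Shannon's actual `checkExploitationQueue` logic.

```typescript
// Field names mirror the JSON example; the validation rules are assumptions.
interface QueuedVulnerability {
  id: string;
  type: string;
  endpoint: string;
  parameter: string;
  description: string;
  code_location: string;
  payload_suggestion: string;
  confidence: string;
}

interface ExploitationQueue {
  vulnerabilities: QueuedVulnerability[];
}

const REQUIRED_FIELDS = [
  "id", "type", "endpoint", "parameter",
  "description", "code_location", "payload_suggestion", "confidence",
] as const;

// Parse raw JSON and reject entries with missing or non-string fields.
function parseExploitationQueue(raw: string): ExploitationQueue {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("queue must be a JSON object");
  }
  const list = (data as { vulnerabilities?: unknown }).vulnerabilities;
  if (!Array.isArray(list)) {
    throw new Error("queue.vulnerabilities must be an array");
  }
  const vulnerabilities = list.map((entry: unknown, index: number) => {
    if (typeof entry !== "object" || entry === null) {
      throw new Error(`entry ${index} is not an object`);
    }
    const record = entry as Record<string, unknown>;
    for (const field of REQUIRED_FIELDS) {
      if (typeof record[field] !== "string") {
        throw new Error(`entry ${index} is missing string field "${field}"`);
      }
    }
    return record as unknown as QueuedVulnerability;
  });
  return { vulnerabilities };
}

// An empty queue means there is nothing for an exploitation agent to attempt.
function hasActionableFindings(queue: ExploitationQueue): boolean {
  return queue.vulnerabilities.length > 0;
}
```

Validating at the boundary keeps a malformed queue file from silently turning into a skipped or misdirected exploitation run.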

Concurrency Control

You can limit parallel execution to reduce API usage bursts:
# In config YAML
pipeline:
  max_concurrent_pipelines: 2  # Run 2 of 5 pipelines at a time
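One common way to enforce such a cap is a small worker pool in which each worker claims the next unstarted task. The sketch below shows that general technique only; it is not Shannon's `runWithConcurrencyLimit`, whose implementation is not shown here. `runLimited`, `Thunk`, and the fake pipelines are hypothetical names.

```typescript
// Illustrative stand-in for a concurrency limiter; the real
// runWithConcurrencyLimit may be implemented differently.
type Thunk<T> = () => Promise<T>;

async function runLimited<T>(thunks: Thunk<T>[], maxConcurrent: number): Promise<T[]> {
  if (!Number.isInteger(maxConcurrent) || maxConcurrent < 1) {
    throw new RangeError("maxConcurrent must be a positive integer");
  }
  const results: T[] = new Array<T>(thunks.length);
  let next = 0;

  // Each worker repeatedly claims the next unstarted thunk until none remain.
  // Claiming is safe because there is no await between the check and next++.
  async function worker(): Promise<void> {
    while (next < thunks.length) {
      const index = next++;
      results[index] = await thunks[index]();
    }
  }

  const workerCount = Math.min(maxConcurrent, thunks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Demo: five fake pipelines, at most two in flight at any time.
let active = 0;
let peak = 0;
const fakePipelines = ["injection", "xss", "auth", "authz", "ssrf"].map(
  (name) => async () => {
    active++;
    peak = Math.max(peak, active);
    await new Promise<void>((resolve) => setTimeout(resolve, 10));
    active--;
    return name;
  }
);

runLimited(fakePipelines, 2).then((names) =>
  console.log(names, "peak concurrency:", peak)
);
```

Results are stored by index, so the output order matches the input order regardless of which task finishes first. In this sketch a rejected task rejects the overall promise while already-running tasks continue in the background.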

Duration

Typically 30-45 minutes total with all 5 agents running in parallel.

Phase 4: Exploitation

Agents: 5 conditional parallel agents
Prerequisites: Corresponding vuln analysis agent
Deliverables: 5 exploitation evidence files (if vulnerabilities found)

Purpose

Execute real-world attacks to prove that hypothesized vulnerabilities are actually exploitable. Only findings that can be successfully exploited are reported.

Conditional Execution

Exploitation agents only run if their corresponding vulnerability analysis found actionable issues.
// From src/temporal/workflows.ts:389-416
async function runVulnExploitPipeline(
  vulnType: VulnType,
  runVulnAgent: () => Promise<AgentMetrics>,
  runExploitAgent: () => Promise<AgentMetrics>
): Promise<VulnExploitPipelineResult> {
  // 1. Run vulnerability analysis
  let vulnMetrics = await runVulnAgent();
  
  // 2. Check exploitation queue for actionable findings
  const decision = await a.checkExploitationQueue(activityInput, vulnType);
  
  // 3. Conditionally run exploitation agent
  let exploitMetrics: AgentMetrics | null = null;
  if (decision.shouldExploit) {
    exploitMetrics = await runExploitAgent();
  }
  
  return { vulnType, vulnMetrics, exploitMetrics };
}

Exploitation Agents

Agent: injection-exploit
Exploitation Techniques:
  • SQL Injection with UNION queries
  • Blind SQL injection (time-based, boolean-based)
  • Command injection with shell metacharacters
  • NoSQL injection operators
Evidence Collection:
  • Database content extraction
  • Command execution output
  • Error messages revealing database structure
  • Screenshots of successful exploits
Example Evidence:
Successful SQL injection proof-of-concept that extracts all usernames and password hashes from the database via a UNION SELECT attack on the POST /api/search endpoint.

No Exploit, No Report

If an exploitation agent cannot successfully prove a vulnerability through actual exploitation, that finding is discarded as a false positive. Only proven exploits make it to the final report.
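The policy amounts to a filter over findings. Here is a minimal sketch, assuming each finding records an exploitation status and optional evidence text. The `Finding` shape, the status values, and `reportableFindings` are hypothetical and only illustrate the rule.

```typescript
// Hypothetical shapes: the status values and field names are assumptions.
type ExploitStatus = "exploited" | "failed" | "not_attempted";

interface Finding {
  id: string;
  status: ExploitStatus;
  evidence?: string;
}

// Keep only findings that were exploited and carry non-empty evidence.
function reportableFindings(findings: Finding[]): Finding[] {
  return findings.filter(
    (f) =>
      f.status === "exploited" &&
      typeof f.evidence === "string" &&
      f.evidence.trim() !== ""
  );
}

const kept = reportableFindings([
  { id: "INJ-001", status: "exploited", evidence: "UNION SELECT returned user rows" },
  { id: "INJ-002", status: "failed" },
  { id: "XSS-001", status: "exploited", evidence: "   " },
]);
console.log(kept.map((f) => f.id));
```

Requiring evidence in addition to a status flag means a finding cannot reach the report on a claim of success alone.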

Duration

Typically 30-60 minutes total with all 5 agents running in parallel (only for vulnerability types with findings).

Phase 5: Reporting

Agent: report
Model Tier: Small (Claude Haiku)
Prerequisites: All 5 exploit agents
Deliverable: comprehensive_security_assessment_report.md

Purpose

Consolidate all validated findings into a professional, actionable penetration test report.

Report Generation Process

1. Evidence Assembly

Concatenate all exploitation evidence files into a single document:
  • injection_exploitation_evidence.md
  • xss_exploitation_evidence.md
  • auth_exploitation_evidence.md
  • authz_exploitation_evidence.md
  • ssrf_exploitation_evidence.md
2. Report Generation

The report agent:
  • Adds executive summary
  • Categorizes findings by severity (Critical, High, Medium, Low)
  • Removes hallucinated or unverified content
  • Formats for readability
  • Adds remediation recommendations
3. Metadata Injection

Final report includes:
  • Model information (which Claude model was used)
  • Timestamp and duration
  • Configuration used
  • Target application details
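The evidence assembly step can be pictured as a simple concatenation that tolerates missing files, since an exploitation agent that was skipped produces no evidence. This is a sketch under that assumption. The file names come from the list in step 1, while `assembleEvidence` and the `read` callback are hypothetical.

```typescript
// File names come from the evidence list; everything else is an assumption.
const EVIDENCE_FILES = [
  "injection_exploitation_evidence.md",
  "xss_exploitation_evidence.md",
  "auth_exploitation_evidence.md",
  "authz_exploitation_evidence.md",
  "ssrf_exploitation_evidence.md",
] as const;

// `read` returns a file's contents, or null when the file does not exist
// (for example, when an exploitation agent was skipped).
function assembleEvidence(read: (filename: string) => string | null): string {
  const sections: string[] = [];
  for (const filename of EVIDENCE_FILES) {
    const content = read(filename);
    if (content !== null && content.trim() !== "") {
      sections.push(content.trim());
    }
  }
  return sections.join("\n\n");
}

// Usage with an in-memory stand-in for the deliverables directory.
const files = new Map<string, string>([
  ["injection_exploitation_evidence.md", "# Injection\nUNION-based extraction confirmed."],
  ["ssrf_exploitation_evidence.md", "# SSRF\nInternal metadata endpoint reached."],
]);
console.log(assembleEvidence((name) => files.get(name) ?? null));
```

Passing the reader as a callback keeps the function free of file-system access, which makes it easy to exercise with in-memory data.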

Report Structure

# Comprehensive Security Assessment Report

## Executive Summary

**Target Application**: https://app.example.com  
**Assessment Date**: 2025-03-05  
**Total Vulnerabilities Found**: 12  
- Critical: 3
- High: 5
- Medium: 3
- Low: 1

## Critical Findings

### 1. SQL Injection in Search Endpoint

**Severity**: Critical  
**CVSS Score**: 9.8  
**Endpoint**: POST /api/search

**Description**:
The search functionality is vulnerable to SQL injection due to unsafe string concatenation...

**Proof of Concept**:
```bash
curl -X POST https://app.example.com/api/search \
  -d "{\"query\": \"' UNION SELECT * FROM users--\"}"
```
**Impact**:
- Complete database compromise
- User credential theft
- Data exfiltration

**Remediation**:
1. Use parameterized queries exclusively
2. Implement input validation
3. Apply principle of least privilege to database user

## Recommendations

1. Immediate Actions (Critical/High severity)
2. Short-term Improvements (Medium severity)
3. Long-term Hardening (Low severity + general improvements)

## Methodology

This assessment was conducted using Shannon, an autonomous AI penetration tester…

Duration

Typically 5-10 minutes.

Phase Timeline

Typical pentest duration breakdown:

```mermaid
gantt
    title Shannon Pentest Timeline
    dateFormat HH:mm
    axisFormat %H:%M
    
    section Sequential Phases
    Pre-Recon           :p1, 00:00, 15m
    Recon               :p2, after p1, 20m
    
    section Parallel Phases
    Vuln Analysis (5x)  :p3, after p2, 40m
    Exploitation (5x)   :p4, after p3, 50m
    
    section Final Phase
    Reporting           :p5, after p4, 10m
```

Total Duration: ~1.5 to 2 hours for a typical application
Duration varies based on:
  • Application complexity
  • Codebase size
  • Number of vulnerabilities found
  • Concurrency settings (max_concurrent_pipelines)
  • Model tier selection
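As a rough cross-check, the per-phase "typical" ranges quoted in each Duration section can be summed as if the phases ran strictly back to back. The sketch below does that arithmetic; the phase keys and `sequentialEstimate` are illustrative names, and the strictly sequential model is a simplifying assumption.

```typescript
// Per-phase "typical" ranges in minutes, copied from the Duration notes for each phase.
const PHASE_MINUTES: Record<string, [number, number]> = {
  "pre-recon": [10, 15],
  "recon": [15, 20],
  "vulnerability-analysis": [30, 45],
  "exploitation": [30, 60],
  "reporting": [5, 10],
};

// Sum the lower and upper bounds as if the phases ran strictly back to back.
function sequentialEstimate(phases: Record<string, [number, number]>): [number, number] {
  let low = 0;
  let high = 0;
  for (const [min, max] of Object.values(phases)) {
    low += min;
    high += max;
  }
  return [low, high];
}

const [low, high] = sequentialEstimate(PHASE_MINUTES);
console.log(`${low}-${high} minutes (${low / 60}-${high / 60} hours)`);
// prints "90-150 minutes (1.5-2.5 hours)"
```

The stated total of roughly 1.5 to 2 hours falls inside this band. The upper end assumes every exploitation agent runs for its full typical time, which does not happen for vulnerability types with no findings.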

Phase Dependencies

// From src/session-manager.ts - Agent prerequisites
export const AGENTS: Readonly<Record<AgentName, AgentDefinition>> = {
  'pre-recon': {
    prerequisites: [],  // Runs first
  },
  'recon': {
    prerequisites: ['pre-recon'],  // Waits for pre-recon
  },
  'injection-vuln': {
    prerequisites: ['recon'],  // All vuln agents wait for recon
  },
  'injection-exploit': {
    prerequisites: ['injection-vuln'],  // Each exploit waits for its vuln
  },
  'report': {
    prerequisites: [
      'injection-exploit',
      'xss-exploit', 
      'auth-exploit',
      'ssrf-exploit',
      'authz-exploit'
    ],  // Waits for all exploits
  },
};
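A prerequisite map like this is enough to derive execution order: an agent becomes ready once all of its prerequisites have completed. Here is a minimal sketch of that readiness check. It assumes a plain name-to-prerequisites map that covers only the injection chain named in the excerpt. `PREREQUISITES` and `readyAgents` are hypothetical and do not reflect how Shannon's Temporal workflow schedules agents.

```typescript
// Prerequisite map in the spirit of the excerpt. Only the injection chain is
// listed; the real AGENTS record has more fields and more entries.
const PREREQUISITES: Record<string, string[]> = {
  "pre-recon": [],
  "recon": ["pre-recon"],
  "injection-vuln": ["recon"],
  "injection-exploit": ["injection-vuln"],
};

// Return agents that have not run yet and whose prerequisites are all complete.
function readyAgents(
  prereqs: Record<string, string[]>,
  completed: ReadonlySet<string>
): string[] {
  return Object.entries(prereqs)
    .filter(([name, deps]) => !completed.has(name) && deps.every((dep) => completed.has(dep)))
    .map(([name]) => name);
}

// Walk the graph in waves until every agent has run.
const completed = new Set<string>();
let wave = readyAgents(PREREQUISITES, completed);
while (wave.length > 0) {
  console.log(wave.join(", "));
  wave.forEach((name) => completed.add(name));
  wave = readyAgents(PREREQUISITES, completed);
}
```

With the full set of vulnerability and exploitation agents added to the map, a single wave would contain several agents, which is where parallel execution comes from.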

Next Steps

Agent System

Learn how specialized agents are defined and executed

Temporal Orchestration

Understand how workflows handle crashes and resume

Interpreting Reports

Guide to understanding Shannon’s pentest reports

Cost Optimization

Tips for reducing pentest costs and duration
