
Overview

The Exploitability Validation package provides a rigorous multi-stage pipeline for confirming that vulnerability findings from static analysis are real, reachable, and potentially exploitable, rather than merely theoretical.

Purpose

Findings are validated through the following stages:
  • Stage 0 (Inventory): Catalog code structure and entry points
  • Stage A (One-Shot): Quick exploitability assessment
  • Stage B (Process): Attack path validation
  • Stage C (Sanity): False positive elimination
  • Stage D (Ruling): Hypothesis validation
  • Stage E (Feasibility): Binary exploit feasibility (memory corruption only)
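As a rough sketch, stage selection could be modeled like this. The `Stage` enum and `MEMORY_CORRUPTION` set below are illustrative local stand-ins, not the package's definitions; Stage E only applies to memory corruption types, as described under "Stage E: Feasibility".

```python
from enum import Enum

# Illustrative local sketch; the real package defines its own Stage enum.
class Stage(Enum):
    INVENTORY = "0"
    ONESHOT = "A"
    PROCESS = "B"
    SANITY = "C"
    RULING = "D"
    FEASIBILITY = "E"

# Subset of memory corruption types that gate Stage E.
MEMORY_CORRUPTION = {"buffer_overflow", "heap_overflow", "use_after_free"}

def stages_for(vuln_type: str, skip_feasibility: bool = False) -> list[Stage]:
    """Return the stages that would run for a given vulnerability type."""
    stages = [Stage.INVENTORY, Stage.ONESHOT, Stage.PROCESS,
              Stage.SANITY, Stage.RULING]
    # Stage E is appended only for memory corruption and when not skipped.
    if not skip_feasibility and vuln_type in MEMORY_CORRUPTION:
        stages.append(Stage.FEASIBILITY)
    return stages
```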

Architecture

packages/exploitability_validation/
├── orchestrator.py         # Pipeline orchestration
├── checklist_builder.py    # Stage 0: Code inventory
├── agentic.py              # Stage A: LLM-powered assessment
├── schemas.py              # JSON validation schemas
└── tests/

Quick Start

Full Validation Pipeline

from packages.exploitability_validation import run_validation

# Run full pipeline
result = run_validation(
    target_path="/path/to/code",
    vuln_type="sql_injection",
    findings_file=None  # Will run static analysis first
)

print(f"Status: {result.state}")
print(f"Stages completed: {len(result.stage_results)}")

With Pre-Existing Findings

# Skip scanning, validate existing findings
result = run_validation(
    target_path="/path/to/code",
    vuln_type="command_injection",
    findings_file="out/scan/combined.sarif"
)

Build Inventory Only (Stage 0)

from packages.exploitability_validation import build_checklist

# Create code inventory
checklist = build_checklist(
    target_path="/path/to/code",
    output_dir="out/validation"
)

print(f"Files: {len(checklist['files'])}")
print(f"Functions: {len(checklist['functions'])}")
print(f"Entry points: {len(checklist['entry_points'])}")

Python API

ValidationOrchestrator

Main pipeline orchestrator.

from packages.exploitability_validation import (
    ValidationOrchestrator,
    PipelineConfig
)

# Configure pipeline
config = PipelineConfig(
    target_path="/path/to/code",
    workdir=".out/validation-20260304",
    vuln_type="sql_injection",
    findings_file=None,
    skip_feasibility=False,
    max_retries=3,
    validate_schemas=True
)

# Run pipeline
orchestrator = ValidationOrchestrator(config)
state = orchestrator.run()

# Check results
for stage, result in state.stage_results.items():
    print(f"{stage.name}: {result.status}")
    if result.errors:
        print(f"  Errors: {result.errors}")

Stage Execution

from packages.exploitability_validation import Stage, StageStatus

# Run specific stage
orchestrator = ValidationOrchestrator(config)
result = orchestrator._execute_stage(Stage.ONESHOT)

if result.status == StageStatus.COMPLETED:
    print(f"Stage completed in {result.duration_seconds:.1f}s")
    print(f"Output files: {result.output_files}")

Agentic Validation (Stage A)

from packages.exploitability_validation import run_validation_phase

# Run one-shot LLM-powered validation
result = run_validation_phase(
    stage="A",
    workdir=".out/validation",
    target_path="/path/to/code",
    vuln_type="sql_injection"
)

print(f"Findings validated: {len(result['findings'])}")
for finding in result['findings']:
    print(f"  {finding['id']}: {finding['status']}")

Core Classes

ValidationOrchestrator

Orchestrates multi-stage validation.

class ValidationOrchestrator:
    def __init__(self, config: PipelineConfig)
    
    def run(self) -> PipelineState
    
    def _execute_stage(
        self,
        stage: Stage
    ) -> StageResult
    
    def _attempt_recovery(
        self,
        stage: Stage,
        result: StageResult
    ) -> bool

PipelineConfig

Pipeline configuration.

  • target_path (str, required): Path to code repository
  • workdir (str, required): Working directory for outputs
  • vuln_type (Optional[str]): Vulnerability type to validate (e.g., "sql_injection")
  • binary_path (Optional[str]): Binary path (required for Stage E)
  • findings_file (Optional[str]): Pre-existing SARIF findings file
  • skip_feasibility (bool, default: False): Skip Stage E (exploit feasibility)
  • max_retries (int, default: 3): Max retry attempts per stage
  • validate_schemas (bool, default: True): Enable JSON schema validation

PipelineState

Current pipeline state.

  • current_stage (Optional[Stage]): Currently executing stage
  • stage_results (Dict[Stage, StageResult]): Results for completed stages
  • checklist (Optional[Dict]): Code inventory (Stage 0)
  • findings (Optional[Dict]): Validated findings
  • attack_tree (Optional[Dict]): Attack tree structure (Stage B)
  • attack_paths (Optional[List]): Validated attack paths (Stage B)
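The fields above could be modeled roughly as a dataclass. This is a sketch mirroring the documented fields, not the package's actual definition (stage keys are simplified to strings here):

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

@dataclass
class PipelineStateSketch:
    """Illustrative mirror of the documented PipelineState fields."""
    current_stage: Optional[str] = None               # currently executing stage
    stage_results: Dict[str, Any] = field(default_factory=dict)
    checklist: Optional[Dict] = None                  # Stage 0 inventory
    findings: Optional[Dict] = None                   # validated findings
    attack_tree: Optional[Dict] = None                # Stage B tree
    attack_paths: Optional[List] = None               # Stage B paths
```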

Pipeline Stages

Stage 0: Inventory

Catalog code structure.

from packages.exploitability_validation import build_checklist

checklist = build_checklist(
    target_path="/path/to/code",
    output_dir="out/validation"
)

# Checklist contains:
# - files: All source files
# - functions: Extracted function definitions
# - entry_points: Main, handlers, routes
# - dependencies: Imported modules
# - language: Detected language

Output: checklist.json
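Assuming the checklist.json layout described in the comments above, a one-line inventory summary could look like this (the field names are taken from those comments; the helper itself is illustrative):

```python
import json

def summarize_checklist(path: str) -> str:
    """Render a one-line summary of a checklist.json inventory."""
    with open(path) as f:
        checklist = json.load(f)
    return (f"{len(checklist['files'])} files, "
            f"{len(checklist['functions'])} functions, "
            f"{len(checklist['entry_points'])} entry points "
            f"({checklist.get('language', 'unknown')})")
```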

Stage A: One-Shot

Quick LLM-powered exploitability assessment.

from packages.exploitability_validation import run_validation_phase

result = run_validation_phase(
    stage="A",
    workdir=".out/validation",
    target_path="/path/to/code",
    vuln_type="sql_injection"
)

# Classifies findings as:
# - "Exploitable": High confidence exploitable
# - "Confirmed (constrained)": Exploitable with constraints
# - "Confirmed (blocked)": Real but not exploitable
# - "Ruled out": False positive

Output: findings.json with status annotations
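Given the four status labels above, downstream triage might bucket findings like this (the findings list shape is assumed from the examples elsewhere on this page):

```python
def triage(findings: list[dict]) -> dict[str, list[dict]]:
    """Bucket findings by their Stage A status label."""
    buckets: dict[str, list[dict]] = {
        "Exploitable": [],
        "Confirmed (constrained)": [],
        "Confirmed (blocked)": [],
        "Ruled out": [],
    }
    for finding in findings:
        # setdefault tolerates any unexpected status labels.
        buckets.setdefault(finding["status"], []).append(finding)
    return buckets
```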

Stage B: Process

Validate attack paths through code.

# Builds attack tree and validates paths
# - Source: User input location
# - Steps: Intermediate functions
# - Sink: Vulnerable function call
# - Proximity: Distance metric to sink

Output: attack-tree.json, attack-paths.json
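The source/steps/sink/proximity shape described above might look like the sketch below. This is illustrative only; the actual attack-paths.json schema may differ, and measuring proximity as the number of remaining hops is an assumption:

```python
from dataclasses import dataclass

@dataclass
class AttackPath:
    source: str        # where user input enters, e.g. an HTTP parameter
    steps: list[str]   # intermediate functions along the dataflow
    sink: str          # the vulnerable call

    @property
    def proximity(self) -> int:
        """Distance metric: hops between the source and the sink."""
        return len(self.steps)

# Hypothetical SQL injection path from a request parameter to execute().
path = AttackPath(
    source="request.args['name']",
    steps=["parse_input", "build_query"],
    sink="cursor.execute",
)
```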

Stage C: Sanity

Eliminate false positives.

# Checks for:
# - Input validation
# - Sanitization functions
# - Safe API usage
# - Framework protections

Output: Updated findings.json with sanity checks
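One way to picture the sanity pass: scan the dataflow for known sanitizers before letting a finding stand. The sanitizer names below are illustrative; a real pass would be language- and framework-aware:

```python
# Illustrative sanitizer names only.
SANITIZERS = {"escape", "quote", "parameterize", "html.escape", "shlex.quote"}

def passes_sanity(dataflow_calls: list[str]) -> bool:
    """Return True if no known sanitizer appears on the path to the sink,
    i.e. the finding survives the sanity check."""
    return not any(call in SANITIZERS for call in dataflow_calls)
```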

Stage D: Ruling

Validate hypotheses with evidence.

# Validates:
# - Dataflow paths are real
# - No blocking conditions
# - Exploitability assumptions

Output: hypotheses.json, disproven.json
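Conceptually, Stage D partitions hypotheses by whether their supporting evidence holds, matching the hypotheses.json/disproven.json split above. A sketch of the bookkeeping (the `evidence_holds` flag is a hypothetical field, not the package's schema):

```python
def rule(hypotheses: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split hypotheses into (confirmed, disproven) based on an
    evidence flag set by earlier validation."""
    confirmed = [h for h in hypotheses if h.get("evidence_holds")]
    disproven = [h for h in hypotheses if not h.get("evidence_holds")]
    return confirmed, disproven
```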

Stage E: Feasibility

Binary exploit feasibility (memory corruption only).

from packages.exploit_feasibility import analyze_binary

# Only runs for memory corruption types:
# - buffer_overflow, heap_overflow, stack_overflow
# - format_string, use_after_free, double_free
# - integer_overflow, out_of_bounds_*

result = analyze_binary("/path/to/binary")

# Checks:
# - Mitigations (PIE, RELRO, canary, NX)
# - ROP gadgets availability
# - Null byte constraints
# - glibc protections

Output: feasibility.json
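The mitigation checks could feed a rough difficulty estimate along these lines. The flag names and scoring are illustrative, not the package's formula:

```python
def exploit_difficulty(mitigations: dict[str, bool]) -> str:
    """Rough difficulty bucket from binary hardening flags
    (illustrative equal weighting)."""
    # Each enabled mitigation makes exploitation harder.
    score = sum(mitigations.get(flag, False)
                for flag in ("pie", "full_relro", "canary", "nx"))
    if score <= 1:
        return "low"
    if score <= 3:
        return "medium"
    return "high"
```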

Validation Workflow

Complete Validation

from packages.exploitability_validation import run_validation
from pathlib import Path
import json

# Run full pipeline
result = run_validation(
    target_path="/path/to/vulnerable/app",
    vuln_type="sql_injection"
)

# Load results
workdir = Path(result.config.workdir)

# Checklist (Stage 0)
with open(workdir / "checklist.json") as f:
    checklist = json.load(f)
    print(f"Analyzed {len(checklist['files'])} files")

# Findings (Stage A)
with open(workdir / "findings.json") as f:
    findings = json.load(f)
    exploitable = [f for f in findings['findings'] if f['status'] == 'Exploitable']
    print(f"Exploitable: {len(exploitable)} / {len(findings['findings'])}")

# Attack paths (Stage B)
if (workdir / "attack-paths.json").exists():
    with open(workdir / "attack-paths.json") as f:
        paths = json.load(f)
        print(f"Valid attack paths: {len(paths['paths'])}")

# Validation report
with open(workdir / "validation-report.md") as f:
    report = f.read()
    print(f"\nReport:\n{report}")

Schema Validation

Validate Outputs

from packages.exploitability_validation import (
    validate_checklist,
    validate_findings,
    validate_attack_tree,
    validate_attack_paths
)

# Validate checklist
is_valid, errors = validate_checklist(checklist_data)
if not is_valid:
    print(f"Checklist validation failed: {errors}")

# Validate findings
is_valid, errors = validate_findings(findings_data)
if not is_valid:
    print(f"Findings validation failed: {errors}")

Create Empty Structures

from packages.exploitability_validation import (
    create_empty_checklist,
    create_empty_findings,
    create_finding
)

# Create empty checklist
checklist = create_empty_checklist("/path/to/code")

# Create empty findings
findings = create_empty_findings()

# Create single finding
finding = create_finding(
    finding_id="sqli-001",
    rule_id="sql-injection",
    file="src/api/users.py",
    start_line=45,
    message="SQL injection vulnerability"
)

Vulnerability Types

Supported Types

from packages.exploitability_validation import VULN_TYPE_MAP

# Maps SARIF rule IDs to vulnerability types
print(VULN_TYPE_MAP)

# Example mappings:
# "sql-injection" -> "sql_injection"
# "command-injection" -> "command_injection"
# "buffer-overflow" -> "buffer_overflow"
# "xss" -> "cross_site_scripting"

Memory Corruption Types

from packages.exploitability_validation import MEMORY_CORRUPTION_TYPES

# These require Stage E (binary analysis)
print(MEMORY_CORRUPTION_TYPES)
# {'buffer_overflow', 'heap_overflow', 'stack_overflow',
#  'format_string', 'use_after_free', 'double_free',
#  'integer_overflow', 'out_of_bounds_read', 'out_of_bounds_write'}

Configuration

Stage-Specific Config

config = PipelineConfig(
    target_path="/path/to/code",
    workdir=".out/validation",
    
    # Stage B config
    stage_b_max_attempts=5,          # Max path validation attempts
    stage_b_proximity_threshold=3,   # Min proximity to continue
    
    # General config
    max_retries=3,                   # Retry failed stages
    validate_schemas=True            # Enable validation
)

Environment Variables

# LLM configuration (for Stage A)
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...

# Binary analysis (for Stage E)
export BINARY_ANALYSIS_TIMEOUT=300

Output Structure

.out/exploitability-validation-20260304/
├── checklist.json             # Stage 0: Code inventory
├── findings.json              # Stage A: Validated findings
├── attack-tree.json           # Stage B: Attack tree
├── attack-paths.json          # Stage B: Valid paths
├── attack-surface.json        # Stage B: Surface analysis
├── hypotheses.json            # Stage D: Hypotheses
├── disproven.json             # Stage D: Disproven items
├── feasibility.json           # Stage E: Binary analysis
├── validation-report.md       # Human-readable report
└── pipeline-state.json        # Pipeline state
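A small helper can load whichever of these artifacts a run actually produced. The file names are taken from the tree above; the helper itself is illustrative:

```python
import json
from pathlib import Path

# JSON artifacts from the output tree (the .md report is read separately).
ARTIFACTS = ["checklist.json", "findings.json", "attack-tree.json",
             "attack-paths.json", "attack-surface.json", "hypotheses.json",
             "disproven.json", "feasibility.json", "pipeline-state.json"]

def load_artifacts(workdir: str) -> dict[str, dict]:
    """Load every JSON artifact that exists in a validation workdir."""
    out = {}
    for name in ARTIFACTS:
        path = Path(workdir) / name
        if path.exists():
            out[name] = json.loads(path.read_text())
    return out
```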

Integration

With Static Analysis

from packages.static_analysis import main as scan_repo
from packages.exploitability_validation import run_validation

# 1. Scan
scan_repo()  # Generates SARIF

# 2. Validate
result = run_validation(
    target_path="/path/to/code",
    findings_file="out/scan/combined.sarif"
)

With LLM Analysis

from packages.llm_analysis import AutonomousSecurityAgentV2

# Validation -> LLM analysis: hand exploitable findings to the agent.
# Assumes `agent` is an AutonomousSecurityAgentV2 instance and
# `validated_findings` was loaded from findings.json (Stage A output).
for finding in validated_findings:
    if finding['status'] == 'Exploitable':
        agent.generate_exploit(finding)

Performance

Pipeline Duration

  • Stage 0 (Inventory): 5-30 seconds
  • Stage A (One-Shot): 1-3 minutes
  • Stage B (Process): 2-5 minutes
  • Stage C (Sanity): 30-60 seconds
  • Stage D (Ruling): 1-2 minutes
  • Stage E (Feasibility): 10-30 seconds
Total: 5-12 minutes per vulnerability type

Best Practices

  1. Start with Stage A for quick triage
  2. Use pre-existing findings to skip scanning
  3. Enable schema validation to catch errors early
  4. Review validation report for human verification
  5. Run Stage E only for memory corruption vulnerabilities
