
Overview

The Exploitability Validation package provides a rigorous multi-stage pipeline for confirming that vulnerability findings from static analysis are real, reachable, and potentially exploitable, rather than merely theoretical.

Purpose

Findings are validated through the following stages:
  • Stage 0 (Inventory): Catalog code structure and entry points
  • Stage A (One-Shot): Quick exploitability assessment
  • Stage B (Process): Attack path validation
  • Stage C (Sanity): False positive elimination
  • Stage D (Ruling): Hypothesis validation
  • Stage E (Feasibility): Binary exploit feasibility (memory corruption only)
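As a rough sketch, stage selection could be modeled like this. The `Stage` enum and `MEMORY_CORRUPTION` set below are illustrative local stand-ins, not the package's definitions; Stage E only applies to memory corruption types, as described under "Stage E: Feasibility".

```python
from enum import Enum

# Illustrative local sketch; the real package defines its own Stage enum.
class Stage(Enum):
    INVENTORY = "0"
    ONESHOT = "A"
    PROCESS = "B"
    SANITY = "C"
    RULING = "D"
    FEASIBILITY = "E"

# Subset of memory corruption types that gate Stage E.
MEMORY_CORRUPTION = {"buffer_overflow", "heap_overflow", "use_after_free"}

def stages_for(vuln_type: str, skip_feasibility: bool = False) -> list[Stage]:
    """Return the stages that would run for a given vulnerability type."""
    stages = [Stage.INVENTORY, Stage.ONESHOT, Stage.PROCESS,
              Stage.SANITY, Stage.RULING]
    # Stage E is appended only for memory corruption and when not skipped.
    if not skip_feasibility and vuln_type in MEMORY_CORRUPTION:
        stages.append(Stage.FEASIBILITY)
    return stages
```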

Architecture

packages/exploitability_validation/
├── orchestrator.py         # Pipeline orchestration
├── checklist_builder.py    # Stage 0: Code inventory
├── agentic.py              # Stage A: LLM-powered assessment
├── schemas.py              # JSON validation schemas
└── tests/

Quick Start

Full Validation Pipeline

from packages.exploitability_validation import run_validation

# Run full pipeline
result = run_validation(
    target_path="/path/to/code",
    vuln_type="sql_injection",
    findings_file=None  # Will run static analysis first
)

print(f"Status: {result.state}")
print(f"Stages completed: {len(result.stage_results)}")

With Pre-Existing Findings

# Skip scanning, validate existing findings
result = run_validation(
    target_path="/path/to/code",
    vuln_type="command_injection",
    findings_file="out/scan/combined.sarif"
)

Build Inventory Only (Stage 0)

from packages.exploitability_validation import build_checklist

# Create code inventory
checklist = build_checklist(
    target_path="/path/to/code",
    output_dir="out/validation"
)

print(f"Files: {len(checklist['files'])}")
print(f"Functions: {len(checklist['functions'])}")
print(f"Entry points: {len(checklist['entry_points'])}")

Python API

ValidationOrchestrator

Main pipeline orchestrator.

from packages.exploitability_validation import (
    ValidationOrchestrator,
    PipelineConfig
)

# Configure pipeline
config = PipelineConfig(
    target_path="/path/to/code",
    workdir=".out/validation-20260304",
    vuln_type="sql_injection",
    findings_file=None,
    skip_feasibility=False,
    max_retries=3,
    validate_schemas=True
)

# Run pipeline
orchestrator = ValidationOrchestrator(config)
state = orchestrator.run()

# Check results
for stage, result in state.stage_results.items():
    print(f"{stage.name}: {result.status}")
    if result.errors:
        print(f"  Errors: {result.errors}")

Stage Execution

from packages.exploitability_validation import Stage, StageStatus

# Run specific stage
orchestrator = ValidationOrchestrator(config)
result = orchestrator._execute_stage(Stage.ONESHOT)

if result.status == StageStatus.COMPLETED:
    print(f"Stage completed in {result.duration_seconds:.1f}s")
    print(f"Output files: {result.output_files}")

Agentic Validation (Stage A)

from packages.exploitability_validation import run_validation_phase

# Run one-shot LLM-powered validation
result = run_validation_phase(
    stage="A",
    workdir=".out/validation",
    target_path="/path/to/code",
    vuln_type="sql_injection"
)

print(f"Findings validated: {len(result['findings'])}")
for finding in result['findings']:
    print(f"  {finding['id']}: {finding['status']}")

Core Classes

ValidationOrchestrator

Orchestrates multi-stage validation.

class ValidationOrchestrator:
    def __init__(self, config: PipelineConfig)
    
    def run(self) -> PipelineState
    
    def _execute_stage(
        self,
        stage: Stage
    ) -> StageResult
    
    def _attempt_recovery(
        self,
        stage: Stage,
        result: StageResult
    ) -> bool

PipelineConfig

Pipeline configuration.

  • target_path (str, required): Path to code repository
  • workdir (str, required): Working directory for outputs
  • vuln_type (Optional[str]): Vulnerability type to validate (e.g., "sql_injection")
  • binary_path (Optional[str]): Binary path (required for Stage E)
  • findings_file (Optional[str]): Pre-existing SARIF findings file
  • skip_feasibility (bool, default: False): Skip Stage E (exploit feasibility)
  • max_retries (int, default: 3): Max retry attempts per stage
  • validate_schemas (bool, default: True): Enable JSON schema validation

PipelineState

Current pipeline state.

  • current_stage (Optional[Stage]): Currently executing stage
  • stage_results (Dict[Stage, StageResult]): Results for completed stages
  • checklist (Optional[Dict]): Code inventory (Stage 0)
  • findings (Optional[Dict]): Validated findings
  • attack_tree (Optional[Dict]): Attack tree structure (Stage B)
  • attack_paths (Optional[List]): Validated attack paths (Stage B)
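The fields above could be modeled roughly as a dataclass. This is a sketch mirroring the documented fields, not the package's actual definition (stage keys are simplified to strings here):

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

@dataclass
class PipelineStateSketch:
    """Illustrative mirror of the documented PipelineState fields."""
    current_stage: Optional[str] = None               # currently executing stage
    stage_results: Dict[str, Any] = field(default_factory=dict)
    checklist: Optional[Dict] = None                  # Stage 0 inventory
    findings: Optional[Dict] = None                   # validated findings
    attack_tree: Optional[Dict] = None                # Stage B tree
    attack_paths: Optional[List] = None               # Stage B paths
```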

Pipeline Stages

Stage 0: Inventory

Catalog code structure.

from packages.exploitability_validation import build_checklist

checklist = build_checklist(
    target_path="/path/to/code",
    output_dir="out/validation"
)

# Checklist contains:
# - files: All source files
# - functions: Extracted function definitions
# - entry_points: Main, handlers, routes
# - dependencies: Imported modules
# - language: Detected language

Output: checklist.json
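Assuming the checklist.json layout described in the comments above, a one-line inventory summary could look like this (the field names are taken from those comments; the helper itself is illustrative):

```python
import json

def summarize_checklist(path: str) -> str:
    """Render a one-line summary of a checklist.json inventory."""
    with open(path) as f:
        checklist = json.load(f)
    return (f"{len(checklist['files'])} files, "
            f"{len(checklist['functions'])} functions, "
            f"{len(checklist['entry_points'])} entry points "
            f"({checklist.get('language', 'unknown')})")
```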

Stage A: One-Shot

Quick LLM-powered exploitability assessment.

from packages.exploitability_validation import run_validation_phase

result = run_validation_phase(
    stage="A",
    workdir=".out/validation",
    target_path="/path/to/code",
    vuln_type="sql_injection"
)

# Classifies findings as:
# - "Exploitable": High confidence exploitable
# - "Confirmed (constrained)": Exploitable with constraints
# - "Confirmed (blocked)": Real but not exploitable
# - "Ruled out": False positive

Output: findings.json with status annotations
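Given the four status labels above, downstream triage might bucket findings like this (the findings list shape is assumed from the examples elsewhere on this page):

```python
def triage(findings: list[dict]) -> dict[str, list[dict]]:
    """Bucket findings by their Stage A status label."""
    buckets: dict[str, list[dict]] = {
        "Exploitable": [],
        "Confirmed (constrained)": [],
        "Confirmed (blocked)": [],
        "Ruled out": [],
    }
    for finding in findings:
        # setdefault tolerates any unexpected status labels.
        buckets.setdefault(finding["status"], []).append(finding)
    return buckets
```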

Stage B: Process

Validate attack paths through code.

# Builds attack tree and validates paths
# - Source: User input location
# - Steps: Intermediate functions
# - Sink: Vulnerable function call
# - Proximity: Distance metric to sink

Output: attack-tree.json, attack-paths.json
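The source/steps/sink/proximity shape described above might look like the sketch below. This is illustrative only; the actual attack-paths.json schema may differ, and measuring proximity as the number of remaining hops is an assumption:

```python
from dataclasses import dataclass

@dataclass
class AttackPath:
    source: str        # where user input enters, e.g. an HTTP parameter
    steps: list[str]   # intermediate functions along the dataflow
    sink: str          # the vulnerable call

    @property
    def proximity(self) -> int:
        """Distance metric: hops between the source and the sink."""
        return len(self.steps)

# Hypothetical SQL injection path from a request parameter to execute().
path = AttackPath(
    source="request.args['name']",
    steps=["parse_input", "build_query"],
    sink="cursor.execute",
)
```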

Stage C: Sanity

Eliminate false positives.

# Checks for:
# - Input validation
# - Sanitization functions
# - Safe API usage
# - Framework protections

Output: Updated findings.json with sanity checks
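One way to picture the sanity pass: scan the dataflow for known sanitizers before letting a finding stand. The sanitizer names below are illustrative; a real pass would be language- and framework-aware:

```python
# Illustrative sanitizer names only.
SANITIZERS = {"escape", "quote", "parameterize", "html.escape", "shlex.quote"}

def passes_sanity(dataflow_calls: list[str]) -> bool:
    """Return True if no known sanitizer appears on the path to the sink,
    i.e. the finding survives the sanity check."""
    return not any(call in SANITIZERS for call in dataflow_calls)
```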

Stage D: Ruling

Validate hypotheses with evidence.

# Validates:
# - Dataflow paths are real
# - No blocking conditions
# - Exploitability assumptions

Output: hypotheses.json, disproven.json
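Conceptually, Stage D partitions hypotheses by whether their supporting evidence holds, matching the hypotheses.json/disproven.json split above. A sketch of the bookkeeping (the `evidence_holds` flag is a hypothetical field, not the package's schema):

```python
def rule(hypotheses: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split hypotheses into (confirmed, disproven) based on an
    evidence flag set by earlier validation."""
    confirmed = [h for h in hypotheses if h.get("evidence_holds")]
    disproven = [h for h in hypotheses if not h.get("evidence_holds")]
    return confirmed, disproven
```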

Stage E: Feasibility

Binary exploit feasibility (memory corruption only).

from packages.exploit_feasibility import analyze_binary

# Only runs for memory corruption types:
# - buffer_overflow, heap_overflow, stack_overflow
# - format_string, use_after_free, double_free
# - integer_overflow, out_of_bounds_*

result = analyze_binary("/path/to/binary")

# Checks:
# - Mitigations (PIE, RELRO, canary, NX)
# - ROP gadgets availability
# - Null byte constraints
# - glibc protections

Output: feasibility.json
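The mitigation checks could feed a rough difficulty estimate along these lines. The flag names and scoring are illustrative, not the package's formula:

```python
def exploit_difficulty(mitigations: dict[str, bool]) -> str:
    """Rough difficulty bucket from binary hardening flags
    (illustrative equal weighting)."""
    # Each enabled mitigation makes exploitation harder.
    score = sum(mitigations.get(flag, False)
                for flag in ("pie", "full_relro", "canary", "nx"))
    if score <= 1:
        return "low"
    if score <= 3:
        return "medium"
    return "high"
```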

Validation Workflow

Complete Validation

from packages.exploitability_validation import run_validation
from pathlib import Path
import json

# Run full pipeline
result = run_validation(
    target_path="/path/to/vulnerable/app",
    vuln_type="sql_injection"
)

# Load results
workdir = Path(result.config.workdir)

# Checklist (Stage 0)
with open(workdir / "checklist.json") as f:
    checklist = json.load(f)
    print(f"Analyzed {len(checklist['files'])} files")

# Findings (Stage A)
with open(workdir / "findings.json") as f:
    findings = json.load(f)
    exploitable = [f for f in findings['findings'] if f['status'] == 'Exploitable']
    print(f"Exploitable: {len(exploitable)} / {len(findings['findings'])}")

# Attack paths (Stage B)
if (workdir / "attack-paths.json").exists():
    with open(workdir / "attack-paths.json") as f:
        paths = json.load(f)
        print(f"Valid attack paths: {len(paths['paths'])}")

# Validation report
with open(workdir / "validation-report.md") as f:
    report = f.read()
    print(f"\nReport:\n{report}")

Schema Validation

Validate Outputs

from packages.exploitability_validation import (
    validate_checklist,
    validate_findings,
    validate_attack_tree,
    validate_attack_paths
)

# Validate checklist
is_valid, errors = validate_checklist(checklist_data)
if not is_valid:
    print(f"Checklist validation failed: {errors}")

# Validate findings
is_valid, errors = validate_findings(findings_data)
if not is_valid:
    print(f"Findings validation failed: {errors}")

Create Empty Structures

from packages.exploitability_validation import (
    create_empty_checklist,
    create_empty_findings,
    create_finding
)

# Create empty checklist
checklist = create_empty_checklist("/path/to/code")

# Create empty findings
findings = create_empty_findings()

# Create single finding
finding = create_finding(
    finding_id="sqli-001",
    rule_id="sql-injection",
    file="src/api/users.py",
    start_line=45,
    message="SQL injection vulnerability"
)

Vulnerability Types

Supported Types

from packages.exploitability_validation import VULN_TYPE_MAP

# Maps SARIF rule IDs to vulnerability types
print(VULN_TYPE_MAP)

# Example mappings:
# "sql-injection" -> "sql_injection"
# "command-injection" -> "command_injection"
# "buffer-overflow" -> "buffer_overflow"
# "xss" -> "cross_site_scripting"

Memory Corruption Types

from packages.exploitability_validation import MEMORY_CORRUPTION_TYPES

# These require Stage E (binary analysis)
print(MEMORY_CORRUPTION_TYPES)
# {'buffer_overflow', 'heap_overflow', 'stack_overflow',
#  'format_string', 'use_after_free', 'double_free',
#  'integer_overflow', 'out_of_bounds_read', 'out_of_bounds_write'}

Configuration

Stage-Specific Config

config = PipelineConfig(
    target_path="/path/to/code",
    workdir=".out/validation",
    
    # Stage B config
    stage_b_max_attempts=5,          # Max path validation attempts
    stage_b_proximity_threshold=3,   # Min proximity to continue
    
    # General config
    max_retries=3,                   # Retry failed stages
    validate_schemas=True            # Enable validation
)

Environment Variables

# LLM configuration (for Stage A)
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...

# Binary analysis (for Stage E)
export BINARY_ANALYSIS_TIMEOUT=300

Output Structure

.out/exploitability-validation-20260304/
├── checklist.json             # Stage 0: Code inventory
├── findings.json              # Stage A: Validated findings
├── attack-tree.json           # Stage B: Attack tree
├── attack-paths.json          # Stage B: Valid paths
├── attack-surface.json        # Stage B: Surface analysis
├── hypotheses.json            # Stage D: Hypotheses
├── disproven.json             # Stage D: Disproven items
├── feasibility.json           # Stage E: Binary analysis
├── validation-report.md       # Human-readable report
└── pipeline-state.json        # Pipeline state
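A small helper can load whichever of these artifacts a run actually produced. The file names are taken from the tree above; the helper itself is illustrative:

```python
import json
from pathlib import Path

# JSON artifacts from the output tree (the .md report is read separately).
ARTIFACTS = ["checklist.json", "findings.json", "attack-tree.json",
             "attack-paths.json", "attack-surface.json", "hypotheses.json",
             "disproven.json", "feasibility.json", "pipeline-state.json"]

def load_artifacts(workdir: str) -> dict[str, dict]:
    """Load every JSON artifact that exists in a validation workdir."""
    out = {}
    for name in ARTIFACTS:
        path = Path(workdir) / name
        if path.exists():
            out[name] = json.loads(path.read_text())
    return out
```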

Integration

With Static Analysis

from packages.static_analysis import main as scan_repo
from packages.exploitability_validation import run_validation

# 1. Scan
scan_repo()  # Generates SARIF

# 2. Validate
result = run_validation(
    target_path="/path/to/code",
    findings_file="out/scan/combined.sarif"
)

With LLM Analysis

from packages.llm_analysis import AutonomousSecurityAgentV2

# Validation -> LLM analysis: hand exploitable findings to the agent.
# Assumes `agent` is an AutonomousSecurityAgentV2 instance and
# `validated_findings` was loaded from findings.json (Stage A output).
for finding in validated_findings:
    if finding['status'] == 'Exploitable':
        agent.generate_exploit(finding)

Performance

Pipeline Duration

  • Stage 0 (Inventory): 5-30 seconds
  • Stage A (One-Shot): 1-3 minutes
  • Stage B (Process): 2-5 minutes
  • Stage C (Sanity): 30-60 seconds
  • Stage D (Ruling): 1-2 minutes
  • Stage E (Feasibility): 10-30 seconds
Total: 5-12 minutes per vulnerability type

Best Practices

  1. Start with Stage A for quick triage
  2. Use pre-existing findings to skip scanning
  3. Enable schema validation to catch errors early
  4. Review validation report for human verification
  5. Run Stage E only for memory corruption vulnerabilities
