## Overview

The Exploitability Validation package provides a rigorous multi-stage pipeline for validating that vulnerability findings from static analysis are not merely theoretical issues, but real, reachable, and potentially exploitable vulnerabilities.
## Purpose

Validate findings through six stages:

- Stage 0 (Inventory): Catalog code structure and entry points
- Stage A (One-Shot): Quick exploitability assessment
- Stage B (Process): Attack path validation
- Stage C (Sanity): False-positive elimination
- Stage D (Ruling): Hypothesis validation
- Stage E (Feasibility): Binary exploit feasibility (memory corruption only)
## Architecture

```
packages/exploitability_validation/
├── orchestrator.py       # Pipeline orchestration
├── checklist_builder.py  # Stage 0: Code inventory
├── agentic.py            # Stage A: LLM-powered assessment
├── schemas.py            # JSON validation schemas
└── tests/
```
## Quick Start

### Full Validation Pipeline

```python
from packages.exploitability_validation import run_validation

# Run the full pipeline; with no findings file, static analysis runs first
result = run_validation(
    target_path="/path/to/code",
    vuln_type="sql_injection",
    findings_file=None,
)

print(f"Status: {result.state}")
print(f"Stages completed: {len(result.stage_results)}")
```
### With Pre-Existing Findings

```python
# Skip scanning and validate existing findings
result = run_validation(
    target_path="/path/to/code",
    vuln_type="command_injection",
    findings_file="out/scan/combined.sarif",
)
```
### Build Inventory Only (Stage 0)

```python
from packages.exploitability_validation import build_checklist

# Create a code inventory
checklist = build_checklist(
    target_path="/path/to/code",
    output_dir="out/validation",
)

print(f"Files: {len(checklist['files'])}")
print(f"Functions: {len(checklist['functions'])}")
print(f"Entry points: {len(checklist['entry_points'])}")
```
## Python API

### ValidationOrchestrator

The main pipeline orchestrator.

```python
from packages.exploitability_validation import (
    ValidationOrchestrator,
    PipelineConfig,
)

# Configure the pipeline
config = PipelineConfig(
    target_path="/path/to/code",
    workdir=".out/validation-20260304",
    vuln_type="sql_injection",
    findings_file=None,
    skip_feasibility=False,
    max_retries=3,
    validate_schemas=True,
)

# Run the pipeline
orchestrator = ValidationOrchestrator(config)
state = orchestrator.run()

# Check the results
for stage, result in state.stage_results.items():
    print(f"{stage.name}: {result.status}")
    if result.errors:
        print(f"  Errors: {result.errors}")
```
### Stage Execution

```python
from packages.exploitability_validation import Stage, StageStatus

# Run a specific stage
orchestrator = ValidationOrchestrator(config)
result = orchestrator._execute_stage(Stage.ONESHOT)

if result.status == StageStatus.COMPLETED:
    print(f"Stage completed in {result.duration_seconds:.1f}s")
    print(f"Output files: {result.output_files}")
```
### Agentic Validation (Stage A)

```python
from packages.exploitability_validation import run_validation_phase

# Run the one-shot LLM-powered validation
result = run_validation_phase(
    stage="A",
    workdir=".out/validation",
    target_path="/path/to/code",
    vuln_type="sql_injection",
)

print(f"Findings validated: {len(result['findings'])}")
for finding in result['findings']:
    print(f"  {finding['id']}: {finding['status']}")
```
## Core Classes

### ValidationOrchestrator

Orchestrates the multi-stage validation pipeline.

```python
class ValidationOrchestrator:
    def __init__(self, config: PipelineConfig): ...
    def run(self) -> PipelineState: ...
    def _execute_stage(self, stage: Stage) -> StageResult: ...
    def _attempt_recovery(self, stage: Stage, result: StageResult) -> bool: ...
```
### PipelineConfig

Pipeline configuration. Field names below match the `PipelineConfig(...)` usage shown above:

- `workdir`: Working directory for outputs
- `vuln_type`: Vulnerability type to validate (e.g., `"sql_injection"`)
- Binary path: Required for Stage E
- `findings_file`: Pre-existing SARIF findings file
- `skip_feasibility`: Skip Stage E (exploit feasibility)
- `max_retries`: Maximum retry attempts per stage
- `validate_schemas`: Enable JSON schema validation
### PipelineState

Current pipeline state:

- Currently executing stage
- `stage_results`: Results for completed stages
- Attack tree structure (Stage B)
- Validated attack paths (Stage B)
## Pipeline Stages

### Stage 0: Inventory

Catalogs the code structure.

```python
from packages.exploitability_validation import build_checklist

checklist = build_checklist(
    target_path="/path/to/code",
    output_dir="out/validation",
)

# The checklist contains:
# - files: all source files
# - functions: extracted function definitions
# - entry_points: main, handlers, routes
# - dependencies: imported modules
# - language: detected language
```

Output: `checklist.json`
### Stage A: One-Shot

Quick LLM-powered exploitability assessment.

```python
from packages.exploitability_validation import run_validation_phase

result = run_validation_phase(
    stage="A",
    workdir=".out/validation",
    target_path="/path/to/code",
    vuln_type="sql_injection",
)

# Classifies each finding as:
# - "Exploitable": high confidence exploitable
# - "Confirmed (constrained)": exploitable with constraints
# - "Confirmed (blocked)": real but not exploitable
# - "Ruled out": false positive
```

Output: `findings.json` with status annotations
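The status labels can be tallied for quick triage. A minimal sketch, using a hypothetical findings payload that mirrors the shape Stage A writes to `findings.json`:

```python
import json
from collections import Counter

# Hypothetical findings payload; the real file is produced by Stage A.
findings_json = """
{"findings": [
  {"id": "sqli-001", "status": "Exploitable"},
  {"id": "sqli-002", "status": "Ruled out"},
  {"id": "sqli-003", "status": "Confirmed (blocked)"}
]}
"""

findings = json.loads(findings_json)
counts = Counter(f["status"] for f in findings["findings"])
print(counts["Exploitable"], counts["Ruled out"])  # 1 1
```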
### Stage B: Process

Validates attack paths through the code.

```python
# Builds an attack tree and validates paths:
# - Source: user input location
# - Steps: intermediate functions
# - Sink: vulnerable function call
# - Proximity: distance metric to the sink
```

Output: `attack-tree.json`, `attack-paths.json`
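Path records of this shape can be triaged by their proximity metric. A sketch with illustrative records (the actual `attack-paths.json` schema may differ):

```python
# Illustrative attack-path records with the fields described above.
paths = [
    {"source": "request.args['q']", "steps": ["parse", "build_query"],
     "sink": "cursor.execute", "proximity": 1},
    {"source": "config file", "steps": ["load", "merge", "expand"],
     "sink": "subprocess.run", "proximity": 5},
]

# Keep only paths whose measured distance to the sink is small enough
# to be worth deeper validation.
PROXIMITY_THRESHOLD = 3
close = [p for p in paths if p["proximity"] <= PROXIMITY_THRESHOLD]
print([p["sink"] for p in close])  # ['cursor.execute']
```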
### Stage C: Sanity

Eliminates false positives.

```python
# Checks for:
# - input validation
# - sanitization functions
# - safe API usage
# - framework protections
```

Output: updated `findings.json` with sanity checks
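A minimal sketch of the kind of check Stage C performs; the sanitizer names and helper below are illustrative, not the package's actual rule set:

```python
# Illustrative sanitizer list; the real checks are broader.
KNOWN_SANITIZERS = {"escape", "quote_ident", "parameterize"}

def survives_sanity(trace_steps: list[str]) -> bool:
    """A dataflow trace is ruled out if any step is a known sanitizer."""
    return KNOWN_SANITIZERS.isdisjoint(trace_steps)

print(survives_sanity(["parse", "build_query"]))       # True
print(survives_sanity(["parse", "escape", "render"]))  # False
```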
### Stage D: Ruling

Validates hypotheses with evidence.

```python
# Validates that:
# - dataflow paths are real
# - no blocking conditions exist
# - exploitability assumptions hold
```

Output: `hypotheses.json`, `disproven.json`
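The ruling logic can be sketched against a hypothetical hypothesis record (the real schema is whatever Stage D writes to `hypotheses.json`):

```python
# Illustrative hypothesis record, not the package's actual schema.
hypothesis = {
    "id": "hyp-001",
    "finding_id": "sqli-001",
    "dataflow_confirmed": True,
    "blocking_conditions": [],
}

# A hypothesis is upheld only when the dataflow is real and nothing
# blocks the path; otherwise it belongs in disproven.json.
upheld = hypothesis["dataflow_confirmed"] and not hypothesis["blocking_conditions"]
print(upheld)  # True
```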
### Stage E: Feasibility

Binary exploit feasibility (memory corruption only).

```python
from packages.exploit_feasibility import analyze_binary

# Only runs for memory-corruption vulnerability types:
# - buffer_overflow, heap_overflow, stack_overflow
# - format_string, use_after_free, double_free
# - integer_overflow, out_of_bounds_*
result = analyze_binary("/path/to/binary")

# Checks:
# - mitigations (PIE, RELRO, stack canary, NX)
# - ROP gadget availability
# - null-byte constraints
# - glibc protections
```

Output: `feasibility.json`
## Validation Workflow

### Complete Validation

```python
import json
from pathlib import Path

from packages.exploitability_validation import run_validation

# Run the full pipeline
result = run_validation(
    target_path="/path/to/vulnerable/app",
    vuln_type="sql_injection",
)

workdir = Path(result.config.workdir)

# Checklist (Stage 0)
with open(workdir / "checklist.json") as f:
    checklist = json.load(f)
print(f"Analyzed {len(checklist['files'])} files")

# Findings (Stage A)
with open(workdir / "findings.json") as f:
    findings = json.load(f)
exploitable = [f for f in findings['findings'] if f['status'] == 'Exploitable']
print(f"Exploitable: {len(exploitable)} / {len(findings['findings'])}")

# Attack paths (Stage B)
if (workdir / "attack-paths.json").exists():
    with open(workdir / "attack-paths.json") as f:
        paths = json.load(f)
    print(f"Valid attack paths: {len(paths['paths'])}")

# Validation report
with open(workdir / "validation-report.md") as f:
    report = f.read()
print(f"\nReport:\n{report}")
```
## Schema Validation

### Validate Outputs

```python
from packages.exploitability_validation import (
    validate_checklist,
    validate_findings,
    validate_attack_tree,
    validate_attack_paths,
)

# Validate a checklist
is_valid, errors = validate_checklist(checklist_data)
if not is_valid:
    print(f"Checklist validation failed: {errors}")

# Validate findings
is_valid, errors = validate_findings(findings_data)
if not is_valid:
    print(f"Findings validation failed: {errors}")
```
### Create Empty Structures

```python
from packages.exploitability_validation import (
    create_empty_checklist,
    create_empty_findings,
    create_finding,
)

# Create an empty checklist
checklist = create_empty_checklist("/path/to/code")

# Create an empty findings structure
findings = create_empty_findings()

# Create a single finding
finding = create_finding(
    finding_id="sqli-001",
    rule_id="sql-injection",
    file="src/api/users.py",
    start_line=45,
    message="SQL injection vulnerability",
)
```
## Vulnerability Types

### Supported Types

```python
from packages.exploitability_validation import VULN_TYPE_MAP

# Maps SARIF rule IDs to vulnerability types
print(VULN_TYPE_MAP)
# Example mappings:
#   "sql-injection"     -> "sql_injection"
#   "command-injection" -> "command_injection"
#   "buffer-overflow"   -> "buffer_overflow"
#   "xss"               -> "cross_site_scripting"
```
### Memory Corruption Types

```python
from packages.exploitability_validation import MEMORY_CORRUPTION_TYPES

# These types require Stage E (binary analysis)
print(MEMORY_CORRUPTION_TYPES)
# {'buffer_overflow', 'heap_overflow', 'stack_overflow',
#  'format_string', 'use_after_free', 'double_free',
#  'integer_overflow', 'out_of_bounds_read', 'out_of_bounds_write'}
```
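Assuming `MEMORY_CORRUPTION_TYPES` is the set shown above, the Stage E gate reduces to a membership test (the helper name is illustrative):

```python
# The set as documented above; in practice, import it from the package.
MEMORY_CORRUPTION_TYPES = {
    "buffer_overflow", "heap_overflow", "stack_overflow",
    "format_string", "use_after_free", "double_free",
    "integer_overflow", "out_of_bounds_read", "out_of_bounds_write",
}

def needs_stage_e(vuln_type: str) -> bool:
    """True when the vulnerability type warrants binary feasibility analysis."""
    return vuln_type in MEMORY_CORRUPTION_TYPES

print(needs_stage_e("buffer_overflow"))  # True
print(needs_stage_e("sql_injection"))    # False
```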
## Configuration

### Stage-Specific Config

```python
config = PipelineConfig(
    target_path="/path/to/code",
    workdir=".out/validation",
    # Stage B config
    stage_b_max_attempts=5,         # Max path-validation attempts
    stage_b_proximity_threshold=3,  # Min proximity to continue
    # General config
    max_retries=3,                  # Retry failed stages
    validate_schemas=True,          # Enable schema validation
)
```
### Environment Variables

```bash
# LLM configuration (for Stage A)
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...

# Binary analysis (for Stage E)
export BINARY_ANALYSIS_TIMEOUT=300
```
## Output Structure

```
.out/exploitability-validation-20260304/
├── checklist.json         # Stage 0: Code inventory
├── findings.json          # Stage A: Validated findings
├── attack-tree.json       # Stage B: Attack tree
├── attack-paths.json      # Stage B: Valid paths
├── attack-surface.json    # Stage B: Surface analysis
├── hypotheses.json        # Stage D: Hypotheses
├── disproven.json         # Stage D: Disproven items
├── feasibility.json       # Stage E: Binary analysis
├── validation-report.md   # Human-readable report
└── pipeline-state.json    # Pipeline state
```
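Not every run writes every file (for example, `feasibility.json` appears only for memory-corruption types), so a loader should tolerate gaps. A minimal sketch with a hypothetical helper name:

```python
import json
from pathlib import Path

def load_artifacts(workdir: str) -> dict:
    """Collect whichever JSON artifacts the pipeline actually produced.

    Missing files are simply skipped; each artifact is keyed by its
    filename stem (e.g. "findings" for findings.json).
    """
    return {
        path.stem: json.loads(path.read_text())
        for path in sorted(Path(workdir).glob("*.json"))
    }
```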
## Integration

### With Static Analysis

```python
from packages.static_analysis import main as scan_repo
from packages.exploitability_validation import run_validation

# 1. Scan (generates SARIF)
scan_repo()

# 2. Validate the findings
result = run_validation(
    target_path="/path/to/code",
    findings_file="out/scan/combined.sarif",
)
```
### With LLM Analysis

```python
from packages.llm_analysis import AutonomousSecurityAgentV2

# Feed validated findings into LLM analysis
# (assumes `agent` and `validated_findings` were set up earlier)
for finding in validated_findings:
    if finding['status'] == 'Exploitable':
        agent.generate_exploit(finding)
```
## Pipeline Duration

- Stage 0 (Inventory): 5-30 seconds
- Stage A (One-Shot): 1-3 minutes
- Stage B (Process): 2-5 minutes
- Stage C (Sanity): 30-60 seconds
- Stage D (Ruling): 1-2 minutes
- Stage E (Feasibility): 10-30 seconds

Total: roughly 5-12 minutes per vulnerability type
## Best Practices

- Start with Stage A for quick triage
- Use pre-existing findings to skip scanning
- Enable schema validation to catch errors early
- Review the validation report for human verification
- Run Stage E only for memory-corruption vulnerabilities