The Exploitability Validator agent orchestrates a multi-stage pipeline that validates vulnerability findings before exploit development, preventing wasted effort on false positives and theoretical vulnerabilities.
Purpose
Validate that findings:
Actually exist (not hallucinated)
Are reachable (not dead code)
Have working exploitation paths (not just theoretical)
Invocation
/validate <target_path> [--vuln-type <type>] [--findings <findings.json>]
Parameters:
target_path: Directory or file to analyze
--vuln-type: Optional focus (e.g., command_injection, sql_injection, xss)
--findings: Optional pre-existing findings to validate (skips Stage 0/A)
Examples:
/validate /home/user/webapp --vuln-type command_injection
/validate /home/user/binary_app --vuln-type format_string
/validate /path/to/code --findings initial_scan.json
Pipeline Stages
Stage 0: Inventory
Build complete function inventory with checklist.json
Stage A: One-Shot
Quick verification - attempt PoC for each candidate
Stage B: Systematic Process
Build attack trees and test hypotheses for unproven findings
Stage C: Sanity Check
Validate findings against actual code to catch hallucinations
Stage D: Ruling
Filter out test code and unrealistic preconditions
Stage E: Feasibility
Run exploit feasibility analysis for memory corruption (binary analysis)
Final Report
Generate comprehensive validation report
Shared Context (MUST-GATEs)
Before executing ANY stage, load:
.claude/skills/exploitability-validation/SKILL.md
This contains:
[CONFIG]: Configuration settings
[EXEC]: Execution rules
[GATES]: MUST-GATEs 1-6 that apply to ALL stages
[REMIND]: Critical reminders
MUST-GATEs Overview
GATE-1: Assume Exploitable
Treat all findings as exploitable until proven otherwise. Burden of proof is on disproving, not proving.
GATE-2
Verify all uncertain claims. No “likely”, “probably”, or “appears to” without verification.
GATE-3: Document Everything
Update working documents after every action. Maintain audit trail.
GATE-4
Every claim needs evidence. No assumptions without verification.
GATE-5
Check ALL code per checklist.json. No random sampling or incomplete coverage.
GATE-6
Working PoC or concrete disproof required. No theoretical assessments.
Stage Details
Stage 0: Inventory
Load: .claude/skills/exploitability-validation/stage-0-inventory.md
Execution:
Enumerate all files in target_path
Exclude test/mock files
Extract functions per file
Write checklist.json
Output: checklist.json with complete function inventory
Stage A: One-Shot
Load: .claude/skills/exploitability-validation/stage-a-oneshot.md
Execution:
Assess each function for vuln_type
Attempt PoC for candidates
Write findings.json
Routing:
All PoCs succeed → Skip to Stage C
Some “not_disproven” → Continue to Stage B
All disproven → Report “no exploitable findings” and exit
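The routing rules above amount to a small decision function. The sketch below assumes each finding carries one of the statuses `confirmed`, `not_disproven`, or `disproven` (the names used in the report counts); the function name and the return labels are illustrative. Two cases are not spelled out in the rules and are assumptions here: an empty candidate list exits, and a mix of confirmed and disproven findings with nothing left undecided proceeds to Stage C.

```python
from typing import Iterable


def route_after_stage_a(statuses: Iterable[str]) -> str:
    """Return the next step after Stage A: 'B', 'C', or 'EXIT'."""
    statuses = list(statuses)
    # Nothing survived: report "no exploitable findings" and stop.
    if not statuses or all(s == "disproven" for s in statuses):
        return "EXIT"
    # Anything still undecided needs the systematic process.
    if any(s == "not_disproven" for s in statuses):
        return "B"
    # Every remaining finding has a working PoC.
    return "C"
```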
Stage B: Systematic Process
Load: .claude/skills/exploitability-validation/stage-b-process.md
Execution:
Build attack trees for “not_disproven” findings
Form and test hypotheses
Track PROXIMITY metrics
Attempt multiple attack paths
Update working documents
Output:
findings.json (updated)
attack-tree.json
hypotheses.json
disproven.json
attack-paths.json
attack-surface.json
Stage C: Sanity Check
Load: .claude/skills/exploitability-validation/stage-c-sanity.md
Execution:
Verify files exist
Verify code matches verbatim
Verify flow is real
Verify code is reachable
Updates: findings.json with sanity_check results
Findings that fail the sanity check (hallucinations) are removed from active consideration.
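The first two checks (the file exists, the quoted code matches verbatim) are mechanical and can be sketched as below. The finding fields `file`, `line`, and `code` are assumed names, not the documented findings.json schema, and the comparison ignores leading and trailing whitespace as a simplification. Verifying that the data flow is real and that the code is reachable requires actual analysis and is out of scope for this sketch.

```python
from pathlib import Path


def sanity_check(finding: dict, root: str) -> dict:
    """Check that a finding points at code that actually exists.

    Assumed finding fields: 'file' (relative path), 'line' (1-based),
    'code' (the quoted source line).
    """
    path = Path(root) / finding["file"]
    if not path.is_file():
        return {"passed": False, "reason": "file_not_found"}

    lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
    line_no = finding["line"]
    if not 1 <= line_no <= len(lines):
        return {"passed": False, "reason": "line_out_of_range"}

    # Compare the quoted snippet with the real line, ignoring indentation.
    if lines[line_no - 1].strip() != finding["code"].strip():
        return {"passed": False, "reason": "code_mismatch"}

    return {"passed": True, "reason": None}
```

A finding that returns `passed: False` here would be treated as a potential hallucination and dropped from active consideration.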
Stage D: Ruling
Load: .claude/skills/exploitability-validation/stage-d-ruling.md
Execution:
Check for test/mock/example code
Check for unrealistic preconditions
Check for hedging language
Output: findings.json with CONFIRMED findings only
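Two of the ruling checks lend themselves to a simple filter, sketched below under stated assumptions. The path patterns, the `description` field, and the issue labels are hypothetical; the hedging terms are the ones the gates call out ("likely", "probably", "appears to"). Judging whether a precondition is unrealistic needs human or agent reasoning and is not attempted here. Whether a hedged finding is dropped or sent back for verification is defined in stage-d-ruling.md; this sketch simply withholds it.

```python
import re
from typing import List

# Hedging terms named in the gates; matched as whole words, case-insensitively.
HEDGING = re.compile(r"\b(likely|probably|appears to)\b", re.IGNORECASE)

# Assumed patterns for test, mock, and example code.
TEST_PATH = re.compile(
    r"(^|/)(tests?|mocks?|examples?)/|(^|/)test_[^/]*$", re.IGNORECASE
)


def ruling_issues(finding: dict) -> List[str]:
    """Return the reasons a finding should not be ruled CONFIRMED."""
    issues = []
    if TEST_PATH.search(finding["file"].replace("\\", "/")):
        issues.append("test_or_example_code")
    if HEDGING.search(finding.get("description", "")):
        issues.append("hedging_language")
    return issues


def apply_ruling(findings: List[dict]) -> List[dict]:
    """Keep only findings with no ruling issues."""
    return [f for f in findings if not ruling_issues(f)]
```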
Stage E: Feasibility (Memory Corruption Only)
Load: .claude/skills/exploitability-validation/stage-e-feasibility.md
Applies to:
buffer_overflow
heap_overflow
format_string
use_after_free
double_free
integer_overflow
out_of_bounds_read/write
Skip for:
command_injection
sql_injection
xss
path_traversal
ssrf
deserialization
Execution:
from packages.exploit_feasibility import (
    analyze_binary,
    format_analysis_summary,
    save_exploit_context,
)

# confirmed_findings, binary_path, and MEMORY_CORRUPTION_TYPES are assumed to be
# supplied by the earlier stages and the skill configuration.
for finding in confirmed_findings:
    if finding.vuln_type in MEMORY_CORRUPTION_TYPES:
        result = analyze_binary(binary_path, vuln_type=finding.vuln_type)
        context_file = save_exploit_context(binary_path)
        finding.feasibility = {
            'verdict': result.verdict,  # Likely, Difficult, Unlikely
            'chain_breaks': result.chain_breaks,
            'what_would_help': result.what_would_help,
            'context_file': context_file,
        }
        # Update final status from the feasibility verdict
        if result.verdict == 'Likely':
            finding.final_status = 'EXPLOITABLE'
        elif result.verdict == 'Difficult':
            finding.final_status = 'CONFIRMED_CONSTRAINED'
        else:
            finding.final_status = 'CONFIRMED_BLOCKED'
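The status logic can be isolated from the binary analysis and exercised on its own. The sketch below is self-contained and makes a few assumptions: `MEMORY_CORRUPTION_TYPES` is built from the "Applies to" list, with `out_of_bounds_read/write` split into two identifiers, and findings outside that set keep the plain `CONFIRMED` status that appears in the report template. The helper name `final_status` is illustrative.

```python
from typing import Optional

# Built from the "Applies to" list; splitting read/write is an assumption.
MEMORY_CORRUPTION_TYPES = frozenset({
    "buffer_overflow",
    "heap_overflow",
    "format_string",
    "use_after_free",
    "double_free",
    "integer_overflow",
    "out_of_bounds_read",
    "out_of_bounds_write",
})

VERDICT_TO_STATUS = {
    "Likely": "EXPLOITABLE",
    "Difficult": "CONFIRMED_CONSTRAINED",
}


def final_status(vuln_type: str, verdict: Optional[str] = None) -> str:
    """Map a Stage E verdict to a final status for one finding."""
    if vuln_type not in MEMORY_CORRUPTION_TYPES:
        # Stage E is skipped; the Stage D ruling stands.
        return "CONFIRMED"
    # Any other verdict, including 'Unlikely', is treated as blocked.
    return VERDICT_TO_STATUS.get(verdict, "CONFIRMED_BLOCKED")
```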
Working Directory Structure
.out/exploitability-validation-20260304_140000/
├── checklist.json # Stage 0 output
├── findings.json # Updated through stages
├── attack-tree.json # Stage B
├── hypotheses.json # Stage B
├── disproven.json # Stage B
├── attack-paths.json # Stage B
├── attack-surface.json # Stage B
├── exploit-context.json # Stage E (if applicable)
└── validation-report.md # Final report
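Creating the timestamped working directory could look like the sketch below. The function name and the `base` and `now` parameters are illustrative; the timestamp format follows the directory name shown in the tree above (`YYYYMMDD_HHMMSS`). The example executions elsewhere in this document show a hyphen between date and time instead, so the exact separator should be treated as unconfirmed.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def create_run_dir(base: str = ".out", now: Optional[datetime] = None) -> Path:
    """Create and return a fresh working directory for one validation run."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")
    run_dir = Path(base) / f"exploitability-validation-{stamp}"
    # exist_ok=False so two runs never silently share working documents.
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir
```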
# Exploitability Validation Report
## Summary
- Target: <target_path>
- Vulnerability Type: <vuln_type>
- Timestamp: <timestamp>
## Results
- Total functions analyzed: N
- Initial candidates: N
- After Stage A (One-Shot): N confirmed, N not_disproven, N disproven
- After Stage B (Process): N confirmed, N disproven
- After Stage C (Sanity): N passed, N failed (hallucinations)
- After Stage D (Ruling): N confirmed, N ruled out
- After Stage E (Feasibility): N exploitable, N constrained, N blocked, N not applicable
## Confirmed Findings
### FIND-001: <vuln_type> in <file>:<line>
- Function: <function_name>
- Proof: <code snippet>
- PoC: <poc description>
- Final Status: <EXPLOITABLE|CONFIRMED_CONSTRAINED|CONFIRMED_BLOCKED|CONFIRMED>
- Feasibility: <verdict if memory corruption>
- Chain Breaks: <list if applicable>
- Recommendation: <next steps>
## Ruled Out Findings
<list with reasons>
## Coverage
- checklist.json compliance: X/Y functions checked
Example Executions
Web Vulnerability
/validate /home/user/webapp --vuln-type command_injection
Phase 0: Created .out/exploitability-validation-20260122-143022/
Phase 1: Stage 0 complete - 15 files, 42 functions in checklist.json
Phase 2: Stage A complete - 3 candidates, 1 PoC success, 2 not_disproven
Phase 3: Stage B complete - 1 more confirmed, 1 disproven
Phase 4: Stage C complete - 2/2 passed sanity check
Phase 5: Stage D complete - 2/2 confirmed
Phase 6: Stage E skipped (command_injection is not memory corruption)
Phase 7: Report written to validation-report.md
Result: 2 CONFIRMED command injection vulnerabilities
Memory Corruption
/validate /home/user/binary_app --vuln-type format_string
Phase 0: Created .out/exploitability-validation-20260122-150000/
Phase 1: Stage 0 complete - 8 files, 23 functions in checklist.json
Phase 2: Stage A complete - 1 candidate, PoC shows %p leak works
Phase 3: Stage B skipped (PoC success in Stage A)
Phase 4: Stage C complete - 1/1 passed sanity check
Phase 5: Stage D complete - 1/1 confirmed
Phase 6: Stage E - Running exploit feasibility analysis...
Binary: /home/user/binary_app/build/vuln
Verdict: Difficult
Chain breaks: Full RELRO (GOT blocked), glibc 2.38 (%n blocked)
What would help: Older glibc, info leak for ASLR bypass
Context saved: .out/.../exploit-context.json
Phase 7: Report written to validation-report.md
Result: 1 CONFIRMED_CONSTRAINED format string vulnerability
Recommendation: Focus on info leak, or test in Docker with glibc 2.31
Error Handling
File not found: Stop, report which file, ask user for correct path
Stage fails: Report which stage, what failed, offer to retry or skip
No findings: Report “no exploitable vulnerabilities found” (valid outcome)
Sanity check failures: Report as potential hallucinations, continue with valid findings
Integration with /agentic
The /agentic command now automatically runs exploitability validation (Phase 2) between scanning and analysis. Use --skip-validation to bypass.