Overview
RAPTOR provides truly autonomous security analysis powered by Large Language Models (LLMs). The system analyzes vulnerabilities with deep context awareness, including dataflow paths, sanitizer effectiveness, and exploitability assessment, with no templates. This is autonomous analysis: the LLM makes real security engineering decisions based on actual code, not pattern matching.
Key Features
- LLM-Powered Analysis: Claude, GPT-4, or local models (Ollama/DeepSeek/Qwen)
- Dataflow Path Validation: Tracks tainted data from source to sink
- Sanitizer Bypass Detection: Identifies ineffective or bypassable mitigations
- Exploitability Scoring: Rates vulnerability severity and feasibility
- Cost Tracking: Built-in budget management for API usage
- Automatic Fallback: Gracefully handles model failures
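The automatic-fallback behavior listed above can be sketched as follows. This is an illustrative sketch only: the model names, `FALLBACK_CHAIN`, and the `call_model` stub are assumptions for demonstration, not RAPTOR's actual API.

```python
# Sketch of automatic model fallback: try each provider in order and
# fall back to the next when a call fails. All names here are illustrative.
FALLBACK_CHAIN = ["claude-sonnet", "gpt-4", "deepseek-coder:33b"]

def call_model(model: str, prompt: str) -> str:
    # Stand-in for a real provider call; raise to simulate an outage.
    if model == "claude-sonnet":
        raise ConnectionError("provider unavailable")
    return f"[{model}] analysis of: {prompt}"

def analyze_with_fallback(prompt: str) -> str:
    last_error = None
    for model in FALLBACK_CHAIN:
        try:
            return call_model(model, prompt)
        except Exception as exc:
            last_error = exc  # remember why this model failed, try the next
    raise RuntimeError(f"all models failed: {last_error}")

result = analyze_with_fallback("SQL injection in /login")
```

Because the first provider in the chain raises, the call transparently lands on the second model instead of aborting the analysis.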
Analysis Workflow
The autonomous agent follows this workflow:

1. Context Extraction

The agent automatically extracts full context for each vulnerability.

2. Dataflow Path Analysis

For vulnerabilities with dataflow paths, the agent traces data from source to sink.

3. Deep Validation

The LLM performs rigorous validation to separate real vulnerabilities from false positives.

Source Control Analysis
Is the data actually attacker-controlled?
- HTTP request, user input, file upload → Attacker controlled ✓
- Config file, environment variable → Requires access first ⚠️
- Hardcoded constant, internal variable → False positive ✗
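The three source categories above can be sketched as a simple classifier. The category sets come from this document; the function itself is a hypothetical illustration, not RAPTOR's API.

```python
# Map taint-source kinds to the attacker-control verdicts described above.
ATTACKER_CONTROLLED = {"http_request", "user_input", "file_upload"}
NEEDS_ACCESS_FIRST = {"config_file", "environment_variable"}

def classify_source(kind: str) -> str:
    if kind in ATTACKER_CONTROLLED:
        return "attacker-controlled"
    if kind in NEEDS_ACCESS_FIRST:
        return "requires prior access"
    # e.g. hardcoded constant or internal variable
    return "likely false positive"
```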
Sanitizer Effectiveness
Can the sanitizers be bypassed?
- Analyze what each sanitizer actually does
- Check for encoding bypasses (URL encoding, double encoding)
- Verify coverage across all code paths
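A concrete instance of the double-encoding bypass mentioned above: a payload encoded twice survives one round of URL decoding and only becomes dangerous after a second decode deeper in the stack, which is why a single-pass decoder is not a safe sanitizer.

```python
from urllib.parse import unquote

payload = "%2527"        # double-encoded single quote
once = unquote(payload)  # -> "%27"  (looks harmless to a naive filter)
twice = unquote(once)    # -> "'"    (the actual injection character)
```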
Reachability Analysis
Can an attacker trigger this code path?
- Check for authentication/authorization barriers
- Identify prerequisites that block exploitation
- Verify the code is used in production
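Putting the three checks above together, the validation decision can be sketched as a conjunction: a finding is only confirmed if the source is attacker-controlled, the sanitizers are bypassable (or absent), and the path is reachable. The `Finding` shape and field names are illustrative, not RAPTOR's output format.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    source_attacker_controlled: bool  # Source Control Analysis verdict
    sanitizers_bypassable: bool       # Sanitizer Effectiveness verdict
    path_reachable: bool              # Reachability Analysis verdict

def is_real_vulnerability(f: Finding) -> bool:
    # A single failing check is enough to classify a false positive.
    return (f.source_attacker_controlled
            and f.sanitizers_bypassable
            and f.path_reachable)
```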
LLM Provider Support
Anthropic Claude (Recommended)
OpenAI GPT-4
Ollama (Local/Free)
- deepseek-coder:33b - Best for code analysis
- qwen2.5-coder:32b - Strong at security tasks
- codellama:70b - Meta's code model
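The model-to-task guidance above can be captured in a small lookup. The task names and the helper are this sketch's own illustrative assumptions; only the model tags come from the list above.

```python
# Map analysis tasks to the local Ollama models recommended above.
LOCAL_MODELS = {
    "code_analysis": "deepseek-coder:33b",   # best for code analysis
    "security_tasks": "qwen2.5-coder:32b",   # strong at security tasks
    "general_code": "codellama:70b",         # Meta's code model
}

def pick_local_model(task: str) -> str:
    # Default to the code-analysis model for unknown tasks.
    return LOCAL_MODELS.get(task, "deepseek-coder:33b")
```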
Configuration
LLM Config
The LLM client automatically handles model selection, fallback, and retry.

Cost Tracking

Built-in cost tracking for budget management.

Analysis Output

Structured Analysis

The LLM returns structured analysis with confidence scores.

Output Files

The agent saves detailed results.

Budget Management
Setting Budget Limits
Cost Estimates
Per Vulnerability Analysis:
- Context extraction: ~2K tokens
- Deep validation: ~4K tokens
- Exploit generation: ~6K tokens
- Patch generation: ~4K tokens
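Using the figures above, a back-of-envelope cost per fully analyzed finding is easy to compute. The per-token price below is an assumption for illustration only; substitute your provider's actual pricing.

```python
# Token estimates per analysis stage, from the list above.
TOKENS = {
    "context_extraction": 2_000,
    "deep_validation": 4_000,
    "exploit_generation": 6_000,
    "patch_generation": 4_000,
}
PRICE_PER_1K_TOKENS = 0.01  # assumed blended price in USD, illustrative

total_tokens = sum(TOKENS.values())                     # 16,000 tokens
est_cost = total_tokens / 1_000 * PRICE_PER_1K_TOKENS   # $0.16 at the assumed price
```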
Best Practices
Use Dataflow-Enabled Scanners
Scanners with dataflow support (CodeQL, Semgrep) provide much better context:
- Source/sink tracking
- Sanitizer detection
- Intermediate transformation steps
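Dataflow context of this shape might be represented roughly as follows. The structure and field names are an illustrative assumption, not any scanner's actual output format.

```python
from dataclasses import dataclass, field

@dataclass
class DataflowPath:
    source: str                                 # where tainted data enters
    sink: str                                   # where it is dangerously used
    sanitizers: list = field(default_factory=list)
    steps: list = field(default_factory=list)   # intermediate transformations

path = DataflowPath(
    source="request.args['q']",
    sink="cursor.execute(query)",
    sanitizers=["escape_quotes"],
    steps=["query = 'SELECT * FROM t WHERE c = ' + q"],
)
```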
Prioritize High-Severity Findings
Start with high-severity findings to maximize ROI:
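Severity-first triage can be as simple as sorting findings before they are submitted for analysis, so high-severity issues consume budget first. The severity scale below is an assumption for illustration.

```python
# Lower rank = analyzed earlier; scale is illustrative.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}

findings = [
    {"id": "F1", "severity": "low"},
    {"id": "F2", "severity": "critical"},
    {"id": "F3", "severity": "high"},
]
ordered = sorted(findings, key=lambda f: SEVERITY_RANK[f["severity"]])
# ordered ids: F2, F3, F1
```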
Review LLM Decisions
The LLM is very good but not infallible:
- Check false positive determinations
- Verify exploitability assessments
- Review generated exploit code
- Test patches before deploying
Use Frontier Models for Exploits
Local models work well for analysis but struggle with exploit generation:
- Analysis/patching: Ollama is fine
- Exploit generation: Use Claude or GPT-4
- Critical findings: Always use frontier models
Example: End-to-End Analysis
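A minimal sketch of an end-to-end run, tying the pieces together: load scanner findings, validate each with the LLM, and keep confirmed ones while staying within budget. Every name here (`run_analysis`, `llm_validate`, the finding fields) is a hypothetical illustration, not RAPTOR's actual API.

```python
def llm_validate(finding) -> bool:
    # Stand-in for the real LLM validation call; here we simply echo the
    # scanner's own attacker-control verdict.
    return finding.get("attacker_controlled", False)

def run_analysis(findings, budget_usd=5.0, cost_per_call=0.5):
    confirmed, spent = [], 0.0
    for f in findings:
        if spent + cost_per_call > budget_usd:
            break  # budget exhausted; stop before the next call
        spent += cost_per_call
        if llm_validate(f):
            confirmed.append(f["id"])
    return confirmed, spent

confirmed, spent = run_analysis([
    {"id": "F1", "attacker_controlled": True},
    {"id": "F2", "attacker_controlled": False},
])
```

The budget check runs before each call, mirroring the cost-tracking behavior described earlier, so a run degrades gracefully instead of overspending.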
See Also
- Exploit Generation: Generate working exploit PoCs
- Patch Creation: Create secure patches
- Crash Analysis: Analyze binary crashes