
Overview

The LLM Analysis package provides agentic security analysis using large language models. Unlike template-based tools, it reasons about each vulnerability in context, generates working exploits, and produces targeted patches.

Purpose

AI-powered autonomous security analysis with:
  • LLM-powered analysis: No heuristics, genuine reasoning
  • Context-aware exploits: Generated from actual code, not templates
  • Intelligent patching: Understands security context
  • Multi-model support: Claude, GPT-4, Ollama (DeepSeek/Qwen)
  • Automatic fallback: Cost optimization and reliability

Architecture

packages/llm_analysis/
├── agent.py                # Autonomous security agent
├── crash_agent.py          # Crash analysis agent
├── orchestrator.py         # Workflow orchestration
└── llm/
    ├── client.py           # LLM client with fallback
    ├── config.py           # Model configuration
    └── providers.py        # Provider implementations

Quick Start

Analyze Findings

# Analyze SARIF findings with LLM
python3 -m packages.llm_analysis.agent \
  --repo /path/to/code \
  --sarif out/combined.sarif \
  --max-findings 10

Generate Exploits

# Analyze + generate exploits for exploitable findings
python3 -m packages.llm_analysis.agent \
  --repo /path/to/code \
  --sarif out/combined.sarif \
  --generate-exploits

Create Patches

# Analyze + create patches
python3 -m packages.llm_analysis.agent \
  --repo /path/to/code \
  --sarif out/combined.sarif \
  --generate-patches

Python API

Autonomous Security Agent

from pathlib import Path
from packages.llm_analysis import AutonomousSecurityAgentV2

# Initialize agent
agent = AutonomousSecurityAgentV2(
    repo_path=Path("/path/to/code"),
    out_dir=Path("out/analysis")
)

# Analyze SARIF findings
results = agent.analyze_sarif(
    sarif_path=Path("out/combined.sarif"),
    max_findings=10,
    generate_exploits=True,
    generate_patches=True
)

# Review results
for result in results:
    print(f"Finding: {result['finding_id']}")
    print(f"Exploitable: {result['exploitable']}")
    print(f"Score: {result['exploitability_score']}")
    if result['exploit_code']:
        print(f"Exploit: {result['exploit_code'][:100]}...")

Analyze Single Vulnerability

from pathlib import Path
from packages.llm_analysis.agent import VulnerabilityContext

# Create vulnerability context
context = VulnerabilityContext(
    finding={
        "finding_id": "sqli-001",
        "rule_id": "sql-injection",
        "file": "src/api/users.py",
        "startLine": 45,
        "endLine": 47,
        "message": "SQL injection vulnerability",
        "snippet": 'query = f"SELECT * FROM users WHERE id={user_id}"'
    },
    repo_path=Path("/path/to/code")
)

# Read vulnerable code
context.read_vulnerable_code()

# Analyze with LLM
analysis = agent.analyze_vulnerability(context)

print(f"Exploitable: {context.exploitable}")
print(f"Analysis: {context.analysis}")

LLM Client

from packages.llm_analysis.llm import LLMClient, LLMConfig

# Initialize with multi-model support
config = LLMConfig(
    primary_model="claude-3-7-sonnet-20250219",
    fallback_model="gpt-4o",
    enable_local_fallback=True,
    local_model="deepseek-r1:14b"
)

client = LLMClient(config)

# Query with automatic fallback
response = client.query(
    system_prompt="You are a security analyst.",
    user_prompt="Analyze this SQL injection vulnerability...",
    temperature=0.3
)

print(response['content'])
print(f"Model used: {response['model']}")
print(f"Cost: ${response['cost']:.4f}")

Core Classes

AutonomousSecurityAgentV2

Main agent for vulnerability analysis.
class AutonomousSecurityAgentV2:
    def __init__(
        self,
        repo_path: Path,
        out_dir: Path,
        llm_config: Optional[LLMConfig] = None
    )
    
    def analyze_sarif(
        self,
        sarif_path: Path,
        max_findings: int = 10,
        generate_exploits: bool = False,
        generate_patches: bool = False
    ) -> List[Dict[str, Any]]
    
    def analyze_vulnerability(
        self,
        context: VulnerabilityContext
    ) -> Dict[str, Any]
    
    def generate_exploit(
        self,
        context: VulnerabilityContext
    ) -> Optional[str]
    
    def generate_patch(
        self,
        context: VulnerabilityContext
    ) -> Optional[str]

VulnerabilityContext

Complete context for vulnerability analysis.

  • finding (Dict[str, Any], required): SARIF finding data
  • repo_path (Path, required): Repository path for reading source code
  • exploitable (bool): Whether the vulnerability is exploitable
  • exploitability_score (float): Score from 0.0 to 1.0
  • exploit_code (Optional[str]): Generated exploit code
  • patch_code (Optional[str]): Generated patch code
  • analysis (Dict[str, Any]): Detailed LLM analysis

LLMClient

Multi-model LLM client with fallback.
class LLMClient:
    def __init__(self, config: Optional[LLMConfig] = None)
    
    def query(
        self,
        system_prompt: str,
        user_prompt: str,
        temperature: float = 0.3,
        max_tokens: int = 4000
    ) -> Dict[str, Any]
    
    def query_with_fallback(
        self,
        system_prompt: str,
        user_prompt: str,
        **kwargs
    ) -> Dict[str, Any]

Both query methods return a dict with:

  • content (str): LLM response text
  • model (str): Model that generated the response
  • cost (float): Estimated cost in USD
  • tokens (Dict[str, int]): Token usage (prompt, completion, total)

LLMConfig

Configuration for the multi-model setup.

  • primary_model (str, default: "claude-3-7-sonnet-20250219"): Primary model to use
  • fallback_model (str, default: "gpt-4o"): Fallback if the primary fails
  • enable_local_fallback (bool, default: True): Enable local model fallback (Ollama)
  • local_model (str, default: "deepseek-r1:14b"): Local model name for Ollama
  • max_retries (int, default: 3): Max retry attempts per model

Supported Models

Cloud Models

Provider    Model                        Context   Cost per 1M tokens (input / output)
Anthropic   claude-3-7-sonnet-20250219   200K      $3.00 / $15.00
Anthropic   claude-3-5-sonnet-20241022   200K      $3.00 / $15.00
OpenAI      gpt-4o                       128K      $2.50 / $10.00
OpenAI      gpt-4o-mini                  128K      $0.15 / $0.60
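
As an illustration of how these rates translate into query costs, the sketch below (a hypothetical helper, not part of the package) multiplies token counts by the input/output rates in the table:

```python
# Hypothetical helper: estimate a single query's cost from the table above.
# Each entry is (input_rate, output_rate) in USD per 1M tokens.
RATES = {
    "claude-3-7-sonnet-20250219": (3.00, 15.00),
    "claude-3-5-sonnet-20241022": (3.00, 15.00),
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated cost in USD for one query to the given model."""
    in_rate, out_rate = RATES[model]
    return (prompt_tokens * in_rate + completion_tokens * out_rate) / 1_000_000

# Example: 2,000 prompt tokens + 1,000 completion tokens on Claude 3.7 Sonnet
# works out to (2000 * 3.00 + 1000 * 15.00) / 1e6 dollars.
```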

Local Models (Ollama)

Model               Size   Performance
deepseek-r1:14b     14B    Excellent reasoning
qwen2.5:14b         14B    Good general purpose
qwen2.5-coder:14b   14B    Code specialized

Configuration

Environment Variables

# API Keys
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...

# Ollama (local)
export OLLAMA_BASE_URL=http://localhost:11434

# Model selection
export LLM_PRIMARY_MODEL=claude-3-7-sonnet-20250219
export LLM_FALLBACK_MODEL=gpt-4o
export LLM_LOCAL_MODEL=deepseek-r1:14b
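
A minimal sketch of wiring these variables into a config at startup. The LLMConfig stand-in below mirrors the fields documented on this page so the example runs standalone; the real class lives in llm/config.py:

```python
import os
from dataclasses import dataclass

# Stand-in mirroring the documented LLMConfig fields (illustrative only).
@dataclass
class LLMConfig:
    primary_model: str = "claude-3-7-sonnet-20250219"
    fallback_model: str = "gpt-4o"
    enable_local_fallback: bool = True
    local_model: str = "deepseek-r1:14b"

def config_from_env() -> LLMConfig:
    """Build a config from the environment, falling back to the defaults."""
    return LLMConfig(
        primary_model=os.environ.get("LLM_PRIMARY_MODEL", "claude-3-7-sonnet-20250219"),
        fallback_model=os.environ.get("LLM_FALLBACK_MODEL", "gpt-4o"),
        local_model=os.environ.get("LLM_LOCAL_MODEL", "deepseek-r1:14b"),
    )
```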

Model Selection Strategy

  1. Try primary model (Claude 3.7 Sonnet)
  2. If fails, try fallback (GPT-4o)
  3. If both fail, try local (DeepSeek R1)
  4. Retry each model up to 3 times
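
The strategy above amounts to a nested retry loop. A simplified sketch with stand-in providers (the real implementation lives in llm/client.py):

```python
from typing import Callable, Dict, List

def query_with_fallback(
    providers: List[Callable[[str], str]],  # ordered: primary, fallback, local
    prompt: str,
    max_retries: int = 3,
) -> Dict[str, str]:
    """Try each provider in order, retrying each up to max_retries times."""
    last_error = None
    for provider in providers:
        for _attempt in range(max_retries):
            try:
                return {"content": provider(prompt), "model": provider.__name__}
            except Exception as exc:  # illustrative: catch-all for the sketch
                last_error = exc
    raise RuntimeError(f"All models failed: {last_error}")

# Stand-in providers: the first always fails, the second succeeds.
def claude(prompt: str) -> str:
    raise TimeoutError("rate limited")

def gpt_4o(prompt: str) -> str:
    return "analysis: exploitable"
```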

Analysis Output

Vulnerability Analysis

{
  "finding_id": "sqli-001",
  "rule_id": "sql-injection",
  "file": "src/api/users.py",
  "startLine": 45,
  "exploitable": true,
  "exploitability_score": 0.95,
  "analysis": {
    "vulnerability_type": "SQL Injection",
    "severity": "critical",
    "attack_vector": "Network",
    "attack_complexity": "Low",
    "privileges_required": "None",
    "reasoning": "User input directly interpolated into SQL query...",
    "exploitation_difficulty": "Easy",
    "impact": "Complete database compromise"
  }
}
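
Downstream tooling often triages these records by score. A small hypothetical filter over the output format shown above:

```python
from typing import Any, Dict, List

def triage(results: List[Dict[str, Any]], min_score: float = 0.7) -> List[str]:
    """Return IDs of findings that are exploitable and score at or above min_score."""
    return [
        r["finding_id"]
        for r in results
        if r.get("exploitable") and r.get("exploitability_score", 0.0) >= min_score
    ]

# Sample records in the analysis output format above (values are made up).
findings = [
    {"finding_id": "sqli-001", "exploitable": True, "exploitability_score": 0.95},
    {"finding_id": "xss-002", "exploitable": True, "exploitability_score": 0.40},
    {"finding_id": "info-003", "exploitable": False, "exploitability_score": 0.10},
]
```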

Exploit Generation

# Generated exploit code
exploit = """
import requests

# SQL injection exploit
url = "http://target.com/api/users"
payload = "1' OR '1'='1' UNION SELECT username,password FROM users--"

response = requests.get(url, params={'id': payload})
print(response.json())
"""

Patch Generation

# Generated patch
patch = """
# Before (vulnerable)
query = f"SELECT * FROM users WHERE id={user_id}"

# After (patched)
from sqlalchemy import text
query = text("SELECT * FROM users WHERE id=:user_id")
result = db.execute(query, {'user_id': user_id})
"""

Dataflow Analysis

The agent supports advanced dataflow analysis from CodeQL:
# Vulnerability with dataflow
context = VulnerabilityContext(
    finding={
        "has_dataflow": True,
        "dataflow_path": {
            "source": {
                "file": "src/api/routes.py",
                "line": 23,
                "message": "User input from request.args"
            },
            "sink": {
                "file": "src/db/queries.py",
                "line": 45,
                "message": "SQL query execution"
            },
            "steps": [
                {"file": "src/api/routes.py", "line": 25},
                {"file": "src/api/validation.py", "line": 12},
                {"file": "src/db/queries.py", "line": 43}
            ]
        }
    },
    repo_path=Path("/path/to/code")
)

# Agent will analyze the complete dataflow path
analysis = agent.analyze_vulnerability(context)
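
One way a source-to-sink path like this might be flattened into prompt text (a sketch; the package's actual prompt format may differ):

```python
from typing import Any, Dict

def render_dataflow(path: Dict[str, Any]) -> str:
    """Flatten a dataflow path into numbered source -> steps -> sink lines."""
    lines = [f"SOURCE {path['source']['file']}:{path['source']['line']} - {path['source']['message']}"]
    for i, step in enumerate(path.get("steps", []), start=1):
        lines.append(f"  step {i}: {step['file']}:{step['line']}")
    lines.append(f"SINK   {path['sink']['file']}:{path['sink']['line']} - {path['sink']['message']}")
    return "\n".join(lines)

# Abbreviated version of the dataflow_path shown above.
path = {
    "source": {"file": "src/api/routes.py", "line": 23, "message": "User input from request.args"},
    "sink": {"file": "src/db/queries.py", "line": 45, "message": "SQL query execution"},
    "steps": [{"file": "src/api/validation.py", "line": 12}],
}
```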

Integration

With Static Analysis

from pathlib import Path
from packages.static_analysis import main as scan_repo
from packages.llm_analysis import AutonomousSecurityAgentV2

# 1. Scan repository
scan_repo()  # Generates SARIF

# 2. Analyze with LLM
agent = AutonomousSecurityAgentV2(
    repo_path=Path("/path/to/code"),
    out_dir=Path("out/analysis")
)

results = agent.analyze_sarif(
    sarif_path=Path("out/combined.sarif"),
    generate_exploits=True,
    generate_patches=True
)

With CodeQL

from pathlib import Path
from packages.codeql import CodeQLAgent
from packages.llm_analysis import AutonomousSecurityAgentV2

# 1. Run CodeQL
codeql = CodeQLAgent(repo_path=Path("/path/to/code"))
result = codeql.run()

# 2. Analyze CodeQL findings with LLM
agent = AutonomousSecurityAgentV2(
    repo_path=Path("/path/to/code"),
    out_dir=Path("out/analysis")
)
for sarif_file in result.sarif_files:
    agent.analyze_sarif(sarif_file, generate_exploits=True)

Performance

Analysis Speed

  • Per finding: 10-30 seconds (depends on model)
  • Batch (10 findings): 3-5 minutes
  • With exploits: +20-40 seconds per exploitable finding

Cost Estimates

  • Claude 3.7 Sonnet: ~$0.05-0.15 per finding
  • GPT-4o: ~$0.03-0.10 per finding
  • Local (Ollama): $0.00 (free, but slower)
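
Back-of-envelope arithmetic for budgeting a batch, using the per-finding ranges above (illustrative only):

```python
from typing import Tuple

def batch_cost_range(
    num_findings: int, low_per_finding: float, high_per_finding: float
) -> Tuple[float, float]:
    """Return a (low, high) USD estimate for analyzing num_findings findings."""
    return (num_findings * low_per_finding, num_findings * high_per_finding)

# Example: 10 findings on Claude 3.7 Sonnet at ~$0.05-0.15 per finding
# comes to roughly $0.50-$1.50 for the batch.
```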

Best Practices

  1. Start with max_findings=10 for initial assessment
  2. Enable exploits for critical findings only
  3. Use local models for cost-free experimentation
  4. Review patches before applying (AI can make mistakes)
  5. Combine with dataflow analysis for best results
