## Overview

The LLM Analysis package provides true agentic security analysis using large language models. Unlike template-based tools, it reasons about vulnerabilities contextually, generates working exploits, and creates intelligent patches.
## Purpose

AI-powered autonomous security analysis with:
- LLM-powered analysis: No heuristics, genuine reasoning
- Context-aware exploits: Generated from actual code, not templates
- Intelligent patching: Understands security context
- Multi-model support: Claude, GPT-4, Ollama (DeepSeek/Qwen)
- Automatic fallback: Cost optimization and reliability
## Architecture

```
packages/llm_analysis/
├── agent.py            # Autonomous security agent
├── crash_agent.py      # Crash analysis agent
├── orchestrator.py     # Workflow orchestration
└── llm/
    ├── client.py       # LLM client with fallback
    ├── config.py       # Model configuration
    └── providers.py    # Provider implementations
```
## Quick Start

### Analyze Findings

```bash
# Analyze SARIF findings with LLM
python3 -m packages.llm_analysis.agent \
  --repo /path/to/code \
  --sarif out/combined.sarif \
  --max-findings 10
```

### Generate Exploits

```bash
# Analyze + generate exploits for exploitable findings
python3 -m packages.llm_analysis.agent \
  --repo /path/to/code \
  --sarif out/combined.sarif \
  --generate-exploits
```

### Create Patches

```bash
# Analyze + create patches
python3 -m packages.llm_analysis.agent \
  --repo /path/to/code \
  --sarif out/combined.sarif \
  --generate-patches
```
## Python API

### Autonomous Security Agent

```python
from pathlib import Path

from packages.llm_analysis import AutonomousSecurityAgentV2

# Initialize agent
agent = AutonomousSecurityAgentV2(
    repo_path=Path("/path/to/code"),
    out_dir=Path("out/analysis"),
)

# Analyze SARIF findings
results = agent.analyze_sarif(
    sarif_path=Path("out/combined.sarif"),
    max_findings=10,
    generate_exploits=True,
    generate_patches=True,
)

# Review results
for result in results:
    print(f"Finding: {result['finding_id']}")
    print(f"Exploitable: {result['exploitable']}")
    print(f"Score: {result['exploitability_score']}")
    if result.get('exploit_code'):
        print(f"Exploit: {result['exploit_code'][:100]}...")
```
### Analyze Single Vulnerability

```python
from pathlib import Path

from packages.llm_analysis.agent import VulnerabilityContext

# Create vulnerability context
context = VulnerabilityContext(
    finding={
        "finding_id": "sqli-001",
        "rule_id": "sql-injection",
        "file": "src/api/users.py",
        "startLine": 45,
        "endLine": 47,
        "message": "SQL injection vulnerability",
        "snippet": 'query = f"SELECT * FROM users WHERE id={user_id}"',
    },
    repo_path=Path("/path/to/code"),
)

# Read the vulnerable code from the repository
context.read_vulnerable_code()

# Analyze with LLM (`agent` initialized as in the previous example)
analysis = agent.analyze_vulnerability(context)
print(f"Exploitable: {context.exploitable}")
print(f"Analysis: {context.analysis}")
```
### LLM Client

```python
from packages.llm_analysis.llm import LLMClient, LLMConfig

# Initialize with multi-model support
config = LLMConfig(
    primary_model="claude-3-7-sonnet-20250219",
    fallback_model="gpt-4o",
    enable_local_fallback=True,
    local_model="deepseek-r1:14b",
)
client = LLMClient(config)

# Query with automatic fallback
response = client.query(
    system_prompt="You are a security analyst.",
    user_prompt="Analyze this SQL injection vulnerability...",
    temperature=0.3,
)

print(response['content'])
print(f"Model used: {response['model']}")
print(f"Cost: ${response['cost']:.4f}")
```
## Core Classes

### AutonomousSecurityAgentV2

Main agent for vulnerability analysis.

```python
class AutonomousSecurityAgentV2:
    def __init__(
        self,
        repo_path: Path,
        out_dir: Path,
        llm_config: Optional[LLMConfig] = None,
    ): ...

    def analyze_sarif(
        self,
        sarif_path: Path,
        max_findings: int = 10,
        generate_exploits: bool = False,
        generate_patches: bool = False,
    ) -> List[Dict[str, Any]]: ...

    def analyze_vulnerability(
        self,
        context: VulnerabilityContext,
    ) -> Dict[str, Any]: ...

    def generate_exploit(
        self,
        context: VulnerabilityContext,
    ) -> Optional[str]: ...

    def generate_patch(
        self,
        context: VulnerabilityContext,
    ) -> Optional[str]: ...
```
### VulnerabilityContext

Complete context for vulnerability analysis. Key attributes:

- `repo_path`: Repository path for reading source code
- `exploitable`: Whether the vulnerability is exploitable
### LLMClient

Multi-model LLM client with fallback.

```python
class LLMClient:
    def __init__(self, config: LLMConfig = None): ...

    def query(
        self,
        system_prompt: str,
        user_prompt: str,
        temperature: float = 0.3,
        max_tokens: int = 4000,
    ) -> Dict[str, Any]: ...

    def query_with_fallback(
        self,
        system_prompt: str,
        user_prompt: str,
        **kwargs,
    ) -> Dict[str, Any]: ...
```

The response dict includes:

- `model`: Model that generated the response
- Token usage (prompt, completion, total)
### LLMConfig

Configuration for multi-model setup.

- `primary_model` (str, default `"claude-3-7-sonnet-20250219"`): Primary model to use
- `fallback_model` (str): Fallback if primary fails
- `enable_local_fallback` (bool): Enable local model fallback (Ollama)
- `local_model` (str, default `"deepseek-r1:14b"`): Local model name for Ollama
- Max retry attempts per model (int)
## Supported Models

### Cloud Models

| Provider | Model | Context | Cost per 1M tokens (input/output) |
|---|---|---|---|
| Anthropic | claude-3-7-sonnet-20250219 | 200K | $3.00 / $15.00 |
| Anthropic | claude-3-5-sonnet-20241022 | 200K | $3.00 / $15.00 |
| OpenAI | gpt-4o | 128K | $2.50 / $10.00 |
| OpenAI | gpt-4o-mini | 128K | $0.15 / $0.60 |

### Local Models (Ollama)

| Model | Size | Performance |
|---|---|---|
| deepseek-r1:14b | 14B | Excellent reasoning |
| qwen2.5:14b | 14B | Good general purpose |
| qwen2.5-coder:14b | 14B | Code specialized |
## Configuration

### Environment Variables

```bash
# API keys
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...

# Ollama (local)
export OLLAMA_BASE_URL=http://localhost:11434

# Model selection
export LLM_PRIMARY_MODEL=claude-3-7-sonnet-20250219
export LLM_FALLBACK_MODEL=gpt-4o
export LLM_LOCAL_MODEL=deepseek-r1:14b
```
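When these variables are unset, the client falls back to the documented defaults. A minimal sketch of that precedence, shown with a plain `os.environ` lookup for clarity (the package's actual config loading may differ):

```python
import os

# Documented defaults for the model-selection variables above;
# values set in the environment take precedence.
DEFAULTS = {
    "LLM_PRIMARY_MODEL": "claude-3-7-sonnet-20250219",
    "LLM_FALLBACK_MODEL": "gpt-4o",
    "LLM_LOCAL_MODEL": "deepseek-r1:14b",
}

settings = {key: os.environ.get(key, default) for key, default in DEFAULTS.items()}
print(settings["LLM_PRIMARY_MODEL"])
```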
### Model Selection Strategy

1. Try the primary model (Claude 3.7 Sonnet)
2. If it fails, try the fallback (GPT-4o)
3. If both fail, try the local model (DeepSeek R1)
4. Retry each model up to 3 times
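The steps above can be sketched as a simple retry-then-fall-back loop. This is a hypothetical standalone helper with stub providers, not the package's internal implementation:

```python
def query_with_fallback(providers, prompt, max_retries=3):
    """Try each (name, call) provider in order, retrying before falling back."""
    errors = []
    for name, call in providers:
        for attempt in range(1, max_retries + 1):
            try:
                return {"model": name, "content": call(prompt)}
            except Exception as exc:
                errors.append(f"{name} attempt {attempt}: {exc}")
    raise RuntimeError("All models failed:\n" + "\n".join(errors))

# Stub providers: the primary is down, the fallback answers.
def flaky_primary(prompt):
    raise ConnectionError("rate limited")

def working_fallback(prompt):
    return f"analysis of: {prompt}"

result = query_with_fallback(
    [("claude-3-7-sonnet-20250219", flaky_primary), ("gpt-4o", working_fallback)],
    "SQL injection in users.py",
)
print(result["model"])  # gpt-4o
```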
## Analysis Output

### Vulnerability Analysis

```json
{
  "finding_id": "sqli-001",
  "rule_id": "sql-injection",
  "file": "src/api/users.py",
  "startLine": 45,
  "exploitable": true,
  "exploitability_score": 0.95,
  "analysis": {
    "vulnerability_type": "SQL Injection",
    "severity": "critical",
    "attack_vector": "Network",
    "attack_complexity": "Low",
    "privileges_required": "None",
    "reasoning": "User input directly interpolated into SQL query...",
    "exploitation_difficulty": "Easy",
    "impact": "Complete database compromise"
  }
}
```
### Exploit Generation

```python
# Generated exploit code
exploit = """
import requests

# SQL injection exploit
url = "http://target.com/api/users"
payload = "1' OR '1'='1' UNION SELECT username,password FROM users--"

response = requests.get(url, params={'id': payload})
print(response.json())
"""
```
### Patch Generation

```python
# Generated patch
patch = """
# Before (vulnerable)
query = f"SELECT * FROM users WHERE id={user_id}"

# After (patched)
from sqlalchemy import text

query = text("SELECT * FROM users WHERE id=:user_id")
result = db.execute(query, {'user_id': user_id})
"""
```
## Dataflow Analysis

The agent supports advanced dataflow analysis from CodeQL:

```python
from pathlib import Path

from packages.llm_analysis.agent import VulnerabilityContext

# Vulnerability with dataflow
context = VulnerabilityContext(
    finding={
        "has_dataflow": True,
        "dataflow_path": {
            "source": {
                "file": "src/api/routes.py",
                "line": 23,
                "message": "User input from request.args"
            },
            "sink": {
                "file": "src/db/queries.py",
                "line": 45,
                "message": "SQL query execution"
            },
            "steps": [
                {"file": "src/api/routes.py", "line": 25},
                {"file": "src/api/validation.py", "line": 12},
                {"file": "src/db/queries.py", "line": 43}
            ]
        }
    },
    repo_path=Path("/path/to/code"),
)

# The agent analyzes the complete dataflow path from source to sink
analysis = agent.analyze_vulnerability(context)
```
## Integration

### With Static Analysis

```python
from pathlib import Path

from packages.static_analysis import main as scan_repo
from packages.llm_analysis import AutonomousSecurityAgentV2

# 1. Scan the repository (generates SARIF)
scan_repo()

# 2. Analyze findings with the LLM
agent = AutonomousSecurityAgentV2(
    repo_path=Path("/path/to/code"),
    out_dir=Path("out/analysis"),
)
results = agent.analyze_sarif(
    sarif_path=Path("out/combined.sarif"),
    generate_exploits=True,
    generate_patches=True,
)
```
### With CodeQL

```python
from pathlib import Path

from packages.codeql import CodeQLAgent
from packages.llm_analysis import AutonomousSecurityAgentV2

# 1. Run CodeQL
codeql = CodeQLAgent(repo_path=Path("/path/to/code"))
result = codeql.run()

# 2. Analyze CodeQL findings with the LLM
agent = AutonomousSecurityAgentV2(
    repo_path=Path("/path/to/code"),
    out_dir=Path("out/analysis"),
)
for sarif_file in result.sarif_files:
    agent.analyze_sarif(sarif_file, generate_exploits=True)
```
## Analysis Speed

- Per finding: 10-30 seconds (depends on model)
- Batch (10 findings): 3-5 minutes
- With exploits: +20-40 seconds per exploitable finding

## Cost Estimates

- Claude 3.7 Sonnet: ~$0.05-0.15 per finding
- GPT-4o: ~$0.03-0.10 per finding
- Local (Ollama): $0.00 (free, but slower)
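For budgeting, the per-finding ranges above multiply straightforwardly across a batch (simple arithmetic, assuming costs are additive per finding):

```python
# Estimated cost range for a 10-finding batch at the per-finding rates above.
findings = 10
claude = (0.05 * findings, 0.15 * findings)
gpt4o = (0.03 * findings, 0.10 * findings)

print(f"Claude 3.7 Sonnet: ${claude[0]:.2f}-${claude[1]:.2f}")  # $0.50-$1.50
print(f"GPT-4o: ${gpt4o[0]:.2f}-${gpt4o[1]:.2f}")  # $0.30-$1.00
```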
## Best Practices

- Start with `max_findings=10` for an initial assessment
- Enable exploit generation for critical findings only
- Use local models for cost-free experimentation
- Review patches before applying them (AI can make mistakes)
- Combine with dataflow analysis for the best results