/codeql

Overview

The /codeql command performs deep static analysis using GitHub’s CodeQL engine, applying dataflow analysis and taint tracking to validate findings. It is slower than Semgrep but finds complex vulnerabilities that pattern-based scanners miss.

Syntax

python3 raptor.py codeql --repo <path> [options]

Parameters

repo (string, required): Absolute path to the code repository to analyze
language (string): Programming language (auto-detected if not specified)
max-findings (integer): Maximum number of findings to report (default: unlimited)

What It Does

Creates a CodeQL database from the source code
Runs security and quality queries
Performs dataflow and taint analysis
Validates source-to-sink paths
Generates SARIF output with detailed findings
Saves results to the out/ directory
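
Because the findings are written as SARIF, they can be post-processed with ordinary JSON tooling. The following is a minimal sketch, assuming the file follows the standard SARIF 2.1.0 layout (runs, results, ruleId, locations); the helper names summarize_sarif and first_location are hypothetical and not part of the tool.

```python
import json
from collections import Counter


def summarize_sarif(sarif_text: str) -> Counter:
    """Count findings per rule ID in a SARIF 2.1.0 document."""
    sarif = json.loads(sarif_text)
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "<unknown>")] += 1
    return counts


def first_location(result: dict) -> str:
    """Return 'file:line' for a result's first location, if it has one."""
    locations = result.get("locations") or []
    if not locations:
        return "<no location>"
    physical = locations[0].get("physicalLocation", {})
    uri = physical.get("artifactLocation", {}).get("uri", "<unknown>")
    line = physical.get("region", {}).get("startLine", "?")
    return f"{uri}:{line}"
```

Passing the contents of findings.sarif to summarize_sarif gives a quick per-rule breakdown before opening the full report.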

When to Use CodeQL

Use CodeQL When:

Looking for complex dataflow vulnerabilities
Need to trace data from source to sink
Analyzing security-critical codebases
Semgrep produces too many false positives
Need high-confidence findings

Use Semgrep When:

Need fast results
Checking for common patterns
Running in CI/CD pipelines
Performing quick audits

Examples

Basic CodeQL Analysis

python3 raptor.py codeql --repo /path/to/code

Runs full CodeQL analysis with dataflow validation.

Specific Language

python3 raptor.py codeql --repo /path/to/code --language python

Analyzes Python code only.

Limited Findings

python3 raptor.py codeql --repo /path/to/code --max-findings 20

Reports only the first 20 findings.
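
The invocations above can also be assembled from a script. This is a sketch only: it uses just the flags documented on this page, assumes raptor.py is in the current working directory, and the function names build_codeql_command and run_codeql are hypothetical.

```python
import os
import subprocess


def build_codeql_command(repo, language=None, max_findings=None):
    """Build the argument list for `raptor.py codeql` from the documented flags."""
    if not os.path.isabs(repo):
        raise ValueError("--repo expects an absolute path")
    cmd = ["python3", "raptor.py", "codeql", "--repo", repo]
    if language is not None:
        cmd += ["--language", language]
    if max_findings is not None:
        if max_findings <= 0:
            raise ValueError("--max-findings must be a positive integer")
        cmd += ["--max-findings", str(max_findings)]
    return cmd


def run_codeql(repo, **options):
    """Run the analysis and return the process exit code."""
    completed = subprocess.run(build_codeql_command(repo, **options), check=False)
    return completed.returncode
```

Building the argument list separately from running it keeps the flag handling easy to check without starting a full analysis.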

Supported Languages

C/C++: Buffer overflows, use-after-free, format strings
Java: SQL injection, XSS, deserialization
JavaScript/TypeScript: Prototype pollution, code injection
Python: Command injection, path traversal, SQL injection
C#: LDAP injection, XXE, insecure deserialization
Go: SQL injection, command injection, path traversal
Ruby: Code injection, SQL injection, SSRF

Vulnerability Classes Detected

Injection Vulnerabilities

SQL injection
Command injection
LDAP injection
XPath injection
Code injection

Dataflow Issues

Tainted path traversal
Server-side request forgery (SSRF)
Cross-site scripting (XSS)
XML external entity (XXE)

Memory Safety

Buffer overflows
Use-after-free
Double free
Memory leaks

Cryptographic Issues

Weak encryption algorithms
Insecure random number generation
Hard-coded credentials

Output Structure

out/codeql_<timestamp>/
├── database/              # CodeQL database
├── findings.sarif        # SARIF format results
├── report.md             # Human-readable report
└── dataflow-paths.json   # Source-to-sink traces
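
A script that consumes these results first has to find the right run directory. The sketch below picks the most recently modified out/codeql_* directory and reports which of the documented files are missing. The timestamp format is not specified here, so it sorts by modification time rather than parsing the name; the helper names are hypothetical.

```python
from pathlib import Path

# File names taken from the output structure shown above.
EXPECTED_FILES = ("findings.sarif", "report.md", "dataflow-paths.json")


def latest_codeql_run(out_dir="out"):
    """Return the most recently modified codeql_* directory, or None."""
    runs = [p for p in Path(out_dir).glob("codeql_*") if p.is_dir()]
    return max(runs, key=lambda p: p.stat().st_mtime, default=None)


def missing_outputs(run_dir):
    """List the documented output files that are absent from run_dir."""
    return [name for name in EXPECTED_FILES if not (Path(run_dir) / name).is_file()]
```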

Performance Characteristics

Metric             CodeQL            Semgrep
Speed              Slow (5-30 min)   Fast (30-120 sec)
Accuracy           High              Medium
False Positives    Low               Higher
Dataflow Analysis  Yes               Limited
Database Size      Large (GBs)       None

Use Cases

Security-critical application audits
Finding complex vulnerabilities
Validating Semgrep findings
Research on dataflow vulnerabilities
High-assurance security reviews

Advanced Features

Dataflow Analysis

CodeQL tracks data from sources (user input) to sinks (dangerous operations):

import os

def read_user_file(request):
    # CodeQL can trace this flow:
    user_input = request.GET['file']  # Source: attacker-controlled query parameter
    path = os.path.join('/data', user_input)  # Taint propagates into the path
    with open(path) as f:  # Sink: path traversal detected
        return f.read()
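
For contrast, here is a sketch of how such a flow is commonly remediated: resolve the requested path and reject it if it escapes the base directory. The function name resolve_under_base is hypothetical, and whether a given CodeQL query recognizes this check as a sanitizer depends on the query, which this page does not specify.

```python
import os

BASE_DIR = "/data"


def resolve_under_base(user_input, base_dir=BASE_DIR):
    """Resolve user_input under base_dir, rejecting paths that escape it."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and follows symlinks before the check.
    candidate = os.path.realpath(os.path.join(base, user_input))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the base directory")
    return candidate
```

Opening only the path returned by resolve_under_base means that inputs such as ../etc/passwd or an absolute path raise an error instead of reaching the file system.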

Custom Queries

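CodeQL supports custom security queries for domain-specific checks. The query below is a sketch that flags calls to eval whose first argument is reached by remote user input; it assumes a recent CodeQL Python library that provides the modular dataflow API (DataFlow::ConfigSig, TaintTracking::Global):

import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking
import semmle.python.dataflow.new.RemoteFlowSources
import semmle.python.ApiGraphs

module EvalConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource } // remote user input
  predicate isSink(DataFlow::Node sink) { sink = API::builtin("eval").getACall().getArg(0) } // first argument of eval()
}
module EvalFlow = TaintTracking::Global<EvalConfig>;

from DataFlow::Node source, DataFlow::Node sink
where EvalFlow::flow(source, sink)
select sink, "Dangerous eval with user input"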

Related Commands

/scan: Fast Semgrep scanning
/agentic: Full workflow with both Semgrep and CodeQL
/validate: Validate the exploitability of findings
/analyze: LLM analysis of CodeQL results

Notes

CodeQL analysis is slower than Semgrep but finds complex issues
Requires significant disk space for databases
Best used for thorough security audits
Combines well with Semgrep in /agentic mode
Results are high-confidence, with a low false-positive rate
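
Since databases can reach gigabytes, a pre-flight disk check before a long run may save time. This is a minimal sketch; the helper name has_free_space and whatever threshold you pass to it are assumptions, not values taken from the tool.

```python
import shutil


def has_free_space(path, required_gb):
    """Return True if the file system holding path has at least required_gb free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3
```

For example, a wrapper script could call has_free_space on the directory that will hold out/ and skip the analysis when the check fails.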
