Overview
The /codeql command performs deep static analysis using GitHub’s CodeQL engine, with dataflow and taint-tracking validation. It is slower than Semgrep but finds complex vulnerabilities that pattern-based scanners miss.
Syntax
Parameters
Absolute path to the code repository to analyze
Programming language (auto-detected if not specified)
Maximum number of findings to report (default: unlimited)
What It Does
- Creates CodeQL database from source code
- Runs security and quality queries
- Performs dataflow and taint analysis
- Validates source-to-sink paths
- Generates SARIF output with detailed findings
- Saves results to the out/ directory
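The steps above roughly correspond to two CodeQL CLI invocations: `codeql database create` followed by `codeql database analyze`. The sketch below only builds the argument lists and executes nothing. Whether /codeql runs these exact commands is an assumption, and the helper name `codeql_commands` and the paths `out/codeql-db` and `out/results.sarif` are hypothetical.

```python
from pathlib import Path


def codeql_commands(repo: str, language: str, out_dir: str = "out") -> list[list[str]]:
    """Build argv lists for a create-then-analyze CodeQL run (nothing is executed)."""
    out = Path(out_dir)
    db = (out / "codeql-db").as_posix()         # hypothetical database location
    sarif = (out / "results.sarif").as_posix()  # hypothetical SARIF file name

    # Step 1: extract the source tree into a CodeQL database.
    create = [
        "codeql", "database", "create", db,
        f"--language={language}",
        f"--source-root={repo}",
    ]
    # Step 2: run queries against the database and write SARIF.
    analyze = [
        "codeql", "database", "analyze", db,
        "--format=sarif-latest",
        f"--output={sarif}",
    ]
    return [create, analyze]


if __name__ == "__main__":
    for argv in codeql_commands("/path/to/repo", "python"):
        print(" ".join(argv))
```

Each list could be passed to `subprocess.run(argv, check=True)` on a machine where the CodeQL CLI is installed.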
When to Use CodeQL
Use CodeQL When:
- Looking for complex dataflow vulnerabilities
- Need to trace data from source to sink
- Analyzing security-critical codebases
- Semgrep produces too many false positives
- Need high-confidence findings
Use Semgrep When:
- Need fast results
- Checking for common patterns
- Running in CI/CD pipelines
- Performing quick audits
Examples
Basic CodeQL Analysis
Specific Language
Limited Findings
Supported Languages
- C/C++: Buffer overflows, use-after-free, format strings
- Java: SQL injection, XSS, deserialization
- JavaScript/TypeScript: Prototype pollution, code injection
- Python: Command injection, path traversal, SQL injection
- C#: LDAP injection, XXE, insecure deserialization
- Go: SQL injection, command injection, path traversal
- Ruby: Code injection, SQL injection, SSRF
Vulnerability Classes Detected
Injection Vulnerabilities
- SQL injection
- Command injection
- LDAP injection
- XPath injection
- Code injection
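As an illustration of the injection class, the following minimal sketch shows a SQL injection source-to-sink path and its parameterized fix, using Python's standard `sqlite3` module. The table, function names, and payload are invented for the example and are not taken from any CodeQL query.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Sink: untrusted input is concatenated into the SQL text.
    query = "SELECT id FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Parameterized query: the input is bound as data, never parsed as SQL.
    return conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "x' OR '1'='1"  # attacker-controlled value (the source)
    print(len(find_user_unsafe(conn, payload)))  # matches every row
    print(len(find_user_safe(conn, payload)))    # matches nothing
```

A taint-tracking query would be expected to flag `find_user_unsafe`, where the input reaches the query string, and not `find_user_safe`.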
Dataflow Issues
- Tainted path traversal
- Server-side request forgery (SSRF)
- Cross-site scripting (XSS)
- XML external entity (XXE)
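Each issue above comes down to untrusted data reaching a sensitive operation without passing through a sanitizer. The toy sketch below models that idea as reachability over a hand-written dataflow graph. It is not how CodeQL is implemented, and the node names and the `tainted_paths` helper are hypothetical.

```python
from collections import deque


def tainted_paths(edges, sources, sinks, sanitizers=frozenset()):
    """Return one shortest path per source for every sink it reaches unsanitized."""
    paths = []
    for source in sources:
        parent = {source: None}  # also serves as the visited set
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if node in sanitizers:
                continue  # taint does not flow past a sanitizer
            if node in sinks:
                # Walk parent links back to the source to rebuild the path.
                path, cur = [], node
                while cur is not None:
                    path.append(cur)
                    cur = parent[cur]
                paths.append(path[::-1])
                continue
            for nxt in edges.get(node, ()):
                if nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)
    return paths


if __name__ == "__main__":
    edges = {
        "request.args['q']": ["query"],
        "query": ["cursor.execute"],
        "request.args['id']": ["int(id)"],
        "int(id)": ["cursor.execute"],
    }
    found = tainted_paths(
        edges,
        sources=["request.args['q']", "request.args['id']"],
        sinks={"cursor.execute"},
        sanitizers={"int(id)"},
    )
    for path in found:
        print(" -> ".join(path))
```

Only the path through `query` is reported; the path through `int(id)` is cut off because that node is treated as a sanitizer.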
Memory Safety
- Buffer overflows
- Use-after-free
- Double free
- Memory leaks
Cryptographic Issues
- Weak encryption algorithms
- Insecure random number generation
- Hard-coded credentials
Output Structure
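The exact layout of the output directory is not reproduced here. Because findings are written as SARIF, a generic reader can be sketched from the SARIF 2.1.0 structure (`runs[].results[]`, with optional `locations` and `codeFlows`). The `summarize_sarif` helper and its `max_findings` parameter are hypothetical and only mirror the idea of capping the number of reported findings.

```python
import json
from typing import Optional


def summarize_sarif(sarif_text: str, max_findings: Optional[int] = None) -> list[dict]:
    """Flatten SARIF results into simple dicts, optionally capped at max_findings."""
    doc = json.loads(sarif_text)
    findings = []
    for run in doc.get("runs", []):
        for result in run.get("results", []):
            # A result may have no locations; fall back to an empty one.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            findings.append({
                "rule": result.get("ruleId"),
                "message": result.get("message", {}).get("text"),
                "file": physical.get("artifactLocation", {}).get("uri"),
                "line": physical.get("region", {}).get("startLine"),
                # codeFlows hold the source-to-sink path when a query provides one
                "has_path": bool(result.get("codeFlows")),
            })
    return findings if max_findings is None else findings[:max_findings]
```

Calling `summarize_sarif(Path("results.sarif").read_text(), max_findings=10)` on a real SARIF file (file name assumed) would return at most ten flattened findings.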
Performance Characteristics
| Metric | CodeQL | Semgrep |
|---|---|---|
| Speed | Slow (5-30 min) | Fast (30-120 sec) |
| Accuracy | High | Medium |
| False Positives | Low | Higher |
| Dataflow Analysis | Yes | Limited |
| Database Size | Large (GBs) | None |
Use Cases
- Security-critical application audits
- Finding complex vulnerabilities
- Validating Semgrep findings
- Research on dataflow vulnerabilities
- High-assurance security reviews
Advanced Features
Dataflow Analysis
CodeQL tracks data from sources (user input) to sinks (dangerous operations).
Custom Queries
CodeQL supports custom security queries for domain-specific checks.
Related Commands
- /scan: Fast Semgrep scanning
- /agentic: Full workflow with both Semgrep and CodeQL
- /validate: Validate the exploitability of findings
- /analyze: LLM analysis of CodeQL results
Notes
- CodeQL analysis is slower but finds complex issues
- Requires significant disk space for databases
- Best used for thorough security audits
- Combines well with Semgrep in /agentic mode
- Results are high-confidence with a low false-positive rate