Control-Flow Graph (CFG)

angr provides two complementary approaches to control-flow graph (CFG) recovery: CFGFast for fast static analysis and CFGEmulated for precise dynamic analysis through symbolic execution.

Overview

A Control-Flow Graph represents the structure of a program as a graph where:

Nodes are basic blocks (sequences of instructions with single entry/exit)
Edges represent control flow (jumps, calls, returns)
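
To make the node/edge definition concrete, here is a minimal sketch that splits a toy instruction list into basic blocks and connects them. The instruction format and mnemonics are invented for illustration; this is not angr's lifter or IR, only the general "leader" technique for finding block boundaries.

```python
# Toy program: (address, mnemonic, branch_target or None).
# Illustrative format only, not a real ISA and not angr's representation.
PROGRAM = [
    (0, "cmp", None),
    (1, "jz", 4),      # conditional: falls through to 2 or jumps to 4
    (2, "add", None),
    (3, "jmp", 5),     # unconditional jump
    (4, "sub", None),
    (5, "ret", None),
]


def build_cfg(program):
    """Split a linear instruction list into basic blocks and connect them."""
    addrs = [addr for addr, _, _ in program]
    # A block starts at the entry, at every jump target, and after every jump/ret.
    leaders = {addrs[0]}
    for i, (_, op, target) in enumerate(program):
        if op in ("jz", "jmp", "ret"):
            if target is not None:
                leaders.add(target)
            if i + 1 < len(program):
                leaders.add(addrs[i + 1])
    starts = sorted(leaders)

    # Group instructions into blocks keyed by their start address.
    blocks = {}
    for i, start in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else addrs[-1] + 1
        blocks[start] = [ins for ins in program if start <= ins[0] < end]

    # The last instruction of each block decides its outgoing edges.
    edges = set()
    for i, start in enumerate(starts):
        _, op, target = blocks[start][-1]
        fallthrough = starts[i + 1] if i + 1 < len(starts) else None
        if op in ("jmp", "jz"):
            edges.add((start, target))
        if op not in ("jmp", "ret") and fallthrough is not None:
            edges.add((start, fallthrough))
    return blocks, edges


blocks, edges = build_cfg(PROGRAM)
print(sorted(blocks))   # block start addresses
print(sorted(edges))    # (source block, destination block) pairs
```

Each block has a single entry (its leader) and a single exit (its last instruction), which is the property the definition above relies on.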

By default, angr also loads the shared libraries a binary depends on, which pulls library code into the analysis. For most CFG analyses, you should disable this:

p = angr.Project('/bin/ls', load_options={'auto_load_libs': False})

CFGFast: Static Analysis

CFGFast uses static analysis to quickly recover the control-flow graph without executing code.

Basic Usage

import angr

# Load the binary
p = angr.Project('/bin/ls', load_options={'auto_load_libs': False})

# Generate CFG
cfg = p.analyses.CFGFast()

print(f"Found {len(cfg.graph.nodes())} nodes")
print(f"Found {len(cfg.graph.edges())} edges")
print(f"Found {len(cfg.kb.functions)} functions")

Key Features

Function Detection

CFGFast identifies functions through:

Binary symbols (for non-stripped binaries)
Function prologues (architecture-specific patterns)
Call targets
Entry point analysis
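
As a rough illustration of prologue-based detection, the sketch below scans raw bytes for one common x86-64 frame-setup sequence (`push rbp; mov rbp, rsp`). The helper name `find_prologues` and the sample bytes are made up for this example; real prologue matching covers many patterns per architecture and is not limited to this one.

```python
# x86-64 encoding of `push rbp; mov rbp, rsp`, a common function prologue.
PROLOGUE = bytes.fromhex("554889e5")


def find_prologues(code, base_addr):
    """Return the addresses of every occurrence of PROLOGUE in `code`."""
    hits = []
    pos = code.find(PROLOGUE)
    while pos != -1:
        hits.append(base_addr + pos)
        pos = code.find(PROLOGUE, pos + 1)
    return hits


# Two tiny functions separated by `ret` (c3) and `nop` padding (90 90).
code = bytes.fromhex(
    "554889e5" "5d" "c3"         # push rbp; mov rbp, rsp; pop rbp; ret
    "9090"                       # padding
    "554889e5" "31c0" "5d" "c3"  # ...; xor eax, eax; pop rbp; ret
)
candidates = find_prologues(code, 0x401000)
print([hex(addr) for addr in candidates])
```

A byte-pattern match only yields candidates: the same bytes can appear inside data or in the middle of another instruction, which is one reason it is combined with the other sources listed above.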

Control Flow Recovery

Lifts basic blocks to VEX IR
Analyzes direct jumps and calls
Resolves indirect jumps using jump table analysis
Handles FakeRet edges for call returns

Iterative Refinement

Initial pass assumes all functions return
Second pass updates based on actual function behavior
Removes FakeRet edges for non-returning functions
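
The refinement step can be sketched on a toy edge list. The node names below are invented and the function is a simplified stand-in, not angr's implementation: every call site first gets an assumed-return (`Ijk_FakeRet`) edge, and those edges are dropped once the callee is known not to return.

```python
# Toy edge list: (source, destination, jumpkind). Names are made up.
edges = [
    ("main+0", "helper", "Ijk_Call"),
    ("main+0", "main+5", "Ijk_FakeRet"),   # assumed return from helper
    ("main+5", "exit",   "Ijk_Call"),
    ("main+5", "main+a", "Ijk_FakeRet"),   # assumed return from exit
]
non_returning = {"exit"}


def prune_fakerets(edges, non_returning):
    """Drop FakeRet edges that follow a call to a non-returning function."""
    # Call sites whose callee is known never to return.
    dead_callsites = {
        src for src, dst, kind in edges
        if kind == "Ijk_Call" and dst in non_returning
    }
    return [
        edge for edge in edges
        if not (edge[2] == "Ijk_FakeRet" and edge[0] in dead_callsites)
    ]


pruned = prune_fakerets(edges, non_returning)
for src, dst, kind in pruned:
    print(f"{src} --[{kind}]--> {dst}")
```

After pruning, the block following the call to `exit` is no longer reachable from that call site, so it does not appear as a false successor.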

Configuration Options

cfg = p.analyses.CFGFast(
    # Start analysis from specific addresses
    function_starts=[0x400000, 0x400100],
    
    # Force complete binary scan (disable for blobs)
    force_complete_scan=True,
    
    # Normalize the CFG (each block belongs to one function)
    normalize=False,
    
    # Resolve indirect jumps
    resolve_indirect_jumps=True,
    
    # Collect data references
    data_references=True,
    
    # Cross-references analysis
    cross_references=True,
)

Accessing CFG Data

# Get the NetworkX graph
graph = cfg.graph

# Get any node at an address
node = cfg.model.get_any_node(0x400000)

# Get all nodes at an address (multiple contexts)
nodes = cfg.model.get_all_nodes(0x400000)

# Iterate over nodes
for node in graph.nodes():
    print(f"Block at {node.addr:#x}, size {node.size}")

# Get successors and predecessors
for succ in graph.successors(node):
    print(f"Successor: {succ.addr:#x}")

for pred in graph.predecessors(node):
    print(f"Predecessor: {pred.addr:#x}")

# Get edges with jump kinds
for src, dst, data in graph.edges(data=True):
    jumpkind = data.get('jumpkind', 'Ijk_Boring')
    print(f"{src.addr:#x} --[{jumpkind}]--> {dst.addr:#x}")
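
A small helper can summarize the edge kinds in a graph. This is a sketch: `count_jumpkinds` is a hypothetical name, and the sample triples stand in for real CFG edges so the snippet runs without a binary. It accepts any iterable of `(src, dst, data)` triples, which is the shape `graph.edges(data=True)` yields.

```python
from collections import Counter


def count_jumpkinds(edges):
    """Tally edges by their 'jumpkind' attribute.

    `edges` is any iterable of (src, dst, data) triples, the shape produced
    by `graph.edges(data=True)` on a NetworkX graph.
    """
    return Counter(data.get("jumpkind", "Ijk_Boring") for _, _, data in edges)


# Stand-in triples; with a real CFG you would pass cfg.graph.edges(data=True).
sample = [
    (0x1000, 0x1010, {"jumpkind": "Ijk_Boring"}),
    (0x1010, 0x2000, {"jumpkind": "Ijk_Call"}),
    (0x1010, 0x1015, {"jumpkind": "Ijk_FakeRet"}),
    (0x1015, 0x1020, {}),  # missing attribute falls back to Ijk_Boring
]
counts = count_jumpkinds(sample)
for kind, total in counts.most_common():
    print(f"{kind}: {total}")
```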

CFGEmulated: Dynamic Analysis

CFGEmulated uses symbolic execution to build a more accurate CFG by emulating the program's execution instead of only inspecting its code.

Basic Usage

# Generate dynamic CFG
cfg = p.analyses.CFGEmulated(
    # Keep states at each node (memory intensive)
    keep_state=True,
    
    # Context sensitivity level (0-infinity)
    context_sensitivity_level=1,
    
    # Starting addresses
    starts=[p.entry],
)

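# Access states if keep_state=True.
# Assumption: CFGEmulated nodes expose the saved state as `input_state`;
# getattr keeps the loop safe if your version names it differently.
for node in cfg.graph.nodes():
    state = getattr(node, 'input_state', None)
    if state is not None:
        print(f"State at {node.addr:#x}: {state}")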

Context Sensitivity

Context sensitivity determines how many call stack frames to track:

Level 0 (callee-only): each function is analyzed once.

cfg = p.analyses.CFGEmulated(context_sensitivity_level=0)
# Fast but less precise

Level 1 (one caller + callee): distinguishes different calling contexts.

cfg = p.analyses.CFGEmulated(context_sensitivity_level=1)
# Default: good balance of speed/precision

Level 2+ (multiple callers): tracks deeper call chains.

cfg = p.analyses.CFGEmulated(context_sensitivity_level=2)
# More precise but exponentially slower
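
The effect of the level can be illustrated with a toy model in which a node is identified by its block plus the last `level` call sites on the stack. The call-site labels and helper names are hypothetical, and this is a conceptual sketch rather than how angr stores its call-stack keys.

```python
def context_key(call_stack, block, level):
    """Identify a node by its block plus the last `level` call sites."""
    callers = tuple(call_stack[-level:]) if level > 0 else ()
    return (callers, block)


# Hypothetical visits: `helper` is reached from two call sites in `main`
# and twice through `wrapper`, which is itself called from two places.
visits = [
    (["main@0x10"], "helper"),
    (["main@0x20"], "helper"),
    (["main@0x30", "wrapper@0x5"], "helper"),
    (["main@0x40", "wrapper@0x5"], "helper"),
]


def distinct_nodes(level):
    """Count how many separate graph nodes `helper` produces at this level."""
    return len({context_key(stack, block, level) for stack, block in visits})


for level in (0, 1, 2):
    print(f"level {level}: {distinct_nodes(level)} node(s) for helper")
```

Higher levels split the same block into more nodes, which is where both the added precision and the added cost come from.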

Advanced Options

cfg = p.analyses.CFGEmulated(
    # Limit analysis depth
    call_depth=5,
    
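    # Filter which calls to trace (my_filter_func is a placeholder for a callable you define)
    call_tracing_filter=my_filter_func,
    
    # Custom initial state (my_state is a placeholder, e.g. a state from p.factory.blank_state())
    initial_state=my_state,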
    
    # Addresses to avoid
    avoid_runs=[0x400500, 0x400600],
    
    # Add state options
    state_add_options={angr.options.TRACK_MEMORY_ACTIONS},
    
    # Resolve indirect jumps
    resolve_indirect_jumps=True,
    
    # Normalize the result
    normalize=True,
)

Function Manager

Both CFG analyses populate the Function Manager in the knowledge base:

# Access functions
funcs = cfg.kb.functions

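# Get function by address. p.entry is the program entry point (typically
# _start, not the C-level main), so `main` here is only a variable name.
main = funcs[p.entry]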

# Function properties
print(f"Name: {main.name}")
print(f"Returns: {main.returning}")
print(f"Blocks: {len(main.block_addrs)}")
print(f"Calls: {len(main.get_call_sites())}")

# String references
for addr, string in main.string_references():
    print(f"String at {addr:#x}: {string}")

# Iterate blocks
for block in main.blocks:
    print(f"Block at {block.addr:#x}:")
    block.pp()  # Pretty print

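# Get call targets (get_call_target may return None for unresolved calls)
for callsite in main.get_call_sites():
    target = main.get_call_target(callsite)
    target_str = f"{target:#x}" if target is not None else "unresolved"
    print(f"Call at {callsite:#x} -> {target_str}")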

# Function transition graph
func_graph = main.transition_graph
for node in func_graph.nodes():
    print(f"Node in function: {node.addr:#x}")

CFGNode Properties

node = cfg.model.get_any_node(0x400000)

# Basic properties
print(f"Address: {node.addr:#x}")
print(f"Size: {node.size}")
print(f"Name: {node.name}")
print(f"Function: {node.function_address:#x}")

# Graph navigation
print(f"Predecessors: {node.predecessors}")
print(f"Successors: {node.successors}")

# Block access
if node.block is not None:
    node.block.pp()

# For CFGEmulated with keep_state=True (attribute name assumed to be `input_state`)
state = getattr(node, 'input_state', None)
if state is not None:
    print(f"State: {state}")

Indirect Jump Resolution

angr includes resolvers for common indirect jump patterns:

# CFGFast automatically resolves jump tables
cfg = p.analyses.CFGFast(resolve_indirect_jumps=True)

# Access resolved targets
for addr, jump in cfg.indirect_jumps.items():
    print(f"Indirect jump at {addr:#x}")
    print(f"Type: {jump.type}")
    if jump.resolved_targets:
        print(f"Targets: {[hex(t) for t in jump.resolved_targets]}")
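
To show what resolving a jump table involves at the lowest level, the sketch below decodes table entries from raw bytes. The function `read_jump_table`, the table contents, and the anchor address are all assumptions made for illustration; angr's resolvers also have to discover the table address, size, and entry format, which this example takes as given.

```python
import struct


def read_jump_table(memory, table_offset, num_entries, entry_size=4, base=0):
    """Decode `num_entries` little-endian entries and add `base` to each.

    base=0 models a table of absolute addresses; a non-zero base models
    tables that store offsets relative to an anchor address.
    """
    fmt = {4: "<i", 8: "<q"}[entry_size]  # signed, so negative offsets work
    targets = []
    for i in range(num_entries):
        (value,) = struct.unpack_from(fmt, memory, table_offset + i * entry_size)
        targets.append(base + value)
    return targets


# Three signed 32-bit offsets relative to a made-up anchor address 0x402000.
table = struct.pack("<3i", -0x100, -0x80, 0x40)
targets = read_jump_table(table, 0, 3, entry_size=4, base=0x402000)
print([hex(t) for t in targets])
```

Each decoded target becomes a successor of the block that ends in the indirect jump.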

Comparing CFGFast vs CFGEmulated

Feature	CFGFast	CFGEmulated
Speed	⚡ Very Fast (seconds)	🐌 Slow (minutes to hours)
Accuracy	Good for most binaries	Higher (with caveats)
Indirect Jumps	Jump table analysis	Symbolic execution
State Tracking	❌ No	✅ Optional
Best For	Initial analysis, stripped binaries	Precise analysis, small functions

Recommendation: Start with CFGFast for most use cases. Only use CFGEmulated when you need:

State information at each basic block
Very precise control flow for a specific function
Analysis of self-modifying code

Example: Full CFG Analysis

import angr
import networkx as nx

# Load binary
p = angr.Project('/bin/ls', load_options={'auto_load_libs': False})

# Generate CFG
cfg = p.analyses.CFGFast(normalize=True)

print("=== CFG Statistics ===")
print(f"Nodes: {len(cfg.graph.nodes())}")
print(f"Edges: {len(cfg.graph.edges())}")
print(f"Functions: {len(cfg.kb.functions)}")

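# Analyze main function. Stripped binaries have no 'main' symbol, so fall
# back to the function at the entry point.
main = cfg.kb.functions.function(name='main') or cfg.kb.functions[p.entry]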
print(f"\n=== Main Function ===")
print(f"Address: {main.addr:#x}")
print(f"Blocks: {len(main.block_addrs)}")
print(f"Returns: {main.returning}")

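# Find the longest acyclic path through main. Enumerating simple paths is
# exponential and needs a concrete target, so collapse loops into strongly
# connected components and take the longest chain in the resulting DAG.
dag = nx.condensation(main.graph)
longest = nx.dag_longest_path(dag)
print(f"Longest acyclic path: {len(longest)} components")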

# Find indirect jumps
print(f"\n=== Indirect Jumps ===")
for addr, jump in cfg.indirect_jumps.items():
    func = cfg.kb.functions.floor_func(addr)
    print(f"At {addr:#x} in {func.name if func else 'unknown'}")
    if jump.resolved_targets:
        print(f"  Targets: {[hex(t) for t in jump.resolved_targets]}")
    else:
        print("  Unresolved")

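# Export to other formats. CFG nodes and edge attributes are angr objects,
# so copy the graph into plain strings first (a conservative choice that
# avoids relying on how the GEXF writer serializes them).
export = nx.DiGraph()
for src, dst, data in cfg.graph.edges(data=True):
    export.add_edge(hex(src.addr), hex(dst.addr), jumpkind=str(data.get('jumpkind')))
nx.write_gexf(export, "cfg.gexf")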

Next Steps

Data Flow Analysis

Decompiler
