The ReachingDefinitionsAnalysis (RDA) is a textbook implementation of static data-flow analysis that operates on functions or basic blocks. It supports both VEX and AIL (angr Intermediate Language). By registering observers at observation points, you can use RDA to generate:
  • Use-def chains
  • Def-use chains
  • Reaching definitions
  • Liveness analysis
  • Other traditional data-flow analyses
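To make the terminology concrete, the following is a minimal, self-contained sketch of the classic reaching-definitions fixed-point computation on a toy control-flow graph. It does not use angr and is not angr's implementation: the block names, definition ids, and the `reaching_definitions` helper are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Toy program: each block is a list of (definition_id, variable) pairs.
# This is an illustrative data model, not angr's internal representation.
BLOCKS = {
    "entry": [("d1", "x"), ("d2", "y")],
    "then": [("d3", "x")],
    "else": [("d4", "y")],
    "exit": [],
}
EDGES = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}


def reaching_definitions(blocks, edges, max_iterations=30):
    """Return {block: set of definition ids that reach the block's entry}."""
    var_of = {d: v for defs in blocks.values() for d, v in defs}
    preds = defaultdict(list)
    for src, dsts in edges.items():
        for dst in dsts:
            preds[dst].append(src)

    def transfer(block, in_set):
        out = set(in_set)
        for def_id, var in blocks[block]:
            # A new definition kills earlier definitions of the same variable.
            out = {d for d in out if var_of[d] != var}
            out.add(def_id)
        return out

    in_sets = {b: set() for b in blocks}
    out_sets = {b: transfer(b, set()) for b in blocks}
    for _ in range(max_iterations):
        changed = False
        for block in blocks:
            # Merge: a definition reaches a block if it leaves any predecessor.
            new_in = set().union(*(out_sets[p] for p in preds[block]))
            new_out = transfer(block, new_in)
            if new_in != in_sets[block] or new_out != out_sets[block]:
                in_sets[block], out_sets[block] = new_in, new_out
                changed = True
        if not changed:
            break  # fixed point reached before the iteration cap
    return in_sets


reaching = reaching_definitions(BLOCKS, EDGES)
print(sorted(reaching["exit"]))  # ['d1', 'd2', 'd3', 'd4']
```

Both definitions of `x` and both definitions of `y` reach `exit` because each branch overwrites only one variable. A use-def chain for a use of `x` in `exit` would therefore point at `d1` and `d3`.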

Constructor

ReachingDefinitionsAnalysis(
    subject,
    func_graph=None,
    max_iterations=30,
    track_tmps=False,
    track_consts=True,
    observation_points=None,
    init_state=None,
    cc=None,
    function_handler=None,
    dep_graph=True,
    observe_all=False,
    canonical_size=8,
    track_liveness=True,
    element_limit=5,
    merge_into_tops=True,
    **kwargs
)
subject
Subject | Function | Block
required
The subject of analysis - a function or a single basic block. Can also be a Subject wrapper object.
func_graph
networkx.DiGraph | None
Alternative graph to use instead of function.graph.
max_iterations
int
default:"30"
Maximum number of iterations before the analysis is terminated.
track_tmps
bool
default:"False"
Whether temporary variables should be tracked during analysis.
track_consts
bool
default:"True"
Whether constant values should be tracked.
observation_points
Iterable[ObservationPoint] | None
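Collection of tuples defining where reaching definitions should be captured. Format: ("node"|"insn"|"stmt"|"exit", identifier, OP_BEFORE|OP_AFTER). The identifier is an address or a tuple, depending on the type; see Observation Point Types.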
init_state
ReachingDefinitionsState | None
Optional initialization state. If provided, the analysis works on a copy.
cc
SimCC | None
Calling convention of the function.
function_handler
FunctionHandler | None
Function handler to update analysis state on function calls. If None, a default handler is created.
dep_graph
DepGraph | bool | None
default:"True"
  • True: Generate a dependency graph
  • False or None: Skip dependency graph generation
  • DepGraph instance: Use existing graph
observe_all
bool
default:"False"
Observe every statement, both before and after execution.
canonical_size
int
default:"8"
Size in bytes that objects with UNKNOWN_SIZE are treated as.
track_liveness
bool
default:"True"
Whether to track liveness information. Can consume significant RAM on large functions.
element_limit
int
default:"5"
Maximum number of elements in definition sets before merging to TOP.
merge_into_tops
bool
default:"True"
Merge known values into TOP if TOP is present.
  • True: {TOP} ∨ {0xabc} = {TOP}
  • False: {TOP} ∨ {0xabc} = {TOP, 0xabc}
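The interaction between merge_into_tops and element_limit can be sketched with plain Python sets. This is an illustrative model only: `TOP` here is a string stand-in, `merge_values` is a hypothetical helper, and the exact boundary at which angr collapses a set to TOP is an assumption (this sketch collapses when the merged set exceeds the limit).

```python
TOP = "TOP"  # stand-in for the analysis' "unknown value" marker (assumption)


def merge_values(a, b, merge_into_tops=True, element_limit=5):
    """Join two value sets the way the two parameters above describe."""
    merged = set(a) | set(b)
    if merge_into_tops and TOP in merged:
        # Known values are absorbed as soon as TOP is present.
        return {TOP}
    if len(merged) > element_limit:
        # Too many candidates: give up precision and fall back to TOP.
        return {TOP}
    return merged


print(merge_values({TOP}, {0xABC}) == {TOP})                                # True
print(merge_values({TOP}, {0xABC}, merge_into_tops=False) == {TOP, 0xABC})  # True
print(merge_values({1, 2, 3}, {4, 5, 6}) == {TOP})                          # True
```

Keeping known values next to TOP (merge_into_tops=False) preserves more information but lets sets grow toward element_limit.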

Advanced Parameters

interfunction_level
int
default:"0"
Number of functions to recurse into for interprocedural analysis. Only used if function_handler is not provided.
use_callee_saved_regs_at_return
bool
default:"True"
Whether to use callee-saved registers at function returns.

Properties

observed_results
dict[ObservationPoint, LiveDefinitions]
Dictionary mapping observation points to their corresponding LiveDefinitions at that point.
all_definitions
set
Set of all definitions encountered during analysis.
all_uses
set
Set of all uses encountered during analysis.
dep_graph
DepGraph
The dependency graph, available when dep_graph was True or an existing DepGraph instance was passed in.
model
ReachingDefinitionsModel
The model containing all analysis results and live definitions.
one_result
LiveDefinitions
Convenience property to get a single result when only one observation point exists. Raises ValueError if zero or multiple results exist.
visited_blocks
set
Set of all blocks visited during analysis.
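Because observed_results is keyed by observation-point tuples, ordinary dictionary code is enough to slice it. The sketch below groups entries by observation type. `group_by_observation_type` is a hypothetical helper, and the sample dictionary uses placeholder strings and the integers 0 and 1 in place of real LiveDefinitions objects and the OP_BEFORE/OP_AFTER constants.

```python
from collections import defaultdict


def group_by_observation_type(observed_results):
    """Split {(type, identifier, op_type): live_defs} into one dict per type."""
    grouped = defaultdict(dict)
    for (obs_type, identifier, op_type), live_defs in observed_results.items():
        grouped[obs_type][(identifier, op_type)] = live_defs
    return dict(grouped)


# Placeholder data shaped like observed_results (values are not real results).
sample = {
    ("node", 0x401000, 0): "defs-A",
    ("insn", 0x401234, 1): "defs-B",
    ("stmt", (0x401000, None, 5), 0): "defs-C",
}

grouped = group_by_observation_type(sample)
print(sorted(grouped))  # ['insn', 'node', 'stmt']
```

With a real analysis object, the same helper could be called as `group_by_observation_type(rda.observed_results)`.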

Methods

get_reaching_definitions_by_insn(ins_addr, op_type)

Get reaching definitions at a specific instruction.
ins_addr
int
required
The instruction address.
op_type
OP_BEFORE | OP_AFTER
required
Whether to get definitions before or after the instruction.
Returns: LiveDefinitions object. Raises: KeyError if the observation point was not registered.
from angr.knowledge_plugins.key_definitions.constants import OP_BEFORE, OP_AFTER

rda = project.analyses.ReachingDefinitions(
    func,
    observation_points=[('insn', 0x401234, OP_AFTER)]
)

live_defs = rda.get_reaching_definitions_by_insn(0x401234, OP_AFTER)

get_reaching_definitions_by_node(node_addr, op_type)

Get reaching definitions at a CFG node.
node_addr
int
required
The node address.
op_type
OP_BEFORE | OP_AFTER
required
Whether to get definitions before or after the node.
Returns: LiveDefinitions object.
live_defs = rda.get_reaching_definitions_by_node(0x401000, OP_BEFORE)

Example Usage

Basic Analysis

import angr

project = angr.Project('/bin/true')
cfg = project.analyses.CFGFast()

# Analyze main function
main_func = project.kb.functions['main']
rda = project.analyses.ReachingDefinitions(subject=main_func)

print(f"Visited {len(rda.visited_blocks)} blocks")
print(f"Found {len(rda.all_definitions)} definitions")
print(f"Found {len(rda.all_uses)} uses")

Observation Points

from angr.knowledge_plugins.key_definitions.constants import OP_BEFORE, OP_AFTER

# Define observation points
obs_points = [
    ('node', 0x401000, OP_BEFORE),   # Before first block
    ('insn', 0x401234, OP_AFTER),    # After specific instruction
    ('node', 0x401500, OP_AFTER),    # After another block
]

rda = project.analyses.ReachingDefinitions(
    subject=main_func,
    observation_points=obs_points
)

# Access observed definitions
for obs_point, live_defs in rda.observed_results.items():
    print(f"At {obs_point}:")
    print(f"  Register definitions: {len(live_defs.register_definitions)}")
    print(f"  Stack definitions: {len(live_defs.stack_definitions)}")

Observe All Statements

# Observe every statement
rda = project.analyses.ReachingDefinitions(
    subject=main_func,
    observe_all=True
)

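# Iterate through all observation points
for (obs_type, ident, op_type), live_defs in rda.observed_results.items():
    # Identifiers are ints for 'node'/'insn' but tuples for 'stmt'/'exit'
    where = hex(ident) if isinstance(ident, int) else ident
    print(f"{obs_type} at {where} ({op_type})")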

Working with LiveDefinitions

from angr.knowledge_plugins.key_definitions.constants import OP_AFTER

rda = project.analyses.ReachingDefinitions(
    subject=main_func,
    observation_points=[('node', main_func.addr, OP_AFTER)]
)

live_defs = rda.get_reaching_definitions_by_node(main_func.addr, OP_AFTER)

# Access register definitions
for reg_offset, definitions in live_defs.register_definitions.items():
    print(f"Register {reg_offset}: {definitions}")

# Access stack definitions  
for stack_offset, definitions in live_defs.stack_definitions.items():
    print(f"Stack offset {stack_offset}: {definitions}")

Interprocedural Analysis

# Analyze with call depth of 2
rda = project.analyses.ReachingDefinitions(
    subject=main_func,
    interfunction_level=2
)

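# Access function call information
for code_loc, call_data in rda.function_calls.items():
    # The callee address may be None when the target cannot be resolved
    target = hex(call_data.address) if call_data.address is not None else "unresolved"
    print(f"Call at {code_loc}: target={target}")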

Using Custom Function Handler

from angr.analyses.reaching_definitions.function_handler import FunctionHandler

class CustomHandler(FunctionHandler):
    def handle_function(self, state, func):
        # Custom logic for handling function calls
        return state

handler = CustomHandler()
rda = project.analyses.ReachingDefinitions(
    subject=main_func,
    function_handler=handler
)

Dependency Graph

# Generate with dependency graph
rda = project.analyses.ReachingDefinitions(
    subject=main_func,
    dep_graph=True
)

# Access the dependency graph
dep_graph = rda.dep_graph

# Query dependencies: a node's predecessors are the definitions it depends on
for node in dep_graph.graph.nodes():
    deps = list(dep_graph.predecessors(node))
    print(f"{node} depends on: {deps}")

Single Block Analysis

from angr.knowledge_plugins.key_definitions.constants import OP_AFTER

# Analyze a single block
block = project.factory.block(0x401000)

rda = project.analyses.ReachingDefinitions(
    subject=block,
    observation_points=[('node', block.addr, OP_AFTER)]
)

result = rda.one_result  # Convenient for single observation point

AIL-based Analysis

import angr.ailment as ailment

# First convert to AIL
clinic = project.analyses.Clinic(main_func)

# Create AIL block subject
for node in clinic.graph.nodes():
    if isinstance(node, ailment.Block):
        rda = project.analyses.ReachingDefinitions(
            subject=node,
            track_tmps=True
        )
        break

Observation Point Types

Observation points are tuples with the format: (type, identifier, op_type)
node
string
Observe at the beginning or end of a CFG node.
  • Identifier: node address (int) or (address, idx) for AIL
  • Example: ('node', 0x401000, OP_BEFORE)
insn
string
Observe at a specific instruction.
  • Identifier: instruction address (int)
  • Example: ('insn', 0x401234, OP_AFTER)
stmt
string
Observe at a specific statement.
  • Identifier: (block_addr, block_idx, stmt_idx)
  • Example: ('stmt', (0x401000, None, 5), OP_BEFORE)
exit
string
Observe at a block exit.
  • Identifier: (node_address, exit_stmt_idx)
  • Example: ('exit', (0x401000, 10), OP_AFTER)
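Since the identifier shape differs per type, a small validator can catch malformed tuples before they reach the analysis. The sketch below encodes only the shapes listed above. `OpType` is a hypothetical stand-in for the real OP_BEFORE/OP_AFTER constants, and `make_observation_point` is an illustrative helper, not part of angr.

```python
from enum import IntEnum


class OpType(IntEnum):
    """Hypothetical stand-in for angr's OP_BEFORE / OP_AFTER constants."""
    BEFORE = 0
    AFTER = 1


# Allowed identifier lengths per observation type, taken from the list above.
# A bare int counts as length 1.
IDENTIFIER_ARITY = {
    "node": (1, 2),  # address, or (address, idx) for AIL
    "insn": (1,),    # instruction address
    "stmt": (3,),    # (block_addr, block_idx, stmt_idx)
    "exit": (2,),    # (node_address, exit_stmt_idx)
}


def make_observation_point(kind, identifier, op_type):
    """Return (kind, identifier, op_type) after checking the identifier shape."""
    if kind not in IDENTIFIER_ARITY:
        raise ValueError(f"unknown observation type: {kind!r}")
    arity = len(identifier) if isinstance(identifier, tuple) else 1
    if arity not in IDENTIFIER_ARITY[kind]:
        raise ValueError(f"{kind!r} does not accept identifier {identifier!r}")
    return (kind, identifier, op_type)


point = make_observation_point("stmt", (0x401000, None, 5), OpType.BEFORE)
print(point[0], point[1])  # stmt (4198400, None, 5)
```

In real code the third element would be OP_BEFORE or OP_AFTER imported from angr.knowledge_plugins.key_definitions.constants.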

Performance Considerations

Memory Usage: Enabling track_liveness=True can consume significant memory (15GB+ for functions with 4000+ nodes). Disable it for large functions if liveness information is not needed.

Optimization Tips

  1. Limit observation points: Only observe what you need
  2. Disable liveness tracking for large functions
  3. Adjust element_limit: Lower values merge to TOP sooner, reducing memory
  4. Use merge_into_tops=True: Reduces set sizes
  5. Limit max_iterations: Stop early if convergence is slow

# Memory-optimized configuration
rda = project.analyses.ReachingDefinitions(
    subject=large_func,
    track_liveness=False,
    element_limit=3,
    merge_into_tops=True,
    max_iterations=20
)
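One way to apply these tips consistently is to derive the keyword arguments from the size of the function. The helper below is a hypothetical sketch: `rda_options_for` is not an angr API, the 4000-node threshold simply reuses the figure from the memory note above, and the two option sets mirror the documented defaults and the memory-optimized configuration.

```python
def rda_options_for(node_count, large_threshold=4000):
    """Pick ReachingDefinitions keyword arguments based on graph size."""
    # Start from the documented defaults.
    options = {
        "track_liveness": True,
        "element_limit": 5,
        "merge_into_tops": True,
        "max_iterations": 30,
    }
    if node_count >= large_threshold:
        # Trade precision for memory on very large functions.
        options.update(track_liveness=False, element_limit=3, max_iterations=20)
    return options


print(rda_options_for(120)["track_liveness"])   # True
print(rda_options_for(5000)["track_liveness"])  # False
```

Assuming the function's graph is a networkx graph, the result could be passed along as `project.analyses.ReachingDefinitions(subject=func, **rda_options_for(len(func.graph)))`.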
