Successful reverse engineering of obfuscated code begins with accurately identifying which obfuscation techniques have been applied. This guide covers systematic detection strategies.
Initial Triage
Before diving deep into analysis, perform a quick assessment to understand what you’re dealing with.
Basic Binary Analysis
# File type and architecture
file binary_name
# Check if stripped
nm binary_name | wc -l
# Check for FairPlay encryption (cryptid 1 means the binary is still encrypted)
otool -l binary_name | grep -A 5 LC_ENCRYPTION_INFO
# List dependencies
otool -L binary_name
String Analysis
# Extract strings
strings binary_name > strings.txt
# Count meaningful strings
wc -l strings.txt
# Look for patterns
grep -E '^[A-Za-z0-9+/]{20,}={0,2}$' strings.txt # Base64
grep -E '^[0-9a-fA-F]{32,}$' strings.txt # Hex
Few readable strings suggest string encryption or minimization.
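The string triage above can also be scripted so the counts are comparable across binaries. The following is a minimal sketch, not a replacement for `strings`: the helper names `extract_strings` and `summarize_strings` are hypothetical, and the two patterns simply mirror the grep expressions used above.

```python
import re

BASE64_RE = re.compile(r"^[A-Za-z0-9+/]{20,}={0,2}$")
HEX_RE = re.compile(r"^[0-9a-fA-F]{32,}$")


def extract_strings(data, min_len=4):
    """Return printable ASCII runs of at least `min_len` bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def summarize_strings(data):
    """Count readable, Base64-like, and hex-like strings in raw bytes."""
    strings = extract_strings(data)
    hex_like = [s for s in strings if HEX_RE.match(s)]
    # Long hex strings also fit the Base64 alphabet, so exclude them here.
    b64_like = [s for s in strings
                if BASE64_RE.match(s) and not HEX_RE.match(s)]
    printable_bytes = sum(len(s) for s in strings)
    return {
        "count": len(strings),
        "base64_like": len(b64_like),
        "hex_like": len(hex_like),
        "printable_ratio": printable_bytes / len(data) if data else 0.0,
    }
```

A low `printable_ratio` combined with many Base64-like or hex-like entries is only a hint; compare against a known unobfuscated build of similar size where possible.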
Symbol Table Inspection
# List symbols
nm -a binary_name | head -50
# Count Objective-C classes
otool -oV binary_name | grep "class_name" | wc -l
# Check for symbol obfuscation (single-character names, ignoring the leading underscore Mach-O adds)
nm binary_name | awk '{print $NF}' | grep -E '^_?[A-Za-z]$'
Code Size Analysis
# Segment sizes
otool -l binary_name | grep -A 3 "segname __TEXT"
# Section details
size binary_name
Unusually large code sections may indicate dead code injection.
Identifying Obfuscation Types
Control Flow Flattening Indicators
Disassembly Patterns:
; Dispatcher loop pattern
.loop_start:
    ldr w8, [sp, #state_var]
    cmp w8, #0
    b.eq .case_0
    cmp w8, #1
    b.eq .case_1
    cmp w8, #2
    b.eq .case_2
    ; ... many cases
.case_0:
    ; block code
    mov w8, #next_state
    str w8, [sp, #state_var]
    b .loop_start
Key Indicators:
Central dispatch loop with switch/jump table
State variable controlling flow
All basic blocks at same nesting level
Unconditional jumps back to dispatcher
Use tools to visualize control flow graphs:
# Generate CFG with radare2
r2 -A binary_name
[0x100000000]> s main
[0x100001234]> agf             # ASCII graph of the current function
[0x100001234]> agfd > cfg.dot  # Graphviz (dot) format
# Visualize
dot -Tpng cfg.dot -o cfg.png
Look for:
Star topology (all blocks connect to center)
High fan-in to dispatcher block
Flat structure (no natural hierarchy)
Many back-edges to single block
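The star-topology and fan-in checks listed above can be approximated numerically once the graph is available as an edge list. This is a rough sketch under stated assumptions: the function name `find_dispatcher`, the edge-list input, and the 0.5 threshold are all illustrative choices, and the edges would need to be exported from your disassembler first (for example from radare2's JSON graph output).

```python
from collections import defaultdict


def find_dispatcher(edges, min_fan_in_ratio=0.5):
    """Return (block, ratio) for the block most other blocks jump to.

    `edges` is an iterable of (source_block, target_block) pairs.
    `ratio` is the share of the other blocks that are direct predecessors.
    Returns None when no block reaches `min_fan_in_ratio`.
    """
    predecessors = defaultdict(set)
    blocks = set()
    for src, dst in edges:
        blocks.update((src, dst))
        if src != dst:  # ignore self-loops
            predecessors[dst].add(src)
    if len(blocks) < 3 or not predecessors:
        return None
    block = max(predecessors, key=lambda b: len(predecessors[b]))
    ratio = len(predecessors[block]) / (len(blocks) - 1)
    return (block, ratio) if ratio >= min_fan_in_ratio else None
```

In an ordinary function, even a loop header or common exit block is rarely a direct successor of most other blocks, so a ratio well above 0.5 is worth a closer look. Large `switch` statements can produce similar shapes, so treat a hit as a lead rather than proof.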
Calculate complexity metrics:
import r2pipe

r2 = r2pipe.open("binary_name")
r2.cmd("aaa")  # Analyze all

# Get function info (cmdj returns None if the output is not valid JSON)
funcs = r2.cmdj("aflj") or []
for func in funcs:
    name = func.get("name")
    cyclomatic = func.get("cc", 0)  # Cyclomatic complexity
    nbbs = func.get("nbbs", 0)      # Number of basic blocks
    # High complexity with many blocks suggests flattening;
    # the thresholds are starting points, not fixed rules
    if cyclomatic > 50 and nbbs > 20:
        print(f"[!] Suspicious: {name}")
        print(f"    Complexity: {cyclomatic}")
        print(f"    Basic blocks: {nbbs}")
String Obfuscation Detection
Entropy Analysis
# Calculate entropy of data sections
import math
from collections import Counter

def calculate_entropy(data):
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    entropy = 0.0
    counter = Counter(data)
    length = len(data)
    for count in counter.values():
        probability = count / length
        entropy -= probability * math.log2(probability)
    return entropy

# Read the whole binary; for per-section results, slice `data`
# using the offsets and sizes reported by `otool -l`
with open('binary_name', 'rb') as f:
    data = f.read()

# High entropy (>7.0) suggests encryption or compression
entropy = calculate_entropy(data)
print(f"Entropy: {entropy:.2f}")
Indicators:
Absence of expected strings in static analysis
Presence of high-entropy byte arrays
Crypto library imports (CommonCrypto, OpenSSL)
Functions with XOR, shift, or substitution patterns
Runtime string creation from byte arrays
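A single entropy figure for the whole file tends to hide small encrypted blobs among ordinary code, so the "high-entropy byte arrays" indicator is easier to check with a sliding window. The sketch below is one possible approach, not a standard tool: the names `shannon_entropy` and `high_entropy_windows`, the window size, and the threshold are assumptions to tune per target.

```python
import math
from collections import Counter


def shannon_entropy(chunk):
    """Shannon entropy of a bytes object, in bits per byte."""
    if not chunk:
        return 0.0
    total = len(chunk)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(chunk).values())


def high_entropy_windows(data, window=1024, threshold=7.0):
    """Yield (offset, entropy) for non-overlapping windows at or above threshold."""
    for offset in range(0, len(data) - window + 1, window):
        entropy = shannon_entropy(data[offset:offset + window])
        if entropy >= threshold:
            yield offset, entropy
```

Small windows cap the achievable entropy (256 random bytes rarely exceed about 7.2 bits per byte), so lower the threshold when shrinking the window. Compressed resources score as high as encrypted ones, so map each offset back to its section before drawing conclusions.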
Detecting Symbol Renaming
# Extract and analyze symbols
nm binary_name > symbols.txt
# Count single-character symbols (Mach-O prefixes C symbols with an underscore)
grep -cE '^[0-9a-f]+ [A-Za-z] _?[A-Za-z]$' symbols.txt
# Look for patterns
grep -E '^[0-9a-f]+ [A-Za-z] _?(a|b|c|aa|ab|a1|a2)$' symbols.txt
# Check class names
otool -oV binary_name | grep "class_name" | head -20
Red Flags:
Short Names: Class and method names like a, b, c instead of descriptive names.
Sequential Patterns: Names following patterns such as a1, a2, a3 or func_1, func_2.
Random Strings: Completely random names like xK7pQ2m with no semantic meaning.
Missing Prefixes: Objective-C classes with no conventional two- or three-letter prefix (Apple's frameworks use NS, UI, etc.; apps normally use their own).
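Three of the red flags above (short, sequential, and random-looking names) can be turned into rough heuristics. The sketch below is illustrative only: `red_flags`, `sequential_families`, and `looks_random` are hypothetical helpers, and the randomness test (mixed case, digits, few vowels) is a crude assumption that will misjudge some legitimate names.

```python
import re
from collections import defaultdict

NUMBERED_RE = re.compile(r"^([A-Za-z_]*[A-Za-z])_?(\d+)$")
VOWELS = set("aeiouAEIOU")


def sequential_families(names, min_members=3):
    """Group names like func_1, func_2 by prefix; keep prefixes seen often."""
    families = defaultdict(set)
    for name in names:
        match = NUMBERED_RE.match(name)
        if match:
            families[match.group(1)].add(int(match.group(2)))
    return {prefix: sorted(numbers)
            for prefix, numbers in families.items()
            if len(numbers) >= min_members}


def looks_random(name):
    """Crude guess: mixed case plus digits and almost no vowels."""
    letters = [c for c in name if c.isalpha()]
    if len(name) < 6 or not letters:
        return False
    has_digit = any(c.isdigit() for c in name)
    has_upper = any(c.isupper() for c in letters)
    has_lower = any(c.islower() for c in letters)
    vowel_ratio = sum(c in VOWELS for c in letters) / len(letters)
    return has_digit and has_upper and has_lower and vowel_ratio < 0.2


def red_flags(names):
    """Bucket symbol names by the red flags they trigger."""
    # Drop the leading underscore Mach-O adds to C symbols.
    cleaned = [n.lstrip("_") for n in names if n.lstrip("_")]
    return {
        "short": [n for n in cleaned if len(n) <= 2],
        "sequential": sequential_families(cleaned),
        "random": [n for n in cleaned if looks_random(n)],
    }
```

Requiring several members per prefix keeps one-off names such as `sha256` from being reported as a sequential pattern.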
Analysis Script
import re

def analyze_symbols(symbol_file):
    with open(symbol_file, 'r') as f:
        symbols = f.readlines()
    # Extract names of defined symbols: "<address> <type letter> <name>"
    # (undefined imports have no address and are skipped)
    names = []
    for line in symbols:
        match = re.search(r'^[0-9a-fA-F]+ [A-Za-z] (.+)$', line.strip())
        if match:
            # Drop the leading underscore Mach-O adds to C symbols
            name = match.group(1).lstrip('_')
            if name:
                names.append(name)
    if not names:
        print("No defined symbols found (the binary may be stripped)")
        return False
    # Statistics
    total = len(names)
    single_char = sum(1 for n in names if len(n) == 1)
    short_names = sum(1 for n in names if len(n) <= 3)
    print(f"Total symbols: {total}")
    print(f"Single character: {single_char} ({single_char / total * 100:.1f}%)")
    print(f"3 chars or less: {short_names} ({short_names / total * 100:.1f}%)")
    # If >30% are very short, likely obfuscated
    if short_names / total > 0.3:
        print("\n[!] Symbol obfuscation likely")
        return True
    return False

analyze_symbols('symbols.txt')
Runtime Protection Detection
Search for Protection APIs
# Look for common APIs
nm binary_name | grep -i "ptrace\|sysctl\|fork\|dlopen"
# Search in disassembly
r2 -A binary_name
[0x100000000] > afl | grep -i "check\|detect\|verify\|validate"
# Find syscalls (in the same r2 session)
[0x100000000]> /ad svc # Search disassembly for supervisor calls on ARM
String Search
# Jailbreak detection indicators
strings binary_name | grep -i \
"cydia\|substrate\|jailbreak\|icy\|apt\|ssh\|bash"
# Debugger detection
strings binary_name | grep -i \
"debug\|lldb\|gdb\|trace\|breakpoint"
Code Pattern Search
# Search for ptrace(PT_DENY_ATTACH) pattern
import r2pipe

r2 = r2pipe.open("binary_name")
r2.cmd("aaa")
# Find call sites that reference the imported ptrace symbol
# (sym.imp.ptrace is radare2's usual flag name; adjust if yours differs)
ptrace_refs = r2.cmdj("axtj @ sym.imp.ptrace") or []
for ref in ptrace_refs:
    addr = ref.get("from")
    if addr is None:
        continue
    # Disassemble the 5 instructions before the call plus the call itself
    # (arm64 instructions are 4 bytes each)
    disasm = r2.cmd(f"pd 6 @ {addr - 20}")
    # Look for PT_DENY_ATTACH (31 or 0x1f) being loaded as an argument
    if "#31" in disasm or "0x1f" in disasm:
        print(f"[!] Possible PT_DENY_ATTACH at {hex(addr)}")
Protection Checklist: Environment, Debugging, Integrity
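The grep-based string searches in this section match substrings, so short keywords such as apt or ssh also hit unrelated strings (for example, anything containing "Adapter"). One way to reduce that noise is sketched below, assuming the strings have already been extracted into a Python list. The function name `match_indicators` and the rule "short keywords must match a whole token" are illustrative assumptions; the keyword lists are copied from the commands above.

```python
import re

INDICATORS = {
    "jailbreak": ["cydia", "substrate", "jailbreak", "icy", "apt", "ssh", "bash"],
    "debugger": ["debug", "lldb", "gdb", "trace", "breakpoint"],
}


def match_indicators(strings, indicators=INDICATORS, min_substring_len=5):
    """Return {category: [(keyword, string), ...]} for matching strings.

    Keywords shorter than `min_substring_len` must equal a whole
    alphanumeric token; longer keywords may match as substrings.
    """
    hits = {category: [] for category in indicators}
    for s in strings:
        lowered = s.lower()
        tokens = set(re.split(r"[^a-z0-9]+", lowered))
        for category, keywords in indicators.items():
            for keyword in keywords:
                if len(keyword) >= min_substring_len:
                    matched = keyword in lowered
                else:
                    matched = keyword in tokens
                if matched:
                    hits[category].append((keyword, s))
                    break  # record each string once per category
    return hits
```

Longer keywords such as trace still match broadly, so review the hits by hand before treating them as evidence of a protection.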
DIE (Detect It Easy)
Cross-platform tool for detecting packers, obfuscators, and compilers.
# Install
brew install detect-it-easy
# Analyze binary
diec binary_name
r2pipe Scripts
Custom detection scripts using radare2 Python bindings.
import r2pipe
r2 = r2pipe.open("binary")
r2.cmd("aaa")
info = r2.cmdj("ij")
MobSF
Mobile Security Framework with obfuscation detection.
# Run analysis
python3 manage.py runserver
# Upload IPA via web interface
Ghidra Scripts
Custom Ghidra analyzers for pattern detection.
// FlatteningDetector.java
// Custom Ghidra script
Pattern Recognition
Building a Detection Matrix
Create a systematic checklist to identify multiple obfuscation techniques simultaneously.
class ObfuscationDetector:
    """Skeleton detector. Each check_* method returns hard-coded example
    results; replace the bodies with the real analyses described above."""

    def __init__(self, binary_path):
        self.binary = binary_path
        self.results = {}

    def detect_all(self):
        print("[*] Analyzing {}\n".format(self.binary))
        self.results['control_flow'] = self.check_control_flow()
        self.results['strings'] = self.check_string_encryption()
        self.results['symbols'] = self.check_symbol_obfuscation()
        self.results['anti_debug'] = self.check_anti_debug()
        self.results['anti_jailbreak'] = self.check_anti_jailbreak()
        self.print_report()

    def check_control_flow(self):
        # Placeholder: analyze CFG complexity
        return {
            'detected': True,
            'confidence': 'high',
            'evidence': ['High cyclomatic complexity', 'Dispatcher pattern']
        }

    def check_string_encryption(self):
        # Placeholder: entropy analysis
        return {
            'detected': True,
            'confidence': 'medium',
            'evidence': ['High entropy sections', 'Few readable strings']
        }

    def check_symbol_obfuscation(self):
        # Placeholder: symbol analysis
        return {
            'detected': True,
            'confidence': 'high',
            'evidence': ['40% single-char symbols']
        }

    def check_anti_debug(self):
        # Placeholder: API detection
        return {
            'detected': True,
            'confidence': 'high',
            'evidence': ['ptrace calls', 'sysctl checks']
        }

    def check_anti_jailbreak(self):
        # Placeholder: file/path checks
        return {
            'detected': True,
            'confidence': 'high',
            'evidence': ['Cydia path checks', 'fork attempts']
        }

    def print_report(self):
        print("=" * 60)
        print("OBFUSCATION DETECTION REPORT")
        print("=" * 60)
        for technique, result in self.results.items():
            status = "[DETECTED]" if result['detected'] else "[NOT FOUND]"
            print(f"\n{status} {technique.replace('_', ' ').title()}")
            print(f"  Confidence: {result['confidence']}")
            print("  Evidence:")
            for evidence in result['evidence']:
                print(f"    - {evidence}")

# Usage
detector = ObfuscationDetector('binary_name')
detector.detect_all()
Best Practices
Begin with automated tools and high-level indicators before diving into detailed analysis. This saves time and helps prioritize efforts.
Combine Static and Dynamic
Static analysis alone can miss runtime obfuscation. Dynamic analysis alone can miss dormant protections. Use both approaches.
Keep detailed notes on detected techniques, evidence, and confidence levels. This helps when returning to the analysis later.
Build a Reference Library
Maintain a collection of known obfuscation patterns, tool outputs, and bypass techniques for future reference.
Initial detection may be incomplete. As you analyze deeper, you may discover additional obfuscation layers. Update your assessment.
Common Pitfalls
False Positives: Compiler optimizations can resemble obfuscation. Verify with multiple indicators before concluding obfuscation is present.
Layered Obfuscation: Detecting one technique doesn’t mean there aren’t others. Always check for multiple simultaneous techniques.
Evolving Techniques: Obfuscators constantly evolve. Patterns that worked yesterday may not apply to new versions. Stay current.
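The advice to verify with multiple indicators can be made mechanical by reporting a technique only when enough independent indicators agree. The following is a minimal sketch of that idea; the function name `assess`, the default of two indicators, and the confidence labels are assumptions chosen to match the result dictionaries used in the detection matrix.

```python
def assess(evidence, min_indicators=2):
    """Map each technique to a detected/confidence/evidence record.

    `evidence` maps a technique name to the list of indicator names
    observed for it. A technique counts as detected only when at least
    `min_indicators` distinct indicators were seen.
    """
    report = {}
    for technique, indicators in evidence.items():
        unique = sorted(set(indicators))
        if len(unique) > min_indicators:
            confidence = "high"
        elif len(unique) == min_indicators:
            confidence = "medium"
        elif unique:
            confidence = "low"
        else:
            confidence = "none"
        report[technique] = {
            "detected": len(unique) >= min_indicators,
            "confidence": confidence,
            "evidence": unique,
        }
    return report
```

A single hit, such as high cyclomatic complexity alone, is then recorded with low confidence instead of being reported as a detection.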
Further Reading
Obfuscation Overview: Return to the overview for context on why obfuscation is used.
Control Flow Flattening: Deep dive into analyzing detected control flow obfuscation.
String Encryption: Learn to extract and decrypt obfuscated strings.
Anti-Tampering: Bypass detected runtime protections.