Overview
Graph analysis enables:
- Communication visualization - See who contacted whom and when
- Relationship mapping - Identify connections between people
- Network analysis - Understand communication patterns and structures
- Entity linking - Connect phone numbers, emails, accounts to persons
- Social network reconstruction - Map social relationships from digital evidence
Powered by Neo4j
IPED uses the Neo4j graph database for storage and analysis:
- Optimized for relationship queries
- Cypher query language
- Built-in graph algorithms
- Scalable to millions of nodes
- Visual graph browser
Graph Components
Nodes (Entities)
Graph nodes represent entities:
Person
Individuals identified by:
- Phone number (primary)
- Email address
- User account (service + username)
- Name, organization, address
Phone
Phone numbers:
- International format (+1-555-123-4567)
- Linked to Person nodes
- Associated contacts
Email
Email addresses:
- Normalized to lowercase
- Linked to Person nodes
- Associated name
Contact Group
Chat groups and distribution lists:
- WhatsApp groups
- Email distribution lists
- Telegram channels
Data Source
Evidence sources:
- Device identifiers
- Evidence UUID
- MSISDN (mobile number)
WiFi Network
Wireless networks:
- SSID (network name)
- BSSID (MAC address)
Relationships (Edges)
Graph relationships represent interactions:
Message
Instant messages:
- WhatsApp, Telegram, Discord, Skype
- UFED mobile extractions
- Direction: sender → recipient
Email
Email messages:
- Sent messages
- CC and BCC included
- Direction: sender → recipient(s)
Call
Phone calls:
- Voice and video calls
- Call duration
- Direction: caller → recipient
Contact
Stored contacts:
- Address book entries
- Direction: device owner → contact
User Account
Account ownership:
- Social media accounts
- Application accounts
- Direction: person → account
Wireless Network
WiFi connections:
- Network associations
- Direction: device → network
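The node and edge types above can be pictured as a small directed, typed graph. The following is a minimal sketch of such a model; the class and field names (`Node`, `Edge`, `Graph`, `key`, `kind`) are hypothetical and do not reflect IPED's internal classes or its Neo4j schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    label: str  # entity type, e.g. "Person", "Phone", "Email"
    key: str    # normalized identifier (hypothetical field name)


@dataclass(frozen=True)
class Edge:
    kind: str     # relationship type, e.g. "Message", "Call", "Contact"
    source: Node  # sender, caller, or device owner
    target: Node  # recipient or contact


@dataclass
class Graph:
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_edge(self, kind, source, target):
        """Register both endpoints and store a directed edge."""
        self.nodes.update((source, target))
        edge = Edge(kind, source, target)
        self.edges.append(edge)
        return edge

    def outgoing(self, node):
        """Edges whose direction starts at the given node."""
        return [e for e in self.edges if e.source == node]


graph = Graph()
alice = Node("Person", "+15551234567")
bob = Node("Person", "+15557654321")
graph.add_edge("Call", alice, bob)     # caller -> recipient
graph.add_edge("Message", bob, alice)  # sender -> recipient
```

Keeping direction on every edge is what allows questions such as "who called whom" to be answered separately from "who is connected to whom".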
Data Extraction
The graph task processes evidence items to extract entities and relationships.
Communication Processing
Entity Recognition
IPED automatically detects and normalizes entities:
Phone Number Detection
Uses Google’s libphonenumber library:
- International format normalization
- Multiple formats recognized
- Country code detection
- WhatsApp ID parsing ([email protected])
- Brazil-specific handling (9th digit)
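To make the WhatsApp ID parsing and the Brazilian ninth-digit handling more concrete, here is a rough sketch using only regular expressions. It does not call libphonenumber, and it is not IPED's implementation: the function names, the ID suffix used in the example, and the rule that legacy 8-digit mobile numbers start with 6 to 9 are all assumptions made for illustration.

```python
import re


def parse_whatsapp_id(jid):
    """Return '+<digits>' for an ID shaped like '<digits>@s.whatsapp.net', else None."""
    match = re.fullmatch(r"(\d{6,15})@s\.whatsapp\.net", jid)
    return "+" + match.group(1) if match else None


def add_brazil_ninth_digit(number):
    """Insert the leading 9 into a legacy 8-digit Brazilian mobile number.

    Assumption: +55, a 2-digit area code, then 8 digits starting with 6-9
    denotes an old-style mobile number. Anything else is returned unchanged.
    """
    match = re.fullmatch(r"\+55(\d{2})([6-9]\d{7})", number)
    if match:
        area, subscriber = match.groups()
        return f"+55{area}9{subscriber}"
    return number


raw = parse_whatsapp_id("551187654321@s.whatsapp.net")
normalized = add_brazil_ninth_digit(raw)
```

Normalizing both spellings of the same number to one form is what lets the later merging step treat them as a single entity.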
Email Detection
Pattern-based extraction:
- Normalized to lowercase
- Extracts from formatted text (e.g., “Name <[email protected]>”)
- Validates basic format
- Handles multiple emails
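A pattern-based extractor along these lines could look like the sketch below. The regular expression is a deliberately simple assumption, not the pattern IPED uses, and `extract_emails` is a hypothetical helper name.

```python
import re

# Simplified pattern: local part, '@', then at least two dot-separated labels.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text):
    """Return unique, lowercased addresses in order of first appearance."""
    found = []
    for hit in EMAIL_RE.findall(text):
        email = hit.lower()
        if email not in found:
            found.append(email)
    return found


emails = extract_emails(
    "John Doe <John.Doe@Example.com>, jane@example.org; JOHN.DOE@example.com"
)
```

Lowercasing before deduplication means the same address written with different capitalization yields a single entry.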
Account Detection
Service-specific account parsing:
- Skype
- Telegram
- Discord
- Custom applications
Node Merging
IPED merges nodes that represent the same entity:
- Person with phone +1-555-1234 merged with email [email protected]
- Results in single Person node with both identifiers
- All relationships preserved
- Deduplication across data sources
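One common way to implement this kind of merging is a union-find (disjoint-set) structure over identifiers. The sketch below illustrates that technique only; whether IPED merges nodes this way is not stated here, and the class name and identifier strings are hypothetical.

```python
class IdentifierIndex:
    """Union-find over identifiers; each resulting set is one Person."""

    def __init__(self):
        self.parent = {}

    def find(self, ident):
        """Return the representative identifier of ident's set."""
        self.parent.setdefault(ident, ident)
        root = ident
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited identifier at the root.
        while self.parent[ident] != root:
            next_ident = self.parent[ident]
            self.parent[ident] = root
            ident = next_ident
        return root

    def merge(self, a, b):
        """Record that identifiers a and b belong to the same person."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def persons(self):
        """Group all known identifiers by representative."""
        groups = {}
        for ident in list(self.parent):
            groups.setdefault(self.find(ident), set()).add(ident)
        return list(groups.values())


index = IdentifierIndex()
index.merge("phone:+1-555-1234", "email:john@example.com")  # same contact entry
index.merge("email:john@example.com", "skype:john.doe")     # same account profile
index.find("phone:+1-555-9999")                              # unrelated identifier
people = sorted(index.persons(), key=len)
```

Because merges are transitive, a phone number and a Skype account end up on the same person even though no single source lists both. That same property is why a single wrong link can merge two unrelated people.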
Supported Data Sources
Graph extraction works with:
Mobile Apps
- WhatsApp - Messages, calls, groups, contacts
- Telegram - Messages, calls, channels, contacts
- Skype - Messages, calls, contacts
- Discord - Messages, calls, servers
Email
- Outlook PST - Emails, contacts
- EML files - Individual messages
- MBOX - Email archives
Mobile Extractions
- UFED - Cellebrite extractions (calls, messages, contacts)
- Android backups - App data
- iOS backups - App data
Standard Formats
- VCard - Contact files
- CSV - Contact lists
Configuration
The graph task is configured in GraphTaskConfig.txt:
Phone Region
Critical for phone number parsing:
- BR - Brazil (+55)
- US - United States (+1)
- GB - United Kingdom (+44)
- DE - Germany (+49)
- etc.
A correct region setting supports:
- Proper international format
- National number recognition
- Accurate merging
Category Filtering
Control which items contribute to the graph.
Graph Generation
The graph is built using Neo4j bulk import:
- During processing, write nodes/edges to CSV files
- After processing completes, bulk import CSVs
- Generate Neo4j database in output folder
- Compress CSV files for archival
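The staging part of this flow (write CSV during processing, compress afterwards) can be sketched as follows. The column headers follow the `:ID`, `:LABEL`, `:START_ID`, `:END_ID`, `:TYPE` convention used by Neo4j's bulk importer; the file names, property columns, and sample rows are assumptions, and the import step itself is not run here.

```python
import csv
import gzip
import shutil
import tempfile
from pathlib import Path


def write_csv(path, header, rows):
    """Write a header row followed by data rows."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


def compress(path):
    """Gzip a file next to the original and return the archive path."""
    gz_path = path.with_name(path.name + ".gz")
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return gz_path


out_dir = Path(tempfile.mkdtemp())  # stand-in for the case output folder
nodes_csv = out_dir / "nodes.csv"
edges_csv = out_dir / "edges.csv"

write_csv(nodes_csv, ["nodeId:ID", "name", ":LABEL"],
          [("p1", "Alice", "Person"), ("p2", "Bob", "Person")])
write_csv(edges_csv, [":START_ID", ":END_ID", ":TYPE"],
          [("p1", "p2", "CALL")])

# After the bulk import has consumed the files, keep compressed copies.
archives = [compress(nodes_csv), compress(edges_csv)]
```

Writing flat files first and importing once at the end avoids one database transaction per relationship, which is the usual reason bulk import is faster than incremental inserts.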
Analysis Interface
The IPED analysis interface provides graph visualization:
Graph Viewer
- Interactive visualization - Zoom, pan, drag nodes
- Node coloring - By type (person, device, group)
- Edge styling - By relationship type (message, call, email)
- Filtering - Show/hide node and edge types
- Search - Find specific entities
- Expand/collapse - Show/hide related nodes
Communication Links Tab
Dedicated tab showing:
- List of all communications
- Sender and recipient columns
- Communication type and time
- Link to original evidence item
- Sortable and filterable
Query Interface
Cypher queries are supported.
Graph Metrics
Calculate network metrics:
Degree Centrality
Number of direct connections:
- Who has most contacts?
- Communication hubs
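As an illustration, degree can be computed from a plain edge list by counting distinct neighbors. This sketch ignores edge direction and repeated contacts, which is one possible reading of "direct connections"; the function name and sample data are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Map each node to its number of distinct neighbors (direction ignored)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {node: len(linked) for node, linked in neighbors.items()}


edges = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "alice"),   # repeated contact, counted once
    ("dave", "alice"),
]
degrees = degree_centrality(edges)
hub = max(degrees, key=degrees.get)
```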
Betweenness Centrality
How often a node appears on shortest paths:
- Information brokers
- Critical links
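For small graphs, betweenness can be computed with Brandes' algorithm, sketched below for an unweighted, undirected graph. This is a textbook formulation rather than the routine IPED or Neo4j runs, and it assumes every node appears as a key in the adjacency mapping.

```python
from collections import deque


def betweenness(adjacency):
    """Brandes' algorithm for unweighted, undirected graphs."""
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths (sigma).
        order = []
        preds = {v: [] for v in adjacency}
        sigma = {v: 0 for v in adjacency}
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        delta = {v: 0.0 for v in adjacency}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: total / 2 for v, total in score.items()}


chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
brokers = betweenness(chain)
```

In the three-node chain, only the middle node lies on a shortest path between two others, so it is the only one with a non-zero score.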
Closeness Centrality
Average distance to all nodes:
- Central figures
- Information reach
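The average-distance idea can be sketched with a breadth-first search from one node. This version averages over reachable nodes only, which is an assumption for disconnected graphs; a lower average means a more central node.

```python
from collections import deque


def average_distance(adjacency, start):
    """Mean hop count from start to every node reachable from it."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    others = len(dist) - 1
    # An isolated node reaches nobody; report infinite distance.
    return sum(dist.values()) / others if others else float("inf")


chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
middle = average_distance(chain, "b")
end = average_distance(chain, "a")
```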
Community Detection
Identify clusters:
- Friend groups
- Criminal cells
- Organizational structure
Use Cases
Criminal Network Mapping
Identify organized crime structure:
- Extract communications from all suspects
- Generate graph showing connections
- Identify leadership (high centrality)
- Find intermediaries (high betweenness)
- Detect cells/sub-groups (communities)
Person of Interest Investigation
Map associates of target:
- Find target node in graph
- Expand to show direct contacts
- Expand contacts to 2nd degree
- Identify unknown associates
- Prioritize for investigation
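The expansion steps above amount to a breadth-first search limited to two hops. The sketch below works on an in-memory adjacency mapping with made-up names; in IPED the same expansion would be done in the graph viewer or with a query.

```python
from collections import deque


def expand(adjacency, target, max_hops=2):
    """Return {node: hop_count} for nodes within max_hops of target."""
    hops = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the requested depth
        for neighbor in adjacency.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    del hops[target]
    return hops


contacts = {
    "target": ["anna", "ben"],
    "anna": ["target", "carl"],
    "ben": ["target"],
    "carl": ["anna", "dora"],
    "dora": ["carl"],
}
associates = expand(contacts, "target")
second_degree = sorted(n for n, h in associates.items() if h == 2)
```

Second-degree nodes that the investigation has not yet identified are the natural candidates for the "unknown associates" step.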
Link Analysis
Connect seemingly unrelated cases:
- Combine graphs from multiple cases
- Identify shared contacts
- Find communication between cases
- Detect coordinated activity
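Once identifiers are normalized the same way in both cases, finding shared contacts reduces to a set intersection. The sketch below assumes each case is available as a list of (source, target) pairs; the sample numbers are invented.

```python
def identifiers(edges):
    """All identifiers that appear as either endpoint of an edge."""
    return {endpoint for edge in edges for endpoint in edge}


def shared_contacts(case_a, case_b):
    """Identifiers present in both cases, sorted for stable output."""
    return sorted(identifiers(case_a) & identifiers(case_b))


case_a = [("+111", "+222"), ("+222", "+333")]
case_b = [("+333", "+444"), ("+555", "+222")]
overlap = shared_contacts(case_a, case_b)
```

The comparison only works if both cases were processed with the same normalization settings, such as the phone region.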
Social Engineering Detection
Identify compromised accounts:
- Map normal communication patterns
- Detect anomalous connections
- Find unusual outreach
- Identify potential phishing
Proximity Relationships
Experimental feature linking entities based on proximity in text:
- Extracts regex hits (emails, phones, etc.)
- Finds entities appearing near each other in text
- Creates relationship if within maxProximityDistance characters
- Useful for unstructured data (documents, web pages)
Example:
- Document mentions “Contact John at [email protected] or +1-555-1234”
- Creates Person node linking email and phone
- Infers name “John”
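A simplified version of proximity linking is sketched below. The two regular expressions, the default distance, and the choice to measure the gap from the end of one hit to the start of the next are assumptions; only the `maxProximityDistance` setting name comes from the text above. Name inference is not attempted.

```python
import re

# Illustrative patterns only, not the expressions IPED ships with.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"),
    "phone": re.compile(r"\+\d[\d-]{6,}\d"),
}


def proximity_links(text, max_proximity_distance=50):
    """Pair hits of different kinds separated by at most the given gap."""
    hits = sorted(
        (m.start(), m.end(), kind, m.group())
        for kind, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    )
    links = []
    for i, (_, end_a, kind_a, value_a) in enumerate(hits):
        for start_b, _, kind_b, value_b in hits[i + 1:]:
            if start_b - end_a > max_proximity_distance:
                break  # hits are sorted by position, later ones are farther
            if kind_a != kind_b:
                links.append((value_a, value_b))
    return links


sample = "Contact John at john@example.com or +1-555-1234"
links = proximity_links(sample)
```

A small distance keeps links precise but misses identifiers separated by a sentence or two. A large one links unrelated entities that merely share a page.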
Performance Considerations
Scalability
- Tested with millions of communications
- Neo4j handles large graphs efficiently
- Bulk import faster than incremental
- Graph queries optimized with indexes
Memory Usage
During processing:
- CSV files written incrementally
- Bounded memory usage
- Node cache with LRU eviction
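A node cache with least-recently-used eviction can be sketched with an ordered dictionary. The class name, capacity, and stored values are hypothetical; the point is only to show how such a cache keeps memory bounded during processing.

```python
from collections import OrderedDict


class NodeCache:
    """Keeps at most `capacity` entries, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        """Return the cached node id, or None on a miss."""
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, node_id):
        """Insert or refresh an entry, evicting the oldest if over capacity."""
        self.entries[key] = node_id
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop least recently used


cache = NodeCache(capacity=2)
cache.put("+15551234567", "n1")
cache.put("john@example.com", "n2")
cache.get("+15551234567")         # touch n1 so n2 becomes the oldest
cache.put("skype:john.doe", "n3")  # evicts john@example.com
```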
Processing Time
Graph generation:
- 1-2% of total processing time
- Bulk import typically less than 10 minutes
- Scales with number of communications
Export Options
CSV Export
Export nodes and edges as CSV:
- Nodes: ID, labels, properties
- Edges: source, target, type, properties
- Import into external tools (Gephi, Tableau)
GraphML Export
Standard graph format:
- Compatible with graph analysis tools
- Preserves structure and attributes
- Visualize in yEd, Cytoscape, etc.
Report Integration
Graph included in HTML reports:
- Embedded visualization
- Key metrics summary
- Most active communicators
- Network statistics
Best Practices
- Set correct phone region - Critical for number normalization
- Review merged nodes - Verify entities merged correctly
- Filter noise - Exclude system communications and spam
- Use communities - Identify sub-networks for focused analysis
- Export for deep analysis - Use specialized tools for complex queries
- Document findings - Annotate graph with investigation notes
- Cross-reference evidence - Validate graph data with original items
Limitations
- Requires structured metadata from parsers
- Phone/email detection not 100% accurate
- Entity resolution can merge wrong nodes
- Group chats are challenging to represent
- Deleted participants may be missing
- Timezone handling is challenging for global investigations