Graph Analysis

IPED includes graph analysis capabilities for visualizing and analyzing communication networks, relationships, and connections between the people, devices, and other entities extracted from digital evidence.

Overview

Graph analysis enables:

Communication visualization - See who contacted whom and when
Relationship mapping - Identify connections between people
Network analysis - Understand communication patterns and structures
Entity linking - Connect phone numbers, emails, accounts to persons
Social network reconstruction - Map social relationships from digital evidence

Powered by Neo4j

IPED uses the Neo4j graph database for storage and analysis. The database name and home directory are defined as constants:

public static final String DB_NAME = "graph.db";
public static final String DB_HOME_DIR = "neo4j";

Benefits:

Optimized for relationship queries
Cypher query language
Built-in graph algorithms
Scalable to millions of nodes
Visual graph browser

Graph Components

Nodes (Entities)

Graph nodes represent entities:

Person

Individuals identified by:

Phone number (primary)
Email address
User account (service + username)
Name, organization, address

Phone

Phone numbers:

International format (e.g., +1 555-123-4567)
Linked to Person nodes
Associated contacts

Email

Email addresses:

Normalized lowercase
Linked to Person nodes
Associated name

Contact Group

Chat groups and distribution lists:

WhatsApp groups
Email distribution lists
Telegram channels

Data Source

Evidence sources:

Device identifiers
Evidence UUID
MSISDN (mobile number)

WiFi Network

Wireless networks:

SSID (network name)
BSSID (MAC address)

Relationships (Edges)

Graph relationships represent interactions:

Message

Instant messages:

WhatsApp, Telegram, Discord, Skype
UFED mobile extractions
Direction: sender → recipient

Email

Email communications:

Sent messages
CC and BCC included
Direction: sender → recipient(s)

Call

Phone calls:

Voice and video calls
Call duration
Direction: caller → recipient

Contact

Stored contacts:

Address book entries
Direction: device owner → contact

User Account

Account ownership:

Social media accounts
Application accounts
Direction: person → account

Wireless Network

WiFi connections:

Network associations
Direction: device → network

Data Extraction

The graph task processes each evidence item to extract entities and relationships:

public void process(IItem evidence) {
    if (!isEnabled() || !evidence.isToAddToCase()) {
        return;
    }
    
    processCommunicationMetadata(evidence);
    processContacts(evidence);
    processWifi(evidence);
    processUserAccount(evidence);
    processExtraAttributes(evidence);
}

Communication Processing

private void processCommunicationMetadata(IItem evidence) {
    Metadata metadata = evidence.getMetadata();
    String sender = metadata.get(ExtraProperties.COMMUNICATION_FROM);
    String[] recipients = metadata.getValues(ExtraProperties.COMMUNICATION_TO);

    // Nothing to link if the item has no sender or no recipients
    if (sender == null || recipients.length == 0) {
        return;
    }

    // Create sender node
    NodeValues senderNode = getNodeValues(sender, metadata, detectPhones);
    graphFileWriter.writeNode(senderNode);

    // Create a relationship to each recipient.
    // relationshipType and relProps are derived from the item elsewhere (not shown in this excerpt).
    for (String recipient : recipients) {
        NodeValues recipientNode = getNodeValues(recipient, metadata, detectPhones);
        graphFileWriter.writeNode(recipientNode);
        graphFileWriter.writeRelationship(
            senderNode, recipientNode, relationshipType, relProps);
    }
}

Entity Recognition

IPED automatically detects and normalizes entities:

Phone Number Detection

Uses Google’s libphonenumber library:

private SortedSet<String> getPhones(String value) {
    SortedSet<String> result = new TreeSet<>();
    PhoneNumberUtil phoneUtil = PhoneNumberUtil.getInstance();

    // Find all phone numbers in the text; phoneRegion is the configured default region
    for (PhoneNumberMatch m : phoneUtil.findNumbers(
            value, phoneRegion, Leniency.POSSIBLE, Integer.MAX_VALUE)) {

        PhoneNumber phoneNumber = m.number();

        // Format to international standard
        String phone = phoneUtil.format(phoneNumber,
                                        PhoneNumberFormat.INTERNATIONAL);

        result.add(phone);
    }
    return result;
}

Features:

International format normalization
Multiple formats recognized
Country code detection
WhatsApp ID parsing (e.g., 5511987654321@s.whatsapp.net)
Brazil-specific handling (9th digit)
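
As a rough illustration of the WhatsApp ID case: WhatsApp user identifiers usually take the form of the digits of the phone number followed by @s.whatsapp.net. The sketch below is not IPED's code; the class name, method name, and accepted digit range are assumptions made for the example. It strips the suffix and prefixes a plus sign so the value can be handed to a phone number parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; not part of IPED. */
final class WhatsAppIdSketch {

    // Assumed ID shape: 6 to 15 digits followed by the user-server suffix.
    private static final Pattern WA_ID =
            Pattern.compile("^(\\d{6,15})@s\\.whatsapp\\.net$");

    /** Returns "+digits" for a WhatsApp user ID, or empty if the value has another shape. */
    static Optional<String> toPhoneCandidate(String value) {
        if (value == null) {
            return Optional.empty();
        }
        Matcher m = WA_ID.matcher(value.trim());
        return m.matches() ? Optional.of("+" + m.group(1)) : Optional.empty();
    }
}
```

Group identifiers, which end in @g.us, are deliberately rejected here because they do not identify a single phone number.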

Email Detection

Pattern-based extraction:

private static Pattern emailPattern = Pattern.compile(
    "[0-9a-zA-Z\\+\\.\\_\\%\\-\\#\\!]{1,64}" +
    "\\@[0-9a-zA-Z\\-]{2,64}" +
    "(\\.[0-9a-zA-Z\\-]{2,25}){1,3}");

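private SortedSet<String> getEmails(String text) {
    // Sorted set removes duplicates and gives a stable order
    SortedSet<String> result = new TreeSet<>();
    Matcher matcher = emailPattern.matcher(text);
    while (matcher.find()) {
        String email = matcher.group().toLowerCase();
        result.add(email);
    }
    return result;
}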

Features:

Normalized to lowercase
Extracts from formatted text (e.g., “Name <user@example.com>”)
Validates basic format
Handles multiple emails
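
To see these features in action, the following self-contained sketch reuses the regular expression shown above and runs it over a formatted header-style string. The class and method names are hypothetical, and the null check is an addition for the example.

```java
import java.util.Locale;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical standalone version of the email extraction step. */
final class EmailExtractionSketch {

    // Same expression as the pattern shown in the excerpt above
    private static final Pattern EMAIL_PATTERN = Pattern.compile(
            "[0-9a-zA-Z\\+\\.\\_\\%\\-\\#\\!]{1,64}"
            + "\\@[0-9a-zA-Z\\-]{2,64}"
            + "(\\.[0-9a-zA-Z\\-]{2,25}){1,3}");

    /** Returns every address found in text, lowercased, deduplicated, and sorted. */
    static SortedSet<String> extractEmails(String text) {
        SortedSet<String> result = new TreeSet<>();
        if (text == null) {
            return result;
        }
        Matcher matcher = EMAIL_PATTERN.matcher(text);
        while (matcher.find()) {
            result.add(matcher.group().toLowerCase(Locale.ROOT));
        }
        return result;
    }
}
```

Calling extractEmails on "Alice <Alice@Example.COM>; bob@test.org" yields two entries, alice@example.com and bob@test.org; a second occurrence of the first address in different case would collapse into the same entry.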

Account Detection

Service-specific account parsing:

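private NodeValues getAccountNodeValues(String value, Metadata meta) {
    String service = meta.get(ExtraProperties.USER_ACCOUNT_TYPE);

    // Parse format: "Name (username)"
    int idx = value.lastIndexOf('(');
    if (idx != -1 && value.endsWith(")")) {
        String account = value.substring(idx + 1, value.length() - 1);
        String name = value.substring(0, idx).trim();  // display name

        return new NodeValues(
            PERSON_LABEL,
            USER_ACCOUNT,
            getServiceAccount(account, service)  // "username (Skype)"
        );
    }

    // No "Name (username)" structure: use the whole value as the account
    // (assumed fallback; this branch is not shown in the original excerpt)
    return new NodeValues(PERSON_LABEL, USER_ACCOUNT, getServiceAccount(value, service));
}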

Supported services:

Skype
Telegram
Discord
Instagram
Facebook
Twitter
Custom applications
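
The parsing rule can be tried in isolation with the sketch below. It is an illustration only: the class and method names are hypothetical, the "username (Service)" output format follows the comment in the excerpt above, and the fallback for values without parentheses is an assumption.

```java
/** Hypothetical standalone version of the account parsing rule. */
final class AccountParsingSketch {

    /** Formats an account as "username (Service)"; returns the bare account if no service is known. */
    static String serviceAccount(String account, String service) {
        return service == null || service.isEmpty()
                ? account
                : account + " (" + service + ")";
    }

    /** Parses "Name (username)"; falls back to the whole value when no parentheses are present. */
    static String parseAccount(String value, String service) {
        String trimmed = value.trim();
        int idx = trimmed.lastIndexOf('(');
        if (idx != -1 && trimmed.endsWith(")")) {
            String account = trimmed.substring(idx + 1, trimmed.length() - 1).trim();
            return serviceAccount(account, service);
        }
        return serviceAccount(trimmed, service);
    }
}
```

For example, parseAccount("John Doe (jdoe)", "Skype") returns "jdoe (Skype)". Qualifying the username with the service keeps identical usernames on different services from being treated as one account.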

Node Merging

IPED merges nodes that represent the same entity:

private NodeValues writePersonNode(IItem item, SortedSet<String> msisdnPhones) {
    Metadata metadata = item.getMetadata();

    // Collect all identifiers
    SortedSet<String> emails = getEmails(metadata);
    SortedSet<String> phones = getPhones(metadata);
    String[] accounts = metadata.getValues(ExtraProperties.USER_ACCOUNT);

    // Create node with primary identifier: phone first, then email, then account
    NodeValues personNode;
    if (!phones.isEmpty()) {
        personNode = new NodeValues(PERSON_LABEL, USER_PHONE, phones.first());
    } else if (!emails.isEmpty()) {
        personNode = new NodeValues(PERSON_LABEL, USER_EMAIL, emails.first());
    } else if (accounts.length > 0) {
        personNode = new NodeValues(PERSON_LABEL, USER_ACCOUNT, accounts[0]);
    } else {
        // No identifier available, so no person node can be created
        // (assumed handling; this branch is not shown in the original excerpt)
        return null;
    }

    // Add all additional identifiers
    personNode.addProp(USER_PHONE, phones);
    personNode.addProp(USER_EMAIL, emails);
    personNode.addProp(USER_ACCOUNT, accounts);

    // Write node replacement rules (uniqueId is defined elsewhere in the task)
    for (String email : emails) {
        graphFileWriter.writeNodeReplace(PERSON_LABEL, USER_EMAIL, email, uniqueId);
    }
    for (String phone : phones) {
        graphFileWriter.writeNodeReplace(PERSON_LABEL, USER_PHONE, phone, uniqueId);
    }
    return personNode;
}

Merging logic:

Person with phone +1-555-1234 merged with email user@example.com
Results in single Person node with both identifiers
All relationships preserved
Deduplication across data sources
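
The effect of merging can be illustrated independently of Neo4j. The excerpt above writes node replacement rules; how they are applied is not shown here, so the sketch below swaps in a plain union-find structure to show how identifiers that co-occur end up resolving to one canonical person. All names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical union-find over identifier strings; not IPED's mechanism. */
final class IdentifierMergeSketch {

    // Maps each identifier to its parent; a root maps to itself.
    private final Map<String, String> parent = new HashMap<>();

    /** Returns the canonical identifier for id, registering it on first use. */
    String find(String id) {
        parent.putIfAbsent(id, id);
        String root = id;
        while (!parent.get(root).equals(root)) {
            root = parent.get(root);
        }
        // Path compression: point every visited identifier directly at the root
        String current = id;
        while (!current.equals(root)) {
            String next = parent.get(current);
            parent.put(current, root);
            current = next;
        }
        return root;
    }

    /** Records that two identifiers belong to the same person. */
    void merge(String a, String b) {
        String rootA = find(a);
        String rootB = find(b);
        if (!rootA.equals(rootB)) {
            parent.put(rootB, rootA);
        }
    }
}
```

After merge("+1 555-1234", "user@example.com") and merge("user@example.com", "user (Skype)"), all three identifiers return the same value from find, while an unrelated address keeps its own. The same transitivity is why a single wrong link can merge two distinct people, so merged nodes are worth reviewing.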

Supported Data Sources

Graph extraction works with:

Mobile Apps

WhatsApp - Messages, calls, groups, contacts
Telegram - Messages, calls, channels, contacts
Skype - Messages, calls, contacts
Discord - Messages, calls, servers

Email

Outlook PST - Emails, contacts
EML files - Individual messages
MBOX - Email archives

Mobile Extractions

UFED - Cellebrite extractions (calls, messages, contacts)
Android backups - App data
iOS backups - App data

Standard Formats

VCard - Contact files
CSV - Contact lists

Configuration

The graph task is configured in GraphTaskConfig.txt:

# Enable graph generation
enableGraphTask = true

# Phone number region for parsing
phoneRegion = BR

# Include/exclude categories (regex)
includeCategoriesPattern = .*
excludeCategoriesPattern = (^$)

# MIME types for phone detection
detectPhonesOnMimes = application/x-whatsapp-message
dontDetectPhonesOnMimes = message/rfc822
detectPhonesOnOtherMimes = false

# Process proximity relationships (experimental)
processProximityRelationships = false
maxProximityDistance = 100

Phone Region

Critical for phone number parsing:

BR - Brazil (+55)
US - United States (+1)
GB - United Kingdom (+44)
DE - Germany (+49)
etc.

Correct region ensures:

Proper international format
National number recognition
Accurate merging

Category Filtering

Control which items contribute to graph:

# Only process communications
includeCategoriesPattern = (Chats|Email|Calls)

# Exclude system files  
excludeCategoriesPattern = (System Files|Carved)
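
How the two patterns interact can be sketched as follows. This is an assumption about the semantics, not IPED's code: an item is kept when at least one of its categories matches the include pattern and none matches the exclude pattern, with a partial match (find) counting as a match. The class name is hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical include/exclude category filter. */
final class CategoryFilterSketch {

    private final Pattern include;
    private final Pattern exclude;

    CategoryFilterSketch(String includeRegex, String excludeRegex) {
        this.include = Pattern.compile(includeRegex);
        this.exclude = Pattern.compile(excludeRegex);
    }

    /** Keeps an item if some category matches include and no category matches exclude. */
    boolean accepts(Iterable<String> categories) {
        boolean included = false;
        for (String category : categories) {
            if (exclude.matcher(category).find()) {
                return false; // exclusion always wins
            }
            if (include.matcher(category).find()) {
                included = true;
            }
        }
        return included;
    }
}
```

Under these assumptions the default pair (.* and (^$)) accepts every item with a non-empty category, since (^$) only matches an empty string.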

Graph Generation

The graph is built using Neo4j bulk import:

private void finishGraphGeneration() {
    logger.info("Generating graph database...");
    File graphDbHome = new File(output, DB_HOME_DIR);
    File graphCSVs = new File(output, CSVS_PATH);
    
    GraphGenerator graphGenerator = new GraphGenerator();
    graphGenerator.generate(graphDbHome, graphCSVs);
    
    logger.info("Generating graph database finished.");
}

Process:

During processing, write nodes/edges to CSV files
After processing completes, bulk import CSVs
Generate Neo4j database in output folder
Compress CSV files for archival
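
For the first step, a minimal sketch of what writing rows for bulk import might look like is given below. The header strings follow the conventions of Neo4j's bulk import tool (:ID, :LABEL, :START_ID, :END_ID, :TYPE); the column choice, class name, and quoting policy are assumptions, and IPED's actual CSV layout may differ.

```java
import java.util.StringJoiner;

/** Hypothetical CSV row formatter for a bulk import. */
final class ImportCsvSketch {

    // Header rows in the style expected by Neo4j's bulk import tool
    static final String NODE_HEADER = "nodeId:ID,:LABEL";
    static final String REL_HEADER = ":START_ID,:END_ID,:TYPE";

    /** Wraps a field in double quotes and doubles any embedded quotes. */
    static String quote(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Joins quoted fields into one CSV row. */
    static String row(String... fields) {
        StringJoiner joiner = new StringJoiner(",");
        for (String field : fields) {
            joiner.add(quote(field));
        }
        return joiner.toString();
    }
}
```

A node row would then be produced with row("+55 11 98765-4321", "PERSON") and a relationship row with row(senderId, recipientId, "message"). Quoting every field keeps commas inside names or addresses from shifting columns.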

Analysis Interface

The IPED analysis interface provides graph visualization:

Graph Viewer

Interactive visualization - Zoom, pan, drag nodes
Node coloring - By type (person, device, group)
Edge styling - By relationship type (message, call, email)
Filtering - Show/hide node and edge types
Search - Find specific entities
Expand/collapse - Show/hide related nodes

Communication Links Tab

Dedicated tab showing:

List of all communications
Sender and recipient columns
Communication type and time
Link to original evidence item
Sortable and filterable

Query Interface

Cypher query support:

// Find all contacts of person with phone
MATCH (p:PERSON {phone:"+55 11 98765-4321"})-[:message]->(c:PERSON)
RETURN c.phone, c.email, c.name

// Find communication paths between two people  
MATCH path = shortestPath(
  (p1:PERSON {phone:"+1-555-1234"})
  -[*]-(p2:PERSON {phone:"+1-555-5678"}))
RETURN path

// Find most active communicators
MATCH (p:PERSON)-[r:message]->()
RETURN p.phone, p.name, count(r) as messages
ORDER BY messages DESC
LIMIT 10

Graph Metrics

Calculate network metrics:

Degree Centrality

Number of direct connections:

Who has most contacts?
Communication hubs
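
As an illustration of the metric, the sketch below computes degree from a plain edge list, counting distinct neighbours and ignoring edge direction. It is a standalone example with hypothetical names, not a description of how IPED computes the value.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical degree computation over {from, to} pairs. */
final class DegreeCentralitySketch {

    /** Returns the number of distinct neighbours of each node, ignoring direction. */
    static Map<String, Integer> degrees(String[][] edges) {
        Map<String, Set<String>> neighbours = new HashMap<>();
        for (String[] edge : edges) {
            if (edge[0].equals(edge[1])) {
                continue; // a self-loop adds no contact
            }
            neighbours.computeIfAbsent(edge[0], k -> new HashSet<>()).add(edge[1]);
            neighbours.computeIfAbsent(edge[1], k -> new HashSet<>()).add(edge[0]);
        }
        Map<String, Integer> result = new HashMap<>();
        neighbours.forEach((node, set) -> result.put(node, set.size()));
        return result;
    }
}
```

Counting distinct neighbours rather than edges means that a thousand messages to one contact still count as a single connection, which separates hubs from merely talkative pairs.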

Betweenness Centrality

How often a node appears on the shortest paths between other nodes:

Information brokers
Critical links

Closeness Centrality

Inverse of the average shortest-path distance to all other nodes:

Central figures
Information reach
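
A minimal sketch of the computation, assuming an unweighted, undirected graph held as an adjacency map: a breadth-first search from the source gives shortest-path distances, and closeness is the number of reachable nodes divided by the sum of those distances. Names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical closeness computation using breadth-first search. */
final class ClosenessSketch {

    /** Returns (reachable nodes) / (sum of shortest-path distances), or 0 for an isolated node. */
    static double closeness(Map<String, List<String>> adjacency, String source) {
        Map<String, Integer> distance = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        distance.put(source, 0);
        queue.add(source);
        long total = 0;
        while (!queue.isEmpty()) {
            String node = queue.remove();
            int d = distance.get(node);
            total += d;
            for (String next : adjacency.getOrDefault(node, Collections.emptyList())) {
                if (!distance.containsKey(next)) {
                    distance.put(next, d + 1);
                    queue.add(next);
                }
            }
        }
        int reachable = distance.size() - 1; // exclude the source itself
        return reachable == 0 ? 0.0 : (double) reachable / total;
    }
}
```

On a three-node chain A, B, C the middle node scores 1.0 and each end scores 2/3, matching the intuition that the middle node is closest to everyone.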

Community Detection

Identify clusters:

Friend groups
Criminal cells
Organizational structure

Use Cases

Criminal Network Mapping

Identify organized crime structure:

Extract communications from all suspects
Generate graph showing connections
Identify leadership (high centrality)
Find intermediaries (high betweenness)
Detect cells/sub-groups (communities)

Person of Interest Investigation

Map associates of target:

Find target node in graph
Expand to show direct contacts
Expand contacts to 2nd degree
Identify unknown associates
Prioritize for investigation
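
The expansion steps amount to a depth-limited breadth-first search. The sketch below, with hypothetical names and an in-memory adjacency map standing in for the graph database, returns every node within a given number of hops of the target together with its hop count.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical k-hop neighbourhood expansion. */
final class NeighbourhoodSketch {

    /** Maps every node within maxHops of target to its hop count (target itself excluded). */
    static Map<String, Integer> within(Map<String, List<String>> adjacency,
                                       String target, int maxHops) {
        Map<String, Integer> hops = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        hops.put(target, 0);
        queue.add(target);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            int depth = hops.get(node);
            if (depth == maxHops) {
                continue; // do not expand beyond the requested radius
            }
            for (String next : adjacency.getOrDefault(node, Collections.emptyList())) {
                if (!hops.containsKey(next)) {
                    hops.put(next, depth + 1);
                    queue.add(next);
                }
            }
        }
        hops.remove(target);
        return hops;
    }
}
```

With maxHops set to 2, nodes at hop count 2 are the second-degree contacts: people the target never contacted directly but who share an intermediary.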

Link Analysis

Connect seemingly unrelated cases:

Combine graphs from multiple cases
Identify shared contacts
Find communication between cases
Detect coordinated activity
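
Finding shared contacts reduces to a set intersection once the identifiers of each case have been normalized the same way (same phone region, lowercase emails). A minimal sketch with hypothetical names:

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical cross-case comparison of normalized identifiers. */
final class SharedContactsSketch {

    /** Returns the identifiers that appear in both cases, sorted; inputs are left unchanged. */
    static Set<String> sharedContacts(Set<String> caseA, Set<String> caseB) {
        Set<String> shared = new TreeSet<>(caseA);
        shared.retainAll(caseB);
        return shared;
    }
}
```

The comparison is only as good as the normalization: the same number stored as a national number in one case and in international format in the other would not intersect.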

Social Engineering Detection

Identify compromised accounts:

Map normal communication patterns
Detect anomalous connections
Find unusual outreach
Identify potential phishing

Proximity Relationships

Experimental feature linking entities based on proximity in text:

if (configuration.getProcessProximityRelationships()) {
    processExtraAttributes(evidence);
}

Logic:

Extracts regex hits (emails, phones, etc.)
Finds entities appearing near each other in text
Creates relationship if within maxProximityDistance characters
Useful for unstructured data (documents, web pages)

Example:

Document mentions “Contact John at john@example.com or +1-555-1234”
Creates Person node linking email and phone
Infers name “John”
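
The proximity rule can be sketched as follows. This is an illustration under stated assumptions, not IPED's implementation: distance is taken as the number of characters between the end of one hit and the start of the other, and the class, method, and pattern choices are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical proximity linking of regex hits. */
final class ProximitySketch {

    /** A regex hit with its character offsets in the source text. */
    static final class Hit {
        final String value;
        final int start;
        final int end;

        Hit(String value, int start, int end) {
            this.value = value;
            this.start = start;
            this.end = end;
        }
    }

    /** Collects every match of pattern in text together with its position. */
    static List<Hit> hits(Pattern pattern, String text) {
        List<Hit> result = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            result.add(new Hit(m.group(), m.start(), m.end()));
        }
        return result;
    }

    /** Pairs hits from a and b separated by at most maxDistance characters. */
    static List<String[]> link(List<Hit> a, List<Hit> b, int maxDistance) {
        List<String[]> pairs = new ArrayList<>();
        for (Hit x : a) {
            for (Hit y : b) {
                // Gap between the end of the earlier hit and the start of the later one
                int gap = Math.max(0, Math.max(x.start - y.end, y.start - x.end));
                if (gap <= maxDistance) {
                    pairs.add(new String[] {x.value, y.value});
                }
            }
        }
        return pairs;
    }
}
```

Running hits with an email pattern and a phone pattern over the example sentence and passing both lists to link with a distance of 100 pairs the address with the number, because only four characters separate them. An address several hundred characters later in the same document would not be linked.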

Performance Considerations

Scalability

Tested with millions of communications
Neo4j handles large graphs efficiently
Bulk import faster than incremental
Graph queries optimized with indexes

Memory Usage

During processing:

CSV files written incrementally
Bounded memory usage
Node cache with LRU eviction
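
An LRU cache of the kind mentioned in the last point can be built on the JDK's LinkedHashMap in access-order mode. The sketch below shows the general technique; the class name and capacity handling are assumptions, and it does not reproduce IPED's cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded cache that evicts the least recently used entry. */
final class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {

    private static final long serialVersionUID = 1L;
    private final int capacity;

    LruCacheSketch(int capacity) {
        super(16, 0.75f, true); // true = iterate in access order, oldest access first
        this.capacity = capacity;
    }

    /** Called by LinkedHashMap after each insertion; returning true drops the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

With a capacity of 2, inserting a and b, reading a, and then inserting c evicts b, because a was touched more recently. This keeps memory bounded while frequently seen nodes stay cached.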

Processing Time

Graph generation:

1-2% of total processing time
Bulk import typically less than 10 minutes
Scales with number of communications

Export Options

CSV Export

Export nodes and edges as CSV:

Nodes: ID, labels, properties
Edges: source, target, type, properties
Import into external tools (Gephi, Tableau)

GraphML Export

Standard graph format:

Compatible with graph analysis tools
Preserves structure and attributes
Visualize in yEd, Cytoscape, etc.
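
For orientation, a minimal GraphML document consists of a graphml root element, a graph element, and node and edge elements. The sketch below serializes node IDs and directed edges in that shape; it is a hypothetical illustration with attributes omitted, not IPED's exporter.

```java
import java.util.List;

/** Hypothetical minimal GraphML writer. */
final class GraphMlSketch {

    /** Escapes the characters that are special inside XML attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Serializes node IDs and directed {source, target} edges as a GraphML string. */
    static String toGraphMl(List<String> nodes, List<String[]> edges) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<graphml xmlns=\"http://graphml.graphdrawing.org/xmlns\">\n");
        sb.append("  <graph id=\"G\" edgedefault=\"directed\">\n");
        for (String id : nodes) {
            sb.append("    <node id=\"").append(escape(id)).append("\"/>\n");
        }
        for (String[] edge : edges) {
            sb.append("    <edge source=\"").append(escape(edge[0]))
              .append("\" target=\"").append(escape(edge[1])).append("\"/>\n");
        }
        sb.append("  </graph>\n</graphml>\n");
        return sb.toString();
    }
}
```

Properties such as relationship type or timestamps would be added through GraphML key and data elements, which this sketch leaves out for brevity.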

Report Integration

Graph included in HTML reports:

Embedded visualization
Key metrics summary
Most active communicators
Network statistics

Best Practices

Set correct phone region - Critical for number normalization
Review merged nodes - Verify entities merged correctly
Filter noise - Exclude system communications and spam
Use communities - Identify sub-networks for focused analysis
Export for deep analysis - Use specialized tools for complex queries
Document findings - Annotate graph with investigation notes
Cross-reference evidence - Validate graph data with original items

Limitations

Requires structured metadata from parsers
Phone and email detection is not 100% accurate
Entity resolution can merge the wrong nodes
Group chats are challenging to represent
Deleted participants may be missing
Timezone handling can be a problem in global investigations
