The Search API enables powerful querying of evidence items using Lucene’s query syntax with IPED-specific enhancements.

IPEDSearcher Class

Package: iped.engine.search
Source: iped-engine/src/main/java/iped/engine/search/IPEDSearcher.java
The IPEDSearcher class is the primary interface for searching IPED cases.

Creating a Searcher

IPEDSearcher(IPEDSource ipedCase)
constructor
Creates a searcher for the specified case without a predefined query
IPEDSearcher(IPEDSource ipedCase, String query)
constructor
Creates a searcher with a query string. The query is parsed using IPED’s query syntax.
IPEDSearcher(IPEDSource ipedCase, Query query)
constructor
Creates a searcher with a Lucene Query object

Basic Search Example

import java.io.File;

import iped.engine.data.IPEDSource;
import iped.engine.search.IPEDSearcher;
import iped.search.SearchResult;
import iped.data.IItem;

File caseDir = new File("/path/to/case");
IPEDSource source = new IPEDSource(caseDir);

// Create searcher with query
IPEDSearcher searcher = new IPEDSearcher(source, "type:pdf");

// Execute search
SearchResult result = searcher.search();

System.out.println("Found " + result.getLength() + " items");

// Iterate results
for (int id : result.getIds()) {
    IItem item = source.getItemByID(id);
    System.out.println(item.getName());
}

source.close();

Query Methods

setQuery(String queryText)
void
Sets the query from a text string. Throws RuntimeException wrapping ParseException or QueryNodeException if the query syntax is invalid.
setQuery(Query query)
void
Sets a Lucene Query object directly
getQuery()
Query
Returns the current Lucene Query object
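
Because setQuery(String) reports syntax problems as a RuntimeException that wraps the real parser exception, callers usually need to look at the cause chain. The sketch below is a plain-Java helper for that; CauseFinder and findCause are hypothetical names, not part of IPED, and the usage note uses java.text.ParseException purely as a stand-in for the wrapped exception type.

```java
// Hypothetical helper (not part of IPED): finds a wrapped exception by type.
final class CauseFinder {

    // Walks the cause chain, starting with t itself, and returns the first
    // throwable that is an instance of the requested type, or null if none is.
    static <T extends Throwable> T findCause(Throwable t, Class<T> type) {
        int depth = 0;
        // The depth limit guards against unusually long or cyclic cause chains.
        for (Throwable cur = t; cur != null && depth < 32; cur = cur.getCause(), depth++) {
            if (type.isInstance(cur)) {
                return type.cast(cur);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Stand-in for what a failed setQuery call might throw.
        RuntimeException wrapped =
                new RuntimeException(new java.text.ParseException("unbalanced parenthesis", 7));
        java.text.ParseException cause = findCause(wrapped, java.text.ParseException.class);
        System.out.println(cause != null ? "syntax error: " + cause.getMessage() : "other failure");
    }
}
```

In real code the second argument would be the ParseException or QueryNodeException class named in the setQuery description.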

Query Configuration

setTreeQuery(boolean treeQuery)
void
If false (default), excludes tree nodes from results. Tree nodes are internal structural items.
setNoScoring(boolean noScore)
void
If true, disables relevance scoring for faster performance. Automatically enabled for result sets larger than 1 million items.
setRewritequery(boolean rewriteQuery)
void
If true (default), rewrites queries for optimization

Example: Advanced Configuration

IPEDSearcher searcher = new IPEDSearcher(source);

// Configure search behavior
searcher.setQuery("content:confidential");
searcher.setNoScoring(true);  // Faster for large result sets
searcher.setTreeQuery(false); // Exclude tree nodes

SearchResult result = searcher.search();

Executing Searches

search()
SearchResult
Executes the search on a single IPEDSource and returns results. Throws IOException on index access errors.
multiSearch()
IMultiSearchResult
Executes the search on an IPEDMultiSource (multiple cases) and returns combined results. Throws IOException on index access errors.
cancel()
void
Cancels an ongoing search operation

Multi-Case Search Example

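import java.io.File;
import java.util.ArrayList;
import java.util.List;

import iped.data.IIPEDSource;
import iped.data.IItem;
import iped.data.IItemId;
import iped.engine.data.IPEDMultiSource;
import iped.engine.data.IPEDSource;
import iped.engine.search.IPEDSearcher;
import iped.search.IMultiSearchResult;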

// Create multi-source from multiple cases
List<IIPEDSource> sources = new ArrayList<>();
sources.add(new IPEDSource(new File("/case1")));
sources.add(new IPEDSource(new File("/case2")));

IPEDMultiSource multiSource = new IPEDMultiSource(sources);

// Search across all cases
IPEDSearcher searcher = new IPEDSearcher(multiSource, "bitcoin");
IMultiSearchResult result = searcher.multiSearch();

System.out.println("Found " + result.getLength() + " items across cases");

// Iterate with source information
for (IItemId itemId : result.getIterator()) {
    int sourceId = itemId.getSourceId();
    int id = itemId.getId();
    IItem item = multiSource.getAtomicSourceBySourceId(sourceId).getItemByID(id);
    System.out.println("Source " + sourceId + ": " + item.getName());
}
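
When results span several cases, a per-source tally is often the first thing worth looking at. The sketch below counts hits per source ID using plain Java collections. HitsPerSource is a hypothetical helper, and the int array stands in for the values that IItemId.getSourceId() would return while iterating a multi-case result.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of IPED): tallies hits per source ID.
final class HitsPerSource {

    // sourceIds holds one entry per hit; in real code each value would come
    // from IItemId.getSourceId() inside the result loop.
    static Map<Integer, Integer> countBySource(int[] sourceIds) {
        Map<Integer, Integer> counts = new TreeMap<>(); // sorted by source ID
        for (int sourceId : sourceIds) {
            counts.merge(sourceId, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] sampleSourceIds = {0, 1, 0, 0, 1}; // made-up data for illustration
        countBySource(sampleSourceIds)
                .forEach((source, hits) -> System.out.println("Source " + source + ": " + hits));
    }
}
```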

SearchResult Interface

Package: iped.search

Result Methods

getLength()
int
Returns the number of items in the search result
getIds()
int[]
Returns an array of item IDs in the result set
getItem(int index)
IItemId
Returns the IItemId at the specified index in the result list
getIterator()
Iterable<IItemId>
Returns an Iterable over the result set, suitable for use in a for-each loop

Example: Processing Results

SearchResult result = searcher.search();

// Get result count
int count = result.getLength();
System.out.println("Total results: " + count);

// Get all IDs at once
int[] ids = result.getIds();
for (int id : ids) {
    processItem(source.getItemByID(id));
}

// Or iterate
for (IItemId itemId : result.getIterator()) {
    IItem item = source.getItemByID(itemId.getId());
    processItem(item);
}

Query Syntax

IPED uses Lucene query syntax with field-specific searches.

Basic Syntax

Query                  Description
password               Search for term in content
name:report.pdf        Search in specific field
"exact phrase"         Phrase search
pass*                  Wildcard search
password AND secret    Boolean AND
pdf OR doc             Boolean OR
password NOT public    Boolean NOT
pass?ord               Single character wildcard
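
The operators in the table above combine by plain string concatenation, so longer queries can be assembled from small pieces. The following is a minimal sketch of that idea. QueryStrings and its methods are hypothetical helpers, not part of IPED, and they do no escaping of their inputs.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of IPED): assembles query strings from parts.
final class QueryStrings {

    // Field-qualified term, for example type:pdf
    static String term(String field, String value) {
        return field + ":" + value;
    }

    // Field-qualified phrase, for example content:"trade secret"
    static String phrase(String field, String words) {
        return field + ":\"" + words + "\"";
    }

    // Parenthesized group whose clauses are joined with OR
    static String anyOf(List<String> clauses) {
        return clauses.stream().collect(Collectors.joining(" OR ", "(", ")"));
    }

    // Clauses joined with AND
    static String allOf(List<String> clauses) {
        return String.join(" AND ", clauses);
    }

    public static void main(String[] args) {
        String types = anyOf(List.of(term("type", "pdf"), term("type", "doc")));
        System.out.println(allOf(List.of(types, term("content", "confidential"))));
    }
}
```

The resulting string can be passed to the IPEDSearcher constructor or to setQuery like any hand-written query.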

Field Names

Common indexed fields:
name
String
Filename
path
String
Full file path
type
String
Detected file type extension
ext
String
Original file extension
category
String
Item category (images, videos, documents, etc.)
content
String
Extracted text content (default field if no field specified)
length
Long
File size in bytes
hash
String
Hash value (MD5, SHA-1, SHA-256, etc.)
deleted
Boolean
Whether file is deleted
carved
Boolean
Whether file was recovered by carving

Query Examples

// Search by file type
IPEDSearcher searcher = new IPEDSearcher(source, "type:pdf");

// Search by category
searcher.setQuery("category:images");

// Complex boolean query
searcher.setQuery("(type:pdf OR type:doc) AND content:confidential");

// Find deleted files
searcher.setQuery("deleted:true");

// Find large files (> 100MB)
searcher.setQuery("length:[104857600 TO *]");

// Find files by hash
searcher.setQuery("hash:d41d8cd98f00b204e9800998ecf8427e");

// Wildcard filename search
searcher.setQuery("name:*.exe");

// Phrase search in content
searcher.setQuery("content:\"social security number\"");

// Date range (if indexed)
searcher.setQuery("modificationDate:[2023-01-01 TO 2023-12-31]");

// Find carved images
searcher.setQuery("carved:true AND category:images");

Range Queries

// Numeric range
searcher.setQuery("length:[1000 TO 5000]");

// Open-ended range
searcher.setQuery("length:[10485760 TO *]"); // Files >= 10MB

// Date range
searcher.setQuery("modificationDate:[2023-01-01 TO 2023-12-31]");
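
The byte counts in these ranges are easy to get wrong by hand (10 MB is 10 × 1024 × 1024 = 10485760 bytes). As a rough sketch, the range strings can be generated instead. RangeQueries is a hypothetical helper, not part of IPED, and the field names passed in are the ones used in the examples above.

```java
import java.time.LocalDate;

// Hypothetical helper (not part of IPED): builds range query strings.
final class RangeQueries {

    static final long BYTES_PER_MB = 1024L * 1024L;

    // Inclusive lower bound with an open upper bound: field:[min TO *]
    static String atLeast(String field, long min) {
        return field + ":[" + min + " TO *]";
    }

    // Inclusive numeric range: field:[min TO max]
    static String between(String field, long min, long max) {
        return field + ":[" + min + " TO " + max + "]";
    }

    // Inclusive date range; LocalDate.toString() yields ISO yyyy-MM-dd.
    static String dateBetween(String field, LocalDate from, LocalDate to) {
        return field + ":[" + from + " TO " + to + "]";
    }

    public static void main(String[] args) {
        System.out.println(atLeast("length", 10 * BYTES_PER_MB));
        System.out.println(between("length", 1000, 5000));
        System.out.println(dateBetween("modificationDate",
                LocalDate.of(2023, 1, 1), LocalDate.of(2023, 12, 31)));
    }
}
```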

Wildcards and Regex

// Wildcard
searcher.setQuery("name:report*.pdf");

// Single character wildcard
searcher.setQuery("name:file?.txt");

// Regular expression (use /regex/)
searcher.setQuery("name:/report[0-9]{4}\\.pdf/");
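
A regular expression can be sanity-checked against sample filenames before it goes into a query. The sketch below uses java.util.regex for this, which is only an approximation: Lucene has its own regex dialect, although this particular pattern uses constructs common to both, and Lucene regex queries match against whole indexed terms, which Pattern.matches() mimics. How IPED tokenizes the name field is not covered here, so results against a real index may differ. RegexCheck is a hypothetical name.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of IPED): tries a pattern on sample names.
final class RegexCheck {

    // Same expression as in the query above, without the surrounding slashes.
    static final Pattern REPORT = Pattern.compile("report[0-9]{4}\\.pdf");

    // matches() succeeds only if the entire input matches the pattern.
    static boolean matchesWholeName(String name) {
        return REPORT.matcher(name).matches();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"report2023.pdf", "report23.pdf", "old_report2023.pdf"}) {
            System.out.println(name + " -> " + matchesWholeName(name));
        }
    }
}
```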

Boosting and Proximity

// Boost a term: "password" is weighted more heavily than "secret"
searcher.setQuery("password^2 secret");

// Proximity search: both words within 10 positions of each other
searcher.setQuery("\"social security\"~10");

Advanced Search Patterns

Finding Suspicious Files

public List<IItem> findSuspiciousFiles(IPEDSource source) throws Exception {
    // Encrypted executables
    String query = "type:exe AND (content:encrypted OR category:encrypted)";
    
    IPEDSearcher searcher = new IPEDSearcher(source, query);
    searcher.setNoScoring(true);
    SearchResult result = searcher.search();
    
    List<IItem> items = new ArrayList<>();
    for (int id : result.getIds()) {
        items.add(source.getItemByID(id));
    }
    return items;
}

Filtering by Multiple Criteria

public SearchResult findDocumentsWithKeywords(IPEDSource source) throws Exception {
    // Documents containing specific keywords
    String query = "(type:pdf OR type:doc OR type:docx) AND " +
                   "(content:confidential OR content:\"trade secret\" OR content:proprietary)";
    
    IPEDSearcher searcher = new IPEDSearcher(source, query);
    return searcher.search();
}

Excluding Items

// Find all images except JPEGs
String nonJpegImages = "category:images NOT type:jpeg";

// Find non-deleted files
String liveFiles = "deleted:false";

// Find files not in system folders
String outsideSystem32 = "NOT path:*Windows\\\\System32*";

Performance Optimization

Large Result Sets

When a search returns more than 1 million items (MAX_SIZE_TO_SCORE = 1000000), scoring is automatically disabled to improve performance.
IPEDSearcher searcher = new IPEDSearcher(source, "*");
searcher.setNoScoring(true); // Manually disable scoring for faster results
SearchResult result = searcher.search();
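
The rule described above can be written as a single condition. The sketch below restates the documented behavior only and is not taken from IPED's implementation. ScoringRule and shouldScore are hypothetical names, and the constant repeats the value quoted in the text.

```java
// Hypothetical restatement of the documented rule (not IPED's actual code).
final class ScoringRule {

    static final int MAX_SIZE_TO_SCORE = 1_000_000; // value quoted in the text

    // Results are scored only if scoring was not disabled manually and the
    // result set does not exceed the threshold.
    static boolean shouldScore(boolean noScoring, int resultCount) {
        return !noScoring && resultCount <= MAX_SIZE_TO_SCORE;
    }

    public static void main(String[] args) {
        System.out.println(shouldScore(false, 250_000));   // small set, scoring left on
        System.out.println(shouldScore(false, 1_000_001)); // above the threshold
        System.out.println(shouldScore(true, 250_000));    // disabled via setNoScoring(true)
    }
}
```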

Sorted Results

Use the constructor that accepts a sort field:
// Sort by field
IPEDSearcher searcher = new IPEDSearcher(
    source, 
    "category:documents", 
    "modificationDate" // Sort field
);

SearchResult result = searcher.search();

Reusing a Searcher

// Reuse searcher for multiple executions
IPEDSearcher searcher = new IPEDSearcher(source);

searcher.setQuery("type:pdf");
SearchResult pdfs = searcher.search();

searcher.setQuery("type:doc");
SearchResult docs = searcher.search();

QueryBuilder Class

Package: iped.engine.search
Source: iped-engine/src/main/java/iped/engine/search/QueryBuilder.java
The QueryBuilder class provides programmatic query construction.

Creating Queries

QueryBuilder(IIPEDSource ipedCase)
constructor
Creates a query builder for the specified case
getQuery(String queryText)
Query
Parses a query string and returns a Lucene Query object

Programmatic Query Building

import iped.engine.search.QueryBuilder;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.BooleanClause.Occur;

QueryBuilder builder = new QueryBuilder(source);

// Parse query string
Query q1 = builder.getQuery("type:pdf");

// Build query programmatically
BooleanQuery.Builder boolQuery = new BooleanQuery.Builder();
boolQuery.add(builder.getQuery("category:documents"), Occur.MUST);
boolQuery.add(builder.getQuery("deleted:true"), Occur.MUST);
Query complexQuery = boolQuery.build();

IPEDSearcher searcher = new IPEDSearcher(source, complexQuery);
SearchResult result = searcher.search();

Escaping Special Characters

import iped.engine.search.QueryBuilder;

// Escape special characters in user input
String userInput = "file[1].txt";
String escaped = QueryBuilder.escape(userInput);

String query = "name:" + escaped;
IPEDSearcher searcher = new IPEDSearcher(source, query);
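
The exact set of characters QueryBuilder.escape() handles is not listed here. To show what escaping does to a string, the sketch below is a plain-Java stand-in that assumes the same backslash-escaping Lucene's classic query parser applies to its special characters. EscapeSketch is a hypothetical name, and the real IPED method may behave differently.

```java
// Stand-in for illustration only; prefer QueryBuilder.escape() in real code.
final class EscapeSketch {

    // Characters treated as query syntax by Lucene's classic query parser.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    // Prefixes every special character with a backslash so that it is
    // interpreted as literal text rather than as an operator.
    static String escape(String input) {
        StringBuilder sb = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("name:" + escape("file[1].txt"));
    }
}
```

Without escaping, the brackets in file[1].txt would be read as the start of a range expression instead of as part of the filename.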

Error Handling

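import iped.exception.ParseException;
import iped.exception.QueryNodeException;
import java.io.IOException;

try {
    // Assumes the String constructor reports syntax errors the same way
    // setQuery(String) does: a RuntimeException wrapping the parser exception
    IPEDSearcher searcher = new IPEDSearcher(source, "type:pdf");
    SearchResult result = searcher.search();

    for (int id : result.getIds()) {
        IItem item = source.getItemByID(id);
        processItem(item);
    }

} catch (IOException e) {
    System.err.println("Index access error: " + e.getMessage());
} catch (RuntimeException e) {
    Throwable cause = e.getCause();
    if (cause instanceof ParseException || cause instanceof QueryNodeException) {
        System.err.println("Invalid query syntax: " + cause.getMessage());
    } else {
        throw e; // not a query syntax problem
    }
}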

Best Practices

Use Specific Fields: Query specific fields (e.g., name:report.pdf) instead of full-text searches when possible for better performance.
Disable Scoring: Use setNoScoring(true) when you only need to filter items and don’t need relevance ranking.
Wildcard Performance: Leading wildcards (e.g., *word) are expensive. Avoid them when possible or use them with other constraints.
Escape User Input: Always escape special characters in user-provided query strings using QueryBuilder.escape().
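
The wildcard advice can be enforced with a quick check before a query is run. The sketch below is a rough heuristic, not a query parser: it splits on whitespace, drops a leading field prefix, and flags any term that starts with * or ?. WildcardCheck is a hypothetical helper, not part of IPED, and it ignores quoting and escaped characters.

```java
// Hypothetical helper (not part of IPED): flags leading wildcards.
final class WildcardCheck {

    // Returns true if any whitespace-separated clause contains a term that
    // begins with a wildcard, for example name:*.exe or *word.
    static boolean hasLeadingWildcard(String query) {
        for (String clause : query.trim().split("\\s+")) {
            // Drop everything up to the first ':' (the "field:" prefix, plus
            // any opening parenthesis before it); keep the clause if no ':'.
            String term = clause.substring(clause.indexOf(':') + 1);
            if (term.startsWith("*") || term.startsWith("?")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] samples = {"name:*.exe", "name:report*.pdf", "pass* AND (name:?ile.txt)"};
        for (String q : samples) {
            System.out.println(q + " -> " + hasLeadingWildcard(q));
        }
    }
}
```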
