IPED includes a Web API that allows remote access to processed cases. You can search cases, retrieve document metadata and content, manage bookmarks, and more through HTTP REST endpoints.
Overview
The Web API provides programmatic access to:
Search across one or more cases
Retrieve document properties and metadata
Download document content and text
Get thumbnails and previews
Manage bookmarks and tags
List available data sources
The Web API accesses already processed cases. You cannot trigger processing through the API.
Starting the Web API Server
The Web API server must be started separately:
java -jar iped-web-api.jar --sources sources.json --port 8080
Configuration
Create a sources.json file listing your cases:
[
  {
    "id": "case1",
    "path": "/data/cases/case1"
  },
  {
    "id": "case2",
    "path": "/data/cases/case2"
  }
]
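When many cases sit under one parent folder, the sources file can be generated instead of written by hand. The following is a minimal sketch, assuming every subdirectory of that folder is a processed case and that its directory name is acceptable as a source id. The helpers `build_sources` and `write_sources` are hypothetical and not part of IPED.

```python
import json
from pathlib import Path


def build_sources(cases_root):
    """Return one {"id", "path"} entry per subdirectory of cases_root.

    Assumption: each subdirectory is a processed case and its name
    can be used as the source id.
    """
    root = Path(cases_root)
    return [
        {"id": case_dir.name, "path": str(case_dir)}
        for case_dir in sorted(root.iterdir())
        if case_dir.is_dir()
    ]


def write_sources(cases_root, output_file="sources.json"):
    """Write the entries as a JSON list with "id" and "path" keys."""
    entries = build_sources(cases_root)
    Path(output_file).write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return entries
```

Calling `write_sources("/data/cases")` would then produce a file in the same shape as the example above.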
Command-Line Options
| Option | Description | Default |
| --- | --- | --- |
| --sources | Path to sources JSON file or URL | Required |
| --port | HTTP port to bind | 8080 |
| --host | Host address to bind | 0.0.0.0 |
API Endpoints
Base URL
All endpoints are relative to: http://localhost:8080/
Authentication
The current implementation does not include authentication. Deploy behind a reverse proxy with authentication for production use.
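Once a reverse proxy sits in front of the API, clients have to send whatever credentials that proxy expects. The sketch below assumes the proxy enforces HTTP Basic authentication, which is only one possible setup. The helper name and the credentials are hypothetical.

```python
import base64


def basic_auth_header(username, password):
    """Build an HTTP Basic Authorization header (RFC 7617)."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Hypothetical credentials, for illustration only.
headers = basic_auth_header("analyst", "s3cret")
```

The resulting dictionary can be passed as the `headers` argument of an HTTP client call. Basic credentials are only encoded, not encrypted, so the proxy should also terminate TLS.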
Search
Search Documents
Search for documents across all sources or within a specific source.
# Search all sources
curl "http://localhost:8080/search?q=password"
# Search specific source
curl "http://localhost:8080/search?q=password&sourceID=case1"
Query Parameters:
q (required) - Lucene query string
sourceID (optional) - Limit search to specific source
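The two parameters above can be assembled into a request URL with the standard library, which also handles percent-encoding. This is a small sketch and `build_search_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, source_id=None):
    """Return the /search URL for a Lucene query, optionally limited to one source."""
    params = {"q": query}
    if source_id is not None:
        params["sourceID"] = source_id
    # urlencode percent-encodes reserved characters and turns spaces into "+"
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"


print(build_search_url("http://localhost:8080", "password", "case1"))
```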
Response:
{
  "docs": [
    {
      "source": "case1",
      "id": 12345
    },
    {
      "source": "case2",
      "id": 67890
    }
  ]
}
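Because a search can span several sources, it is often convenient to regroup the returned references per source before fetching documents. A minimal sketch, assuming the response has the `docs` shape shown above (`group_by_source` is a hypothetical helper):

```python
from collections import defaultdict


def group_by_source(search_response):
    """Map each source id to the list of document ids found in it."""
    grouped = defaultdict(list)
    for ref in search_response.get("docs", []):
        grouped[ref["source"]].append(ref["id"])
    return dict(grouped)


sample = {
    "docs": [
        {"source": "case1", "id": 12345},
        {"source": "case2", "id": 67890},
    ]
}
print(group_by_source(sample))
```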
Query Syntax
IPED uses Lucene query syntax:
# Search by content
curl "http://localhost:8080/search?q=bitcoin"
# Search by file type
curl "http://localhost:8080/search?q=type:pdf"
# Search by extension
curl "http://localhost:8080/search?q=ext:docx"
# Boolean operators
curl "http://localhost:8080/search?q=password+AND+admin"
curl "http://localhost:8080/search?q=type:image+NOT+category:system"
# Wildcards
curl "http://localhost:8080/search?q=pass*"
# Phrase search
curl "http://localhost:8080/search?q=\"social+security+number\""
# Date ranges (-g stops curl from interpreting the square brackets as a URL glob)
curl -g "http://localhost:8080/search?q=modified:[2023-01-01+TO+2023-12-31]"
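Quotes, colons, and brackets are awkward to escape by hand, so queries like the ones above can be composed as plain strings and encoded in one step. The helpers below (`field`, `phrase`, `date_range`, `all_of`) are hypothetical conveniences, not part of IPED, and cover only the syntax shown in this section.

```python
from urllib.parse import quote_plus


def field(name, value):
    """Field-scoped term, for example type:pdf."""
    return f"{name}:{value}"


def phrase(text):
    """Exact phrase; embedded double quotes are backslash-escaped."""
    return '"' + text.replace('"', '\\"') + '"'


def date_range(name, start, end):
    """Inclusive range, for example modified:[2023-01-01 TO 2023-12-31]."""
    return f"{name}:[{start} TO {end}]"


def all_of(*clauses):
    """Join clauses with the AND operator."""
    return " AND ".join(clauses)


query = all_of(field("type", "pdf"), phrase("social security number"))
encoded = quote_plus(query)  # safe to place after "q=" in a URL
```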
Documents
Get Document Properties
Retrieve all metadata and properties for a document.
curl "http://localhost:8080/sources/case1/docs/12345"
Response:
{
  "source": "case1",
  "id": 12345,
  "luceneId": 98765,
  "properties": {
    "name": ["document.pdf"],
    "length": ["1048576"],
    "hash": ["abc123def456..."],
    "type": ["pdf"],
    "category": ["Documents"],
    "modified": ["2023-05-15T10:30:00Z"],
    "path": ["/evidence/docs/document.pdf"],
    "contentType": ["application/pdf"],
    "author": ["John Doe"],
    "title": ["Financial Report"]
  },
  "bookmarks": ["Important", "Financial"],
  "selected": true
}
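In the response above every property value is a list of strings, even when there is only one value. A small sketch of how a client might unwrap them, assuming that shape holds for all properties (`first_value` and `flatten_properties` are hypothetical helpers):

```python
def first_value(doc, key, default=None):
    """Return the first value of a property, or default if it is absent or empty."""
    values = doc.get("properties", {}).get(key) or []
    return values[0] if values else default


def flatten_properties(doc):
    """Collapse single-element lists to scalars; keep other lists unchanged."""
    flat = {}
    for key, values in doc.get("properties", {}).items():
        flat[key] = values[0] if len(values) == 1 else values
    return flat
```

Numeric properties such as `length` arrive as strings, so a caller would convert them explicitly, for example `int(first_value(doc, "length"))`.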
Get Document Text
Retrieve the extracted text content of a document.
curl "http://localhost:8080/sources/case1/docs/12345/text"
Response: Plain text (UTF-8)
Get Document Content
Download the raw file content.
curl "http://localhost:8080/sources/case1/docs/12345/content" -o file.pdf
Response: Binary file content with appropriate Content-Type header
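Evidence files can be large, so a client may prefer to write the content to disk in chunks and compute a digest along the way instead of holding the whole file in memory. The sketch below is transport-agnostic: `save_chunks` is a hypothetical helper that accepts any iterable of byte chunks. The algorithm behind the `hash` property is not stated here, so SHA-256 is only a placeholder default.

```python
import hashlib
from pathlib import Path


def save_chunks(chunks, output_path, algorithm="sha256"):
    """Write byte chunks to output_path and return the hex digest of the data."""
    digest = hashlib.new(algorithm)
    with Path(output_path).open("wb") as out:
        for chunk in chunks:
            if chunk:  # skip empty keep-alive chunks
                out.write(chunk)
                digest.update(chunk)
    return digest.hexdigest()
```

With the `requests` library, the chunks could come from a response opened with `stream=True`, via `response.iter_content(chunk_size=65536)`. Comparing the returned digest with the `hash` property is only meaningful if both use the same algorithm.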
Get Document Thumbnail
Retrieve a thumbnail image of the document.
curl "http://localhost:8080/sources/case1/docs/12345/thumb" -o thumb.jpg
Response: JPEG image
Sources
List Sources
Get all available data sources.
curl "http://localhost:8080/sources"
Response:
{
  "data": [
    {
      "id": "case1",
      "path": "/data/cases/case1"
    },
    {
      "id": "case2",
      "path": "/data/cases/case2"
    }
  ]
}
Get Source Details
Get information about a specific source.
curl "http://localhost:8080/sources/case1"
Response:
{
  "id": "case1",
  "path": "/data/cases/case1"
}
Add Source
Dynamically add a new data source.
curl -X POST "http://localhost:8080/sources" \
-H "Content-Type: application/json" \
-d '{"id": "case3", "path": "/data/cases/case3"}'
Bookmarks
List Bookmarks
Get all bookmark names.
curl "http://localhost:8080/bookmarks"
Response:
{
  "data": [
    "Important",
    "Suspicious",
    "Evidence",
    "Review"
  ]
}
Get Bookmark Documents
Retrieve all documents with a specific bookmark.
curl "http://localhost:8080/bookmarks/Important"
Response:
{
  "docs": [
    {
      "source": "case1",
      "id": 12345
    },
    {
      "source": "case1",
      "id": 67890
    }
  ]
}
Create Bookmark
Create a new bookmark.
curl -X POST "http://localhost:8080/bookmarks/Evidence"
Add Documents to Bookmark
Tag documents with a bookmark.
curl -X PUT "http://localhost:8080/bookmarks/Evidence/add" \
-H "Content-Type: application/json" \
-d '[
{"source": "case1", "id": 12345},
{"source": "case1", "id": 67890}
]'
Remove Documents from Bookmark
Untag documents from a bookmark.
curl -X PUT "http://localhost:8080/bookmarks/Evidence/remove" \
-H "Content-Type: application/json" \
-d '[{"source": "case1", "id": 12345}]'
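The add and remove endpoints take the same body: a JSON array of `source`/`id` pairs, which is also the shape of the `docs` entries returned by a search. A minimal sketch of turning a list of references into that body, dropping duplicates and any extra keys (`to_bookmark_payload` is a hypothetical helper):

```python
import json


def to_bookmark_payload(doc_refs):
    """Return unique {"source", "id"} pairs in their original order."""
    seen = set()
    payload = []
    for ref in doc_refs:
        key = (ref["source"], ref["id"])
        if key not in seen:
            seen.add(key)
            payload.append({"source": ref["source"], "id": ref["id"]})
    return payload


refs = [
    {"source": "case1", "id": 12345, "score": 0.9},  # extra key is dropped
    {"source": "case1", "id": 12345},                # duplicate is dropped
    {"source": "case1", "id": 67890},
]
body = json.dumps(to_bookmark_payload(refs))
```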
Delete Bookmark
Remove a bookmark entirely.
curl -X DELETE "http://localhost:8080/bookmarks/Evidence"
Rename Bookmark
Rename an existing bookmark.
curl -X PUT "http://localhost:8080/bookmarks/OldName/rename/NewName"
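Bookmark names appear in the URL path, so names containing spaces or slashes need percent-encoding before they are sent. The sketch below shows one way to build such paths. `bookmark_url` is a hypothetical helper, and whether the server accepts every encoded character (in particular an encoded slash) is an assumption to verify against a running instance.

```python
from urllib.parse import quote


def bookmark_url(base_url, name, *suffix):
    """Build a /bookmarks/... URL with each path segment percent-encoded."""
    segments = [quote(part, safe="") for part in (name, *suffix)]
    return f"{base_url.rstrip('/')}/bookmarks/" + "/".join(segments)


print(bookmark_url("http://localhost:8080", "Old Name", "rename", "New Name"))
```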
Categories
List available file categories.
curl "http://localhost:8080/categories"
Response:
{
  "data": [
    "Documents",
    "Images",
    "Videos",
    "Audio",
    "Emails",
    "Databases",
    "Executables"
  ]
}
Complete Example: Search and Export
Here’s a complete example that searches for documents and exports their text:
import os

import requests

# Configuration
API_BASE = "http://localhost:8080"
QUERY = "type:pdf AND password"
OUTPUT_DIR = "exported_docs"

# Create output directory
os.makedirs(OUTPUT_DIR, exist_ok=True)

# Search for documents
response = requests.get(f"{API_BASE}/search", params={"q": QUERY})
response.raise_for_status()  # stop early on HTTP errors
search_results = response.json()
print(f"Found {len(search_results['docs'])} documents")

# Process each document
for doc_ref in search_results["docs"]:
    source = doc_ref["source"]
    doc_id = doc_ref["id"]

    # Get document properties
    doc_response = requests.get(f"{API_BASE}/sources/{source}/docs/{doc_id}")
    doc_response.raise_for_status()
    doc = doc_response.json()
    name = doc["properties"]["name"][0]
    print(f"Processing: {name}")

    # Get document text
    text_response = requests.get(f"{API_BASE}/sources/{source}/docs/{doc_id}/text")
    text_response.raise_for_status()
    text = text_response.text

    # Save to file, replacing path separators in the document name
    safe_name = name.replace("/", "_").replace("\\", "_")
    output_path = os.path.join(OUTPUT_DIR, f"{doc_id}_{safe_name}.txt")
    file_hash = doc["properties"].get("hash", [""])[0]
    with open(output_path, "w", encoding="utf-8") as f:
        f.write(f"Source: {source}\n")
        f.write(f"ID: {doc_id}\n")
        f.write(f"Name: {name}\n")
        f.write(f"Hash: {file_hash}\n")
        f.write("\n" + "=" * 80 + "\n\n")
        f.write(text)
    print(f"  Saved to: {output_path}")

print(f"\nExport complete! Files saved to {OUTPUT_DIR}")
Integration Examples
Python Client Library
Create a reusable client:
from typing import Dict, List, Optional

import requests


class IPEDClient:
    def __init__(self, base_url: str = "http://localhost:8080"):
        self.base_url = base_url.rstrip("/")

    def search(self, query: str, source_id: Optional[str] = None) -> List[Dict]:
        """Search for documents"""
        params = {"q": query}
        if source_id:
            params["sourceID"] = source_id
        response = requests.get(f"{self.base_url}/search", params=params)
        response.raise_for_status()
        return response.json()["docs"]

    def get_document(self, source: str, doc_id: int) -> Dict:
        """Get document properties"""
        response = requests.get(f"{self.base_url}/sources/{source}/docs/{doc_id}")
        response.raise_for_status()
        return response.json()

    def get_text(self, source: str, doc_id: int) -> str:
        """Get document text"""
        response = requests.get(f"{self.base_url}/sources/{source}/docs/{doc_id}/text")
        response.raise_for_status()
        return response.text

    def get_content(self, source: str, doc_id: int) -> bytes:
        """Get document binary content"""
        response = requests.get(f"{self.base_url}/sources/{source}/docs/{doc_id}/content")
        response.raise_for_status()
        return response.content

    def list_sources(self) -> List[Dict]:
        """List all sources"""
        response = requests.get(f"{self.base_url}/sources")
        response.raise_for_status()
        return response.json()["data"]

    def create_bookmark(self, name: str):
        """Create a bookmark"""
        response = requests.post(f"{self.base_url}/bookmarks/{name}")
        response.raise_for_status()

    def add_to_bookmark(self, bookmark: str, docs: List[Dict]):
        """Add documents to bookmark"""
        response = requests.put(f"{self.base_url}/bookmarks/{bookmark}/add", json=docs)
        response.raise_for_status()


# Usage
client = IPEDClient()
results = client.search("password")
for doc in results:
    text = client.get_text(doc["source"], doc["id"])
    print(text[:100])
Swagger Documentation
The API includes Swagger/OpenAPI documentation accessible at:
http://localhost:8080/swagger-ui/
This provides an interactive interface to explore and test all endpoints.
Next Steps
Scripting: Extend IPED with custom scripts
Multicase Analysis: Analyze multiple cases together