IPED includes a Web API that allows remote access to processed cases. You can search cases, retrieve document metadata and content, manage bookmarks, and more through HTTP REST endpoints.

Overview

The Web API provides programmatic access to:
  • Search across one or more cases
  • Retrieve document properties and metadata
  • Download document content and text
  • Get thumbnails and previews
  • Manage bookmarks and tags
  • List available data sources
The Web API accesses already processed cases. You cannot trigger processing through the API.

Starting the Web API Server

The Web API server must be started separately:
java -jar iped-web-api.jar --sources sources.json --port 8080

Configuration

Create a sources.json file listing your cases:
sources.json
[
  {
    "id": "case1",
    "path": "/data/cases/case1"
  },
  {
    "id": "case2",
    "path": "/data/cases/case2"
  }
]
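
When many cases sit under one parent directory, the file can be generated instead of written by hand. The sketch below is one possible approach, assuming every immediate subdirectory is a processed case and that its folder name is acceptable as the source id. `build_sources` and `write_sources` are hypothetical helper names, not part of IPED.

```python
import json
from pathlib import Path


def build_sources(cases_root):
    """Return one {"id", "path"} entry per subdirectory of cases_root.

    Assumption: every immediate subdirectory is a processed case and its
    folder name is usable as the source id.
    """
    root = Path(cases_root)
    return [
        {"id": case_dir.name, "path": str(case_dir)}
        for case_dir in sorted(root.iterdir())
        if case_dir.is_dir()
    ]


def write_sources(cases_root, output_file="sources.json"):
    """Write the entries in the same list-of-objects layout as sources.json."""
    entries = build_sources(cases_root)
    Path(output_file).write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return entries
```

Calling `write_sources("/data/cases")` would produce a file with the same shape as the hand-written example, with one entry per case folder.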

Command-Line Options

Option      Description                        Default
--sources   Path to sources JSON file or URL   Required
--port      HTTP port to bind                  8080
--host      Host address to bind               0.0.0.0

API Endpoints

Base URL

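With the default options, all endpoints are relative to http://localhost:8080/. Adjust the host and port to match the --host and --port values used at startup.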

Authentication

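The current implementation does not include authentication. Because the default --host value of 0.0.0.0 listens on every network interface, deploy the server behind a reverse proxy that enforces authentication for production use.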

Search Documents

Search for documents across all sources or within a specific source.
# Search all sources
curl "http://localhost:8080/search?q=password"

# Search specific source
curl "http://localhost:8080/search?q=password&sourceID=case1"
Query Parameters:
  • q (required) - Lucene query string
  • sourceID (optional) - Limit search to specific source
Response:
{
  "docs": [
    {
      "source": "case1",
      "id": 12345
    },
    {
      "source": "case2",
      "id": 67890
    }
  ]
}
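
Each entry in `docs` is only a reference (source plus id), so clients working with multiple cases often need to regroup the results. The following is a minimal sketch of that step, using the sample response as input. `group_by_source` is a hypothetical helper name.

```python
from collections import defaultdict


def group_by_source(search_response):
    """Map each source id to the list of document ids found in it."""
    grouped = defaultdict(list)
    for ref in search_response.get("docs", []):
        grouped[ref["source"]].append(ref["id"])
    return dict(grouped)


# Sample payload copied from the response shown in this section.
sample = {
    "docs": [
        {"source": "case1", "id": 12345},
        {"source": "case2", "id": 67890},
    ]
}
print(group_by_source(sample))
```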

Query Syntax

IPED uses Lucene query syntax:
# Search by content
curl "http://localhost:8080/search?q=bitcoin"

# Search by file type
curl "http://localhost:8080/search?q=type:pdf"

# Search by extension
curl "http://localhost:8080/search?q=ext:docx"

# Boolean operators
curl "http://localhost:8080/search?q=password+AND+admin"
curl "http://localhost:8080/search?q=type:image+NOT+category:system"

# Wildcards
curl "http://localhost:8080/search?q=pass*"

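# Phrase search (let curl URL-encode the quotes and spaces)
curl -G "http://localhost:8080/search" --data-urlencode 'q="social security number"'

# Date ranges (square brackets placed directly in the URL would trigger curl's URL globbing)
curl -G "http://localhost:8080/search" --data-urlencode "q=modified:[2023-01-01 TO 2023-12-31]"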
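
Hand-encoding Lucene queries is error-prone because quotes, spaces, colons and square brackets all need escaping in a URL. As an illustration, the standard library can do the encoding. `search_url` is a hypothetical helper, and the default base URL simply mirrors the one used throughout this page.

```python
from urllib.parse import urlencode


def search_url(query, source_id=None, base_url="http://localhost:8080"):
    """Build a /search URL with the query string percent-encoded."""
    params = {"q": query}
    if source_id is not None:
        params["sourceID"] = source_id
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"


# The raw Lucene query is written naturally; urlencode handles escaping.
print(search_url('"social security number"'))
print(search_url("modified:[2023-01-01 TO 2023-12-31]", source_id="case1"))
```

Clients built on `requests` get the same encoding by passing the query through the `params` argument, as the Python examples later on this page do.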

Documents

Get Document Properties

Retrieve all metadata and properties for a document.
curl "http://localhost:8080/sources/case1/docs/12345"
Response:
{
  "source": "case1",
  "id": 12345,
  "luceneId": 98765,
  "properties": {
    "name": ["document.pdf"],
    "length": ["1048576"],
    "hash": ["abc123def456..."],
    "type": ["pdf"],
    "category": ["Documents"],
    "modified": ["2023-05-15T10:30:00Z"],
    "path": ["/evidence/docs/document.pdf"],
    "contentType": ["application/pdf"],
    "author": ["John Doe"],
    "title": ["Financial Report"]
  },
  "bookmarks": ["Important", "Financial"],
  "selected": true
}
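
In the sample response every property value is a list of strings, even for single-valued fields such as `length`. A client will usually want to unwrap and convert these. The sketch below shows one way, assuming real responses follow the same shape as the sample. `first_values` is a hypothetical helper name.

```python
def first_values(document):
    """Collapse each property's list of values to its first value."""
    return {
        key: values[0]
        for key, values in document.get("properties", {}).items()
        if values
    }


# Trimmed copy of the sample response from this section.
sample = {
    "source": "case1",
    "id": 12345,
    "properties": {
        "name": ["document.pdf"],
        "length": ["1048576"],
        "modified": ["2023-05-15T10:30:00Z"],
    },
}

props = first_values(sample)
size_bytes = int(props["length"])  # values arrive as strings in the sample
print(props["name"], size_bytes)
```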

Get Document Text

Retrieve the extracted text content of a document.
curl "http://localhost:8080/sources/case1/docs/12345/text"
Response: Plain text (UTF-8)

Get Document Content

Download the raw file content.
curl "http://localhost:8080/sources/case1/docs/12345/content" -o file.pdf
Response: Binary file content with appropriate Content-Type header

Get Document Thumbnail

Retrieve a thumbnail image of the document.
curl "http://localhost:8080/sources/case1/docs/12345/thumb" -o thumb.jpg
Response: JPEG image

Sources

List Sources

Get all available data sources.
curl "http://localhost:8080/sources"
Response:
{
  "data": [
    {
      "id": "case1",
      "path": "/data/cases/case1"
    },
    {
      "id": "case2",
      "path": "/data/cases/case2"
    }
  ]
}

Get Source Details

Get information about a specific source.
curl "http://localhost:8080/sources/case1"
Response:
{
  "id": "case1",
  "path": "/data/cases/case1"
}

Add Source

Dynamically add a new data source.
curl -X POST "http://localhost:8080/sources" \
  -H "Content-Type: application/json" \
  -d '{"id": "case3", "path": "/data/cases/case3"}'

Bookmarks

List Bookmarks

Get all bookmark names.
curl "http://localhost:8080/bookmarks"
Response:
{
  "data": [
    "Important",
    "Suspicious",
    "Evidence",
    "Review"
  ]
}

Get Bookmark Documents

Retrieve all documents with a specific bookmark.
curl "http://localhost:8080/bookmarks/Important"
Response:
{
  "docs": [
    {
      "source": "case1",
      "id": 12345
    },
    {
      "source": "case1",
      "id": 67890
    }
  ]
}

Create Bookmark

Create a new bookmark.
curl -X POST "http://localhost:8080/bookmarks/Evidence"

Add Documents to Bookmark

Tag documents with a bookmark.
curl -X PUT "http://localhost:8080/bookmarks/Evidence/add" \
  -H "Content-Type: application/json" \
  -d '[
    {"source": "case1", "id": 12345},
    {"source": "case1", "id": 67890}
  ]'
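
The request body has the same shape as the `docs` list returned by a search, so search results can be fed straight into a bookmark. The following sketch builds the URL and JSON body without sending anything. `bookmark_add_request` is a hypothetical helper, and it assumes the server decodes percent-encoded bookmark names in the path.

```python
import json
from urllib.parse import quote


def bookmark_add_request(bookmark, doc_refs, base_url="http://localhost:8080"):
    """Return (url, body) for PUT /bookmarks/{name}/add.

    doc_refs is any iterable of dicts with "source" and "id" keys, such as
    the "docs" list of a search response. Extra keys are dropped.
    """
    # Assumption: the server decodes percent-encoded path segments, so
    # spaces or slashes in the name stay within a single segment.
    url = f"{base_url.rstrip('/')}/bookmarks/{quote(bookmark, safe='')}/add"
    body = json.dumps([{"source": d["source"], "id": d["id"]} for d in doc_refs])
    return url, body


url, body = bookmark_add_request(
    "Needs Review", [{"source": "case1", "id": 12345}]
)
print(url)
print(body)
```

The returned pair can then be sent with any HTTP client using the PUT method and a `Content-Type: application/json` header.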

Remove Documents from Bookmark

Untag documents from a bookmark.
curl -X PUT "http://localhost:8080/bookmarks/Evidence/remove" \
  -H "Content-Type: application/json" \
  -d '[{"source": "case1", "id": 12345}]'

Delete Bookmark

Remove a bookmark entirely.
curl -X DELETE "http://localhost:8080/bookmarks/Evidence"

Rename Bookmark

Rename an existing bookmark.
curl -X PUT "http://localhost:8080/bookmarks/OldName/rename/NewName"

Categories

List available file categories.
curl "http://localhost:8080/categories"
Response:
{
  "data": [
    "Documents",
    "Images",
    "Videos",
    "Audio",
    "Emails",
    "Databases",
    "Executables"
  ]
}

Complete Example: Search and Export

The following Python script, which uses the requests library, searches for documents and exports their extracted text to local files:
import requests
import os

# Configuration
API_BASE = "http://localhost:8080"
QUERY = "type:pdf AND password"
OUTPUT_DIR = "exported_docs"

# Create output directory
os.makedirs(OUTPUT_DIR, exist_ok=True)

# Search for documents
response = requests.get(
    f"{API_BASE}/search",
    params={"q": QUERY}
)
response.raise_for_status()  # stop early on HTTP errors
search_results = response.json()

print(f"Found {len(search_results['docs'])} documents")

# Process each document
for doc_ref in search_results['docs']:
    source = doc_ref['source']
    doc_id = doc_ref['id']
    
    # Get document properties
    doc_response = requests.get(
        f"{API_BASE}/sources/{source}/docs/{doc_id}"
    )
    doc_response.raise_for_status()
    doc = doc_response.json()
    
    # Fall back to the document ID if no name property is present
    name = doc['properties'].get('name', [str(doc_id)])[0]
    print(f"Processing: {name}")
    
    # Get document text
    text_response = requests.get(
        f"{API_BASE}/sources/{source}/docs/{doc_id}/text"
    )
    text_response.raise_for_status()
    text = text_response.text
    
    # Save to file
    safe_name = name.replace("/", "_").replace("\\", "_")
    output_path = os.path.join(OUTPUT_DIR, f"{doc_id}_{safe_name}.txt")
    
    with open(output_path, "w", encoding="utf-8") as f:
        f.write(f"Source: {source}\n")
        f.write(f"ID: {doc_id}\n")
        f.write(f"Name: {name}\n")
        f.write(f"Hash: {doc['properties'].get('hash', [''])[0]}\n")
        f.write("\n" + "="*80 + "\n\n")
        f.write(text)
    
    print(f"  Saved to: {output_path}")

print(f"\nExport complete! Files saved to {OUTPUT_DIR}")
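
The script above only replaces slashes when deriving a file name, yet document names taken from evidence can contain other characters that are invalid on some file systems. A more defensive variant might look like the sketch below. `safe_filename` is a hypothetical helper and the character set is an assumption chosen to cover common Windows and POSIX restrictions.

```python
import re


def safe_filename(name, max_length=150):
    """Reduce a document name to characters that are safe in a file name."""
    # Replace path separators, control characters and Windows-reserved symbols.
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', "_", name).strip(" .")
    # Truncate long names and never return an empty string.
    return cleaned[:max_length] or "unnamed"


print(safe_filename("reports/2023: Q1?.pdf"))
```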

Integration Examples

Python Client Library

Create a reusable client:
iped_client.py
import requests
from typing import List, Dict, Optional

class IPEDClient:
    def __init__(self, base_url: str = "http://localhost:8080"):
        self.base_url = base_url.rstrip("/")
    
    def search(self, query: str, source_id: Optional[str] = None) -> List[Dict]:
        """Search for documents"""
        params = {"q": query}
        if source_id:
            params["sourceID"] = source_id
        
        response = requests.get(f"{self.base_url}/search", params=params)
        response.raise_for_status()
        return response.json()["docs"]
    
    def get_document(self, source: str, doc_id: int) -> Dict:
        """Get document properties"""
        response = requests.get(
            f"{self.base_url}/sources/{source}/docs/{doc_id}"
        )
        response.raise_for_status()
        return response.json()
    
    def get_text(self, source: str, doc_id: int) -> str:
        """Get document text"""
        response = requests.get(
            f"{self.base_url}/sources/{source}/docs/{doc_id}/text"
        )
        response.raise_for_status()
        return response.text
    
    def get_content(self, source: str, doc_id: int) -> bytes:
        """Get document binary content"""
        response = requests.get(
            f"{self.base_url}/sources/{source}/docs/{doc_id}/content"
        )
        response.raise_for_status()
        return response.content
    
    def list_sources(self) -> List[Dict]:
        """List all sources"""
        response = requests.get(f"{self.base_url}/sources")
        response.raise_for_status()
        return response.json()["data"]
    
    def create_bookmark(self, name: str):
        """Create a bookmark"""
        # Percent-encode the name so spaces or slashes do not break the URL path
        encoded = requests.utils.quote(name, safe="")
        response = requests.post(f"{self.base_url}/bookmarks/{encoded}")
        response.raise_for_status()
    
    def add_to_bookmark(self, bookmark: str, docs: List[Dict]):
        """Add documents to bookmark"""
        encoded = requests.utils.quote(bookmark, safe="")
        response = requests.put(
            f"{self.base_url}/bookmarks/{encoded}/add",
            json=docs
        )
        response.raise_for_status()

# Usage (guarded so that importing iped_client does not send requests)
if __name__ == "__main__":
    client = IPEDClient()
    results = client.search("password")
    for doc in results:
        text = client.get_text(doc["source"], doc["id"])
        print(text[:100])

Swagger Documentation

The API includes Swagger/OpenAPI documentation accessible at:
http://localhost:8080/swagger-ui/
This provides an interactive interface to explore and test all endpoints.

Next Steps

  • Scripting: Extend IPED with custom scripts
  • Multicase Analysis: Analyze multiple cases together
