Upload Endpoint

POST /upload

Upload and process forensic artifacts with streaming I/O, including files of 6 GB and larger. The endpoint automatically routes each file to a specialized parser based on file extension and artifact type.

Request Format

Content-Type: multipart/form-data

file

required

Forensic artifact file. Supports: EVTX, MFT, CSV, XLSX, JSON, Parquet, SQLite, Plist, ZIP, PSList, TXT, LOG

artifact_type

string

required

Type of forensic artifact. Used for specialized parsing. Supported values:

evtx: Windows Event Logs
mft: NTFS Master File Table
csv: Generic CSV/TSV
xlsx: Excel spreadsheet
json: JSON/JSONL/NDJSON
sqlite: SQLite database
plist: macOS Property List
generic: Auto-detect format

case_id

string

Case identifier for multi-file investigations. Registers file in case database.

phase_id

string

Investigation phase ID (e.g., "initial_triage", "deep_dive"). Groups files by analysis stage.
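
As a rough sketch of how a client might assemble the non-file form fields described above before sending them: the helper name `build_upload_fields` is hypothetical, and the accepted `artifact_type` values are copied from the list above. It assumes `case_id` and `phase_id` are optional, since only `file` and `artifact_type` are marked required.

```python
from typing import Dict, Optional

# artifact_type values listed in the request format above.
SUPPORTED_ARTIFACT_TYPES = {
    "evtx", "mft", "csv", "xlsx", "json", "sqlite", "plist", "generic",
}


def build_upload_fields(
    artifact_type: str,
    case_id: Optional[str] = None,
    phase_id: Optional[str] = None,
) -> Dict[str, str]:
    """Return the non-file form fields for POST /upload (hypothetical helper)."""
    if artifact_type not in SUPPORTED_ARTIFACT_TYPES:
        raise ValueError(f"Unsupported artifact_type: {artifact_type!r}")
    fields = {"artifact_type": artifact_type}
    # case_id and phase_id are assumed optional, so send them only when set.
    if case_id is not None:
        fields["case_id"] = case_id
    if phase_id is not None:
        fields["phase_id"] = phase_id
    return fields
```

Validating `artifact_type` on the client avoids streaming a multi-gigabyte file only to have the request rejected.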

Response Format

status

string

required

Always "success" on successful upload

message

string

Human-readable status message

data_url

string

API endpoint to query processed data: /api/data/{csv_filename}

csv_filename

string

Internal CSV filename for normalized timeline data

xlsx_filename

string

Excel export filename (only for forensic artifacts like EVTX/MFT)

processed_records

integer

Number of timeline events extracted. Set to the string "Unknown" when normalization falls back to a raw copy (see Memory Exhaustion).

file_category

string

Classification: "forensic" (EVTX/MFT) or "generic" (CSV/JSON)

original_filename

string

Original uploaded filename

file_id

string

Database ID if case_id was provided

chain_of_custody

object

Cryptographic verification metadata

sha256

string

SHA256 hash computed during streaming upload (zero extra I/O)

file_size_bytes

integer

Exact file size in bytes

original_filename

string

Original filename for audit trail

Streaming Upload Architecture

Chronos-DFIR streams uploads in fixed-size chunks to handle large files without memory exhaustion:

Chunked upload: File read in 8KB chunks
Simultaneous hashing: SHA256 computed during upload (no extra read)
Disk write: Direct write to chronos_uploads/ directory
Lazy parsing: Files are scanned (not loaded) using Polars scan_csv()

A 6GB EVTX file is processed with ~200MB peak RAM usage.

Examples

curl -X POST http://localhost:8000/upload \
  -F "file=@Security.evtx" \
  -F "artifact_type=evtx" \
  -F "case_id=CASE-2024-001" \
  -F "phase_id=initial_triage"

Response Examples

{
  "status": "success",
  "message": "File processed successfully",
  "data_url": "/api/data/Security_evtx_1704067200.csv",
  "processed_records": 42084,
  "csv_filename": "Security_evtx_1704067200.csv",
  "xlsx_filename": "Security_evtx_1704067200.xlsx",
  "original_filename": "Security.evtx",
  "file_category": "forensic",
  "file_id": "file_abc123",
  "chain_of_custody": {
    "sha256": "a3f5b2c1d4e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b1c2d3e4f5a6b7c8d9e0f1a2",
    "file_size_bytes": 6442450944,
    "original_filename": "Security.evtx"
  }
}
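
A client can use the `chain_of_custody` object to confirm that the server received exactly the bytes it sent. The following is a minimal sketch, assuming `response` is the parsed JSON body shown above; the function name `verify_chain_of_custody` is hypothetical.

```python
import hashlib


def verify_chain_of_custody(local_path: str, response: dict, chunk_size: int = 8192) -> bool:
    """Re-hash the local file and compare it with the server's chain_of_custody."""
    custody = response["chain_of_custody"]
    digest = hashlib.sha256()
    size = 0
    # Read in chunks so large artifacts are never held in memory at once.
    with open(local_path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            digest.update(chunk)
            size += len(chunk)
    return (
        digest.hexdigest() == custody["sha256"]
        and size == custody["file_size_bytes"]
    )
```

A `False` result means the local copy and the uploaded copy differ in content or length, and the upload should be repeated or investigated.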

Artifact Type Routing

Generic Artifacts

Files with these extensions are processed as generic tabular data:

.csv, .tsv → Polars scan_csv
.xlsx → Polars read_excel
.json, .jsonl, .ndjson → Polars read_json
.parquet → Polars scan_parquet
.db, .sqlite, .sqlite3 → SQLite cursor + Polars DataFrame
.plist → plistlib + Polars DataFrame
.pslist, .txt, .log → Whitespace regex parser
.zip → Automatic extraction

Processing:

Ingest using format-specific parser
Normalize column names (remove special characters)
Detect time columns (hierarchy: Time > timestamp > datetime)
Save to chronos_output/import_{filename}_{timestamp}.csv
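
The normalization and time-column steps above could look roughly like the sketch below. Both function names are hypothetical. It also assumes that special characters are replaced by underscores (the text above only says they are removed) and that time columns are matched by exact, case-sensitive name.

```python
import re
from typing import Iterable, Optional

# Priority order taken from the hierarchy above: Time > timestamp > datetime.
TIME_COLUMN_PRIORITY = ("Time", "timestamp", "datetime")


def normalize_column(name: str) -> str:
    """Collapse each run of non-alphanumeric characters into one underscore."""
    return re.sub(r"[^0-9A-Za-z]+", "_", name).strip("_")


def detect_time_column(columns: Iterable[str]) -> Optional[str]:
    """Return the highest-priority time column that is present, or None."""
    present = set(columns)
    for candidate in TIME_COLUMN_PRIORITY:
        if candidate in present:
            return candidate
    return None
```

For example, a file with both `datetime` and `timestamp` columns would use `timestamp`, because it ranks higher in the hierarchy.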

Forensic Artifacts

Files with specialized forensic formats:

.evtx → Windows Event Logs (uses evtx_dump + timeline_skill)
.mft → NTFS Master File Table (binary parser with $STANDARD_INFORMATION)

Processing:

Call generate_unified_timeline() from timeline_skill.py
Parse binary structures (EVTX XML, MFT records)
Extract timestamps (Creation, Modification, Access, Entry Modified)
Generate CSV + XLSX outputs
Return forensic-grade metadata

MFT parsing uses the real FILETIME values from the $STANDARD_INFORMATION attribute and never fabricates timestamps.
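
For readers unfamiliar with FILETIME: it is a 64-bit count of 100-nanosecond intervals since 1601-01-01 00:00:00 UTC. The conversion below is a generic sketch of that arithmetic, not the project's parser; `filetime_to_datetime` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond intervals since 1601-01-01 00:00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a 64-bit FILETIME value to a timezone-aware UTC datetime."""
    # Integer-divide down to microseconds, the finest resolution datetime supports.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)
```

Note that the final 100 ns digit is truncated, because Python's `datetime` stores microseconds at best.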

Chain of Custody

The upload endpoint computes the SHA256 hash during the streaming upload, in the same pass that writes the file to disk, so no extra read is needed:

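import hashlib

# `file` is the uploaded file object and `file_path` the destination path;
# both are supplied by the surrounding upload handler.
sha256 = hashlib.sha256()
with open(file_path, "wb") as buffer:
    while chunk := file.file.read(8192):  # Read 8 KB at a time
        sha256.update(chunk)  # Hash during upload
        buffer.write(chunk)   # Write to disk
file_hash = sha256.hexdigest()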

Benefits:

No extra read pass (hashing and writing share a single pass over the data)
Forensic integrity (tamper detection)
Audit trail (original filename + size + hash)
Support for court admissibility (cryptographic verification)
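
To try the single-pass technique outside the endpoint, the loop can be wrapped in a standalone function. This is a sketch under stated assumptions: `stream_to_disk` is a hypothetical name, and `source` stands in for whatever binary stream the upload handler receives.

```python
import hashlib
from typing import BinaryIO, Tuple


def stream_to_disk(source: BinaryIO, dest_path: str, chunk_size: int = 8192) -> Tuple[str, int]:
    """Copy source to dest_path in chunks; return (sha256 hex digest, byte count)."""
    digest = hashlib.sha256()
    total = 0
    with open(dest_path, "wb") as buffer:
        while chunk := source.read(chunk_size):
            digest.update(chunk)  # Hash the chunk while it is in memory
            buffer.write(chunk)   # Then write the same chunk to disk
            total += len(chunk)
    return digest.hexdigest(), total
```

Only one chunk is held in memory at a time, so peak memory for this step does not grow with file size.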

Case Management

When case_id is provided, the file is registered in the case database (DuckDB):

INSERT INTO case_files (
  case_id, phase_id, original_filename, processed_filename,
  sha256, file_size, file_category, row_count, upload_timestamp
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)

Schema:

case_id: Investigation identifier
phase_id: Analysis phase (triage, deep_dive, reporting)
sha256: Chain of custody hash
file_category: forensic or generic
row_count: Number of timeline events

Multi-file correlation (cross-source pivoting) will use case_id in future releases.
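
The registration step can be tried locally with the sketch below. It uses Python's built-in `sqlite3` as a stand-in for DuckDB so that it runs without extra dependencies. The column types and the `register_file` helper are assumptions; only the column names and the INSERT statement come from the text above.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
# Column types are assumed; the documentation only lists the column names.
conn.execute("""
    CREATE TABLE case_files (
        case_id TEXT, phase_id TEXT, original_filename TEXT,
        processed_filename TEXT, sha256 TEXT, file_size INTEGER,
        file_category TEXT, row_count INTEGER, upload_timestamp TEXT
    )
""")


def register_file(conn, case_id, phase_id, original_filename,
                  processed_filename, sha256, file_size, file_category, row_count):
    """Insert one uploaded file into case_files (hypothetical helper)."""
    conn.execute(
        """INSERT INTO case_files (
             case_id, phase_id, original_filename, processed_filename,
             sha256, file_size, file_category, row_count, upload_timestamp
           ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)""",
        (case_id, phase_id, original_filename, processed_filename, sha256,
         file_size, file_category, row_count,
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


# Values taken from the example response earlier on this page.
register_file(
    conn, "CASE-2024-001", "initial_triage", "Security.evtx",
    "Security_evtx_1704067200.csv",
    "a3f5b2c1d4e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b1c2d3e4f5a6b7c8d9e0f1a2",
    6442450944, "forensic", 42084,
)
```

Parameterized placeholders keep filenames and identifiers out of the SQL text itself, which matters when filenames come from untrusted uploads.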

Error Handling

Memory Exhaustion

If normalization fails due to OOM:

Raw file is copied to output directory
Response includes processed_records: "Unknown"
File is still queryable (lazy evaluation)
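
Because `processed_records` is normally an integer but becomes the string "Unknown" in this fallback case, clients may want to read it defensively. A minimal sketch, with the hypothetical helper name `record_count`:

```python
from typing import Optional


def record_count(response: dict) -> Optional[int]:
    """Return processed_records as an int, or None if it is missing or "Unknown"."""
    value = response.get("processed_records")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    return None
```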

Parsing Failures

If artifact parsing fails:

Error logged to console
Raw file copied as fallback
HTTP 500 returned with error details

Unsupported Formats

If file extension is not recognized:

{
  "error": "Unsupported file format: .docx"
}
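
The extension check behind this error can be sketched as follows. The extension sets are copied from the Artifact Type Routing section; the function name `classify_upload`, the lowercase matching, and the shape of the success value are assumptions for illustration.

```python
from pathlib import Path

# Extensions listed under Generic Artifacts and Forensic Artifacts.
GENERIC_EXTENSIONS = {
    ".csv", ".tsv", ".xlsx", ".json", ".jsonl", ".ndjson", ".parquet",
    ".db", ".sqlite", ".sqlite3", ".plist", ".pslist", ".txt", ".log", ".zip",
}
FORENSIC_EXTENSIONS = {".evtx", ".mft"}


def classify_upload(filename: str) -> dict:
    """Return the file category, or an error payload for unknown extensions."""
    ext = Path(filename).suffix.lower()  # Case-insensitive match is an assumption
    if ext in FORENSIC_EXTENSIONS:
        return {"file_category": "forensic"}
    if ext in GENERIC_EXTENSIONS:
        return {"file_category": "generic"}
    return {"error": f"Unsupported file format: {ext}"}
```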

Performance Benchmarks

File Size	Format	Upload Time	Peak RAM	Processed Records
6.0 GB	EVTX	4m 12s	210 MB	850,000
2.5 GB	MFT	1m 45s	180 MB	1,200,000
500 MB	CSV	22s	90 MB	2,000,000
1.2 GB	JSON	38s	150 MB	450,000

Tested on Apple M4 Pro with 48GB RAM
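
To compare rows, the table can be turned into throughput figures. The snippet below is only arithmetic on the numbers above. It assumes 1 GB = 1024 MB and treats the Upload Time column as the full elapsed time; neither is stated in the table.

```python
from typing import Tuple


def throughput(size_mb: float, seconds: float, records: int) -> Tuple[float, float]:
    """Return (MB per second, records per second) for one benchmark row."""
    return size_mb / seconds, records / seconds


# Rows copied from the table above: (format, size in MB, seconds, records).
ROWS = [
    ("EVTX", 6.0 * 1024, 4 * 60 + 12, 850_000),
    ("MFT", 2.5 * 1024, 1 * 60 + 45, 1_200_000),
    ("CSV", 500, 22, 2_000_000),
    ("JSON", 1.2 * 1024, 38, 450_000),
]

for fmt, size_mb, seconds, records in ROWS:
    mb_s, rec_s = throughput(size_mb, seconds, records)
    print(f"{fmt}: {mb_s:.1f} MB/s, {rec_s:,.0f} records/s")
```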

Next Steps

After uploading a file:

Query data: Use /api/data/{csv_filename} endpoint
Generate histogram: Call /api/histogram/{csv_filename}
Run forensic report: POST to /api/forensic_report
Export results: Use /api/export_filtered for CSV/XLSX/JSON
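
As a convenience, the follow-up URLs can be derived from the `csv_filename` returned by the upload. This sketch only builds the URLs; the helper name `follow_up_urls` is hypothetical, and the request bodies and parameters for each endpoint are not covered here.

```python
from urllib.parse import quote, urljoin


def follow_up_urls(base_url: str, csv_filename: str) -> dict:
    """Build the follow-up endpoint URLs for an uploaded file."""
    name = quote(csv_filename)  # Percent-encode characters unsafe in a URL path
    return {
        "data": urljoin(base_url, f"/api/data/{name}"),
        "histogram": urljoin(base_url, f"/api/histogram/{name}"),
        "forensic_report": urljoin(base_url, "/api/forensic_report"),
        "export_filtered": urljoin(base_url, "/api/export_filtered"),
    }
```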

