Overview

FastAPI backend that serves videos, returns predictions, and sorts videos into folders. It exposes endpoints for listing videos, listing folders, sorting videos, and triggering model retraining.
  • Location: source/server.py
  • Server: FastAPI with Uvicorn
  • Default Port: 8000

Configuration Constants

  • DATA_DIR (Path, default: "data/Favorites/videos"): Directory containing videos and category subfolders.
  • ARTIFACTS_DIR (Path, default: "artifacts"): Directory containing predictions and trained models.
  • INDEX_HTML (Path, default: "index.html"): Path to the frontend HTML file.

Pydantic Models

SortRequest

Request body schema for sorting a video into a folder.
from pydantic import BaseModel


class SortRequest(BaseModel):
    filename: str
    folder: str
  • filename (string, required): Video filename (must match pattern \d+\.mp4, e.g., 7234567890123456.mp4)
  • folder (string, required): Target folder name (must exist as a subfolder in DATA_DIR)
Example:
{
  "filename": "7234567890123456.mp4",
  "folder": "soccer"
}

Functions

get_folders()

Returns the list of category folders with their video counts.
Returns: list[dict], one entry per folder, with this structure:
[
  {"name": "soccer", "count": 82},
  {"name": "food", "count": 55},
  {"name": "funny", "count": 5}
]
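A minimal sketch of how such a listing could be produced with pathlib. It is not the server's actual implementation: the data_dir parameter, the alphabetical ordering, and counting only .mp4 files are assumptions made for illustration.

```python
from pathlib import Path


def get_folders(data_dir: Path) -> list[dict]:
    """Return one {"name", "count"} entry per category subfolder of data_dir."""
    folders = []
    # Assumption: folders are reported in alphabetical order.
    for entry in sorted(data_dir.iterdir()):
        if not entry.is_dir():
            continue  # files in the root are unsorted videos, not categories
        # Assumption: only .mp4 files directly inside the folder are counted.
        count = sum(1 for f in entry.iterdir() if f.is_file() and f.suffix == ".mp4")
        folders.append({"name": entry.name, "count": count})
    return folders
```

Passing the directory explicitly keeps the function easy to exercise against a temporary folder; the real function takes no arguments and presumably reads DATA_DIR.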

load_predictions()

Startup event handler that loads predictions from artifacts/predictions.json into memory. Populates the global predictions dictionary mapping filename → prediction data.
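The loading step might look roughly like the sketch below. The path parameter, the behavior for a missing file, and the assumption that predictions.json is a JSON object keyed by filename are illustrative guesses, not confirmed details of server.py.

```python
import json
from pathlib import Path

# In-memory cache: filename -> prediction data.
predictions: dict[str, dict] = {}


def load_predictions(path: Path) -> None:
    """Replace the in-memory predictions with the contents of the JSON file."""
    predictions.clear()
    if not path.exists():
        return  # assumption: a missing file simply means no predictions yet
    with path.open(encoding="utf-8") as fh:
        data = json.load(fh)
    # Assumption: the file holds a JSON object keyed by video filename.
    predictions.update(data)
```

Mutating the existing dictionary in place, rather than rebinding the name, means any code that already holds a reference to predictions sees the refreshed data.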

_run_retrain()

Background function that runs the full ML pipeline in sequence:
  1. extract_features.py - Extract features from all videos
  2. train.py - Train classifier on labeled data
  3. predict.py - Generate predictions for unsorted videos
Runs in a separate thread to avoid blocking API requests and updates the retrain_status global state.
Timeout: 900 seconds (15 minutes) per script.
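As an illustration of the sequence and per-script timeout described above, here is a hedged sketch built on subprocess.run. The function name run_pipeline, the use of sys.executable, and the exact error text are assumptions; only the 900-second timeout and the "Failed at [script]: [error]" shape come from this page.

```python
import subprocess
import sys

retrain_status = {"running": False, "last_result": None}
SCRIPT_TIMEOUT = 900  # seconds per script, as documented above


def run_pipeline(scripts: list[str], timeout: int = SCRIPT_TIMEOUT) -> None:
    """Run each script in order and stop at the first failure."""
    retrain_status["running"] = True
    try:
        for script in scripts:
            try:
                # Assumption: scripts are launched with the current interpreter.
                proc = subprocess.run(
                    [sys.executable, script],
                    capture_output=True,
                    text=True,
                    timeout=timeout,
                )
            except subprocess.TimeoutExpired:
                retrain_status["last_result"] = f"Failed at {script}: timed out"
                return
            if proc.returncode != 0:
                retrain_status["last_result"] = f"Failed at {script}: {proc.stderr.strip()}"
                return
        retrain_status["last_result"] = "success"
    finally:
        # Always clear the flag, even if a script failed or timed out.
        retrain_status["running"] = False
```

A call such as run_pipeline(["extract_features.py", "train.py", "predict.py"]) would mirror the documented order; later scripts are skipped once one fails, since each step depends on the previous one's output.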

API Endpoints

GET /

Serves the main HTML interface. Response: HTML file (index.html)
curl http://localhost:8000/

GET /api/videos

Lists all unsorted videos (in root directory) with their predictions.
Response:
  • videos (array, required): Array of unsorted videos with prediction data
  • total (integer, required): Total number of unsorted videos
Video Object Schema:
  • filename (string, required): Video filename
  • predicted_folder (string | null, required): Predicted category folder, or null if no prediction available
  • confidence (float, required): Prediction confidence score (0-1)
  • top_predictions (array, required): Array of top predictions with folder names and confidence scores
Example Response:
{
  "videos": [
    {
      "filename": "7234567890123456.mp4",
      "predicted_folder": "soccer",
      "confidence": 0.87,
      "top_predictions": [
        {"folder": "soccer", "confidence": 0.87},
        {"folder": "food", "confidence": 0.10}
      ]
    },
    {
      "filename": "7234567890123457.mp4",
      "predicted_folder": null,
      "confidence": 0,
      "top_predictions": []
    }
  ],
  "total": 2
}
cURL Example:
curl http://localhost:8000/api/videos
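The response shape above could be assembled along these lines. This is a sketch under assumptions: the helper name list_unsorted is hypothetical, and it assumes the cached prediction entries use the same keys as the response (predicted_folder, confidence, top_predictions).

```python
import re
from pathlib import Path

VIDEO_NAME = re.compile(r"\d+\.mp4")


def list_unsorted(data_dir: Path, predictions: dict[str, dict]) -> dict:
    """Build the /api/videos payload from files in the root of data_dir."""
    videos = []
    for f in sorted(data_dir.iterdir()):
        # Skip category subfolders and anything that is not a numeric .mp4 name.
        if not f.is_file() or not VIDEO_NAME.fullmatch(f.name):
            continue
        pred = predictions.get(f.name, {})
        videos.append({
            "filename": f.name,
            # Videos without a cached prediction fall back to null / 0 / [].
            "predicted_folder": pred.get("predicted_folder"),
            "confidence": pred.get("confidence", 0),
            "top_predictions": pred.get("top_predictions", []),
        })
    return {"videos": videos, "total": len(videos)}
```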

GET /api/folders

Lists all category folders with video counts.
Response:
  • folders (array, required): Array of folder objects
Folder Object Schema:
  • name (string, required): Folder name
  • count (integer, required): Number of videos in this folder
Example Response:
{
  "folders": [
    {"name": "funny", "count": 5},
    {"name": "food", "count": 55},
    {"name": "soccer", "count": 82}
  ]
}
cURL Example:
curl http://localhost:8000/api/folders

POST /api/sort

Moves a video from the root directory to a category folder.
Request Body: SortRequest
Response:
  • success (boolean, required): Whether the sort operation succeeded
  • filename (string, required): Filename that was sorted
  • folder (string, required): Folder the video was moved to
  • folders (array, required): Updated list of all folders with new counts
Example Response:
{
  "success": true,
  "filename": "7234567890123456.mp4",
  "folder": "soccer",
  "folders": [
    {"name": "funny", "count": 5},
    {"name": "food", "count": 55},
    {"name": "soccer", "count": 83}
  ]
}
Error Responses:
  • 400 Bad Request - Invalid filename format or folder name
  • 404 Not Found - Video file not found (may already be sorted)
  • 409 Conflict - File already exists in target folder
cURL Example:
curl -X POST http://localhost:8000/api/sort \
  -H "Content-Type: application/json" \
  -d '{"filename": "7234567890123456.mp4", "folder": "soccer"}'
Security:
  • Validates filename matches pattern \d+\.mp4
  • Prevents path traversal (rejects folders containing .. or /)
  • Checks that destination folder exists and is a directory
  • Validates source file exists and is a file (not already moved)
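The checks listed above can be sketched as a single function. The names sort_video and SortError are hypothetical, and returning 400 for a folder that does not exist is an assumption; the endpoint itself would raise FastAPI's HTTPException rather than a custom exception.

```python
import re
import shutil
from pathlib import Path


class SortError(Exception):
    """Carries the HTTP status an endpoint would return (hypothetical helper)."""

    def __init__(self, status: int, detail: str):
        super().__init__(detail)
        self.status = status
        self.detail = detail


def sort_video(data_dir: Path, filename: str, folder: str) -> Path:
    """Validate the request, then move the video into the category folder."""
    if not re.fullmatch(r"\d+\.mp4", filename):
        raise SortError(400, "Invalid filename format")
    # Reject path traversal attempts before touching the filesystem.
    if not folder or ".." in folder or "/" in folder:
        raise SortError(400, "Invalid folder name")
    dest_dir = data_dir / folder
    if not dest_dir.is_dir():
        raise SortError(400, "Folder does not exist")  # assumption: mapped to 400
    src = data_dir / filename
    if not src.is_file():
        raise SortError(404, "Video file not found (may already be sorted)")
    dest = dest_dir / filename
    if dest.exists():
        raise SortError(409, "File already exists in target folder")
    shutil.move(str(src), str(dest))
    return dest
```

Validating the filename with fullmatch rather than match matters here: it rejects names such as 123.mp4/../x that merely start with a valid pattern.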

POST /api/retrain

Triggers the full model retraining pipeline in the background.
Request Body: None
Response:
  • status (string, required): Either "started" if retraining began, or "already_running" if a retrain is already in progress
Example Response:
{
  "status": "started"
}
cURL Example:
curl -X POST http://localhost:8000/api/retrain
Process:
  1. Checks if retraining is already in progress
  2. Launches background thread to run pipeline
  3. Returns immediately (doesn’t wait for completion)
  4. Pipeline runs: extract_features.py → train.py → predict.py
  5. Reloads predictions into memory when complete
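The start-or-reject behavior in the steps above might be implemented roughly as follows. This is a sketch, not the server's code: try_start_retrain, the lock, and the job callable are assumptions, and the real handler presumably runs the three pipeline scripts and reloads predictions inside the worker.

```python
import threading
from typing import Callable

retrain_status = {"running": False, "last_result": None}
_status_lock = threading.Lock()  # assumption: a lock guards the check-and-set


def try_start_retrain(job: Callable[[], object]) -> dict:
    """Start job in a daemon thread unless a retrain is already running."""
    with _status_lock:
        if retrain_status["running"]:
            return {"status": "already_running"}
        # Mark as running before the thread starts so a second request
        # arriving immediately afterwards is rejected.
        retrain_status["running"] = True

    def worker() -> None:
        try:
            job()
            retrain_status["last_result"] = "success"
        except Exception as exc:
            retrain_status["last_result"] = f"Failed: {exc}"
        finally:
            retrain_status["running"] = False

    threading.Thread(target=worker, daemon=True).start()
    return {"status": "started"}
```

Doing the check and the flag update under one lock avoids the window in which two near-simultaneous requests could both see running as False and launch two pipelines.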

GET /api/retrain/status

Checks the status of the retraining pipeline.
Response:
  • running (boolean, required): Whether retraining is currently in progress
  • last_result (string | null, required): Result of the last retrain attempt:
      • "success" - Completed successfully
      • "Failed at [script]: [error]" - Failed with error message
      • null - No retrain has completed yet
Example Response:
{
  "running": false,
  "last_result": "success"
}
cURL Example:
curl http://localhost:8000/api/retrain/status
Polling Pattern: The frontend can poll this endpoint every few seconds while running is true to show progress.
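For scripts or tests that need the same behavior outside the browser, a polling loop could look like the sketch below. The function names, the 3-second interval, and the max_polls cap are assumptions; only the URL and the running / last_result fields come from this page.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

STATUS_URL = "http://localhost:8000/api/retrain/status"


def fetch_status(url: str = STATUS_URL) -> dict:
    """GET the status endpoint and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def wait_for_retrain(
    fetch: Callable[[], dict] = fetch_status,
    interval: float = 3.0,
    max_polls: int = 400,
) -> Optional[str]:
    """Poll until running is false, then return last_result."""
    for _ in range(max_polls):
        status = fetch()
        if not status["running"]:
            return status["last_result"]
        time.sleep(interval)
    raise TimeoutError("retrain still running after max_polls checks")
```

The fetch argument is injectable so the loop can be tested without a live server; capping the number of polls keeps a stuck pipeline from blocking the caller forever.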

GET /videos/{filename}

Serves a video file for playback.
Path Parameters:
  • filename (string, required): Video filename (must match pattern \d+\.mp4)
Response: Video file with Content-Type: video/mp4
File Lookup:
  1. First checks root directory (unsorted videos)
  2. If not found, searches all category subfolders
  3. Returns 404 if not found anywhere
Example:
curl http://localhost:8000/videos/7234567890123456.mp4 --output video.mp4
Browser Usage:
<video src="http://localhost:8000/videos/7234567890123456.mp4" controls></video>
Error Responses:
  • 400 Bad Request - Invalid filename format
  • 404 Not Found - Video file not found
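The lookup order described for this endpoint (root first, then category subfolders) can be sketched as below. find_video is a hypothetical helper: it raises ValueError where the endpoint would return 400 and returns None where it would return 404.

```python
import re
from pathlib import Path
from typing import Optional


def find_video(data_dir: Path, filename: str) -> Optional[Path]:
    """Locate a video in the root of data_dir or in any category subfolder."""
    if not re.fullmatch(r"\d+\.mp4", filename):
        raise ValueError("invalid filename format")  # endpoint: 400
    # 1. Unsorted videos live directly in the root directory.
    candidate = data_dir / filename
    if candidate.is_file():
        return candidate
    # 2. Otherwise search each category subfolder.
    for sub in sorted(data_dir.iterdir()):
        if sub.is_dir() and (sub / filename).is_file():
            return sub / filename
    # 3. Not found anywhere (endpoint: 404).
    return None
```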

Running the Server

Development

python server.py
Starts Uvicorn server on http://0.0.0.0:8000

Production

uvicorn server:app --host 0.0.0.0 --port 8000 --workers 4

With Auto-Reload

uvicorn server:app --reload

State Management

The server maintains two global state variables:

predictions

predictions: dict[str, dict] = {}
Dictionary mapping video filenames to prediction data. Loaded from artifacts/predictions.json at startup and after retraining.

retrain_status

retrain_status: dict = {
    "running": False,
    "last_result": None
}
Tracks retraining pipeline status:
  • running: Boolean indicating if retrain is in progress
  • last_result: String with success/failure message from last retrain

Dependencies

Required Python packages:
  • fastapi - Web framework
  • uvicorn - ASGI server
  • pydantic - Request/response validation
The server also depends on the ML pipeline scripts:
  • extract_features.py
  • train.py
  • predict.py

Architecture Notes

  • Thread Safety: The retraining pipeline runs in a daemon thread to avoid blocking API requests. Only one retrain can run at a time.
  • File System: All file moves use shutil.move(), which is an atomic rename only when source and destination are on the same filesystem; across filesystems it falls back to copying and then deleting the source. Path traversal is prevented through validation.
  • Caching: Predictions are cached in memory and only reloaded after successful retraining.
  • Error Handling: Invalid filenames, missing files, and folder conflicts return the 400, 404, and 409 status codes respectively.
