Overview

The Prediction API is a FastAPI service that provides real-time and batch prediction endpoints for student purchase probability. The service loads trained model artifacts, applies feature engineering, and returns predictions with configurable thresholds.

Architecture

  • Framework: FastAPI
  • Model Loading: joblib artifacts loaded at startup
  • Feature Engineering: Runtime feature transformation via FeatureConfig
  • Monitoring: In-memory tracking of predictions for drift detection
  • Logging: JSON Lines prediction log for audit trails
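The in-memory drift tracking mentioned above can be sketched as a bounded buffer of recent prediction probabilities whose mean is compared against a baseline value. The class and method names below are illustrative, not the service's actual implementation:

```python
from collections import deque
from statistics import mean

class PredictionTracker:
    """Minimal in-memory tracker: keeps the last `window` probabilities
    so a drift check can compare their mean against a baseline value."""

    def __init__(self, window: int = 500):
        self.recent = deque(maxlen=window)

    def record(self, probability: float) -> None:
        self.recent.append(probability)

    def mean_shift(self, baseline_mean: float) -> float:
        """Absolute difference between the recent mean and the baseline mean."""
        if not self.recent:
            return 0.0
        return abs(mean(self.recent) - baseline_mean)
```

A bounded deque keeps memory constant regardless of traffic; older predictions fall out of the window automatically.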

Request/Response Schemas

PredictRequest

class PredictRequest(BaseModel):
    student_country: str = Field(..., min_length=2, max_length=64)
    days_on_platform: int = Field(..., ge=0)
    minutes_watched: float = Field(..., ge=0)
    courses_started: int = Field(..., ge=0)
    practice_exams_started: int = Field(..., ge=0)
    practice_exams_passed: int = Field(..., ge=0)
    minutes_spent_on_exams: float = Field(..., ge=0)

PredictResponse

class PredictResponse(BaseModel):
    predicted_purchase_probability: float
    predicted_purchase: int

BatchPredictRequest

class BatchPredictRequest(BaseModel):
    records: List[PredictRequest] = Field(..., min_length=1)

BatchPredictResponse

class BatchPredictResponse(BaseModel):
    predictions: List[PredictResponse]

API Endpoints

POST /predict

Single prediction endpoint. Request Example:
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{
    "student_country": "US",
    "days_on_platform": 45,
    "minutes_watched": 1200.5,
    "courses_started": 3,
    "practice_exams_started": 5,
    "practice_exams_passed": 4,
    "minutes_spent_on_exams": 180.0
  }'
Response Example:
{
  "predicted_purchase_probability": 0.7234,
  "predicted_purchase": 1
}
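The binary `predicted_purchase` field is derived by comparing the model's probability to the configured threshold. A minimal sketch (whether the comparison is `>=` or strictly `>` is an assumption):

```python
def apply_threshold(probability: float, threshold: float) -> int:
    """Map a purchase probability to a binary label: 1 if the probability
    meets or exceeds the decision threshold, else 0."""
    return int(probability >= threshold)
```

With the threshold 0.482 shown in the logging example below, a probability of 0.7234 yields `predicted_purchase = 1`.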

POST /batch_predict

Batch prediction endpoint for multiple records. Request Example:
curl -X POST http://localhost:8000/batch_predict \
  -H "Content-Type: application/json" \
  -d '{
    "records": [
      {
        "student_country": "US",
        "days_on_platform": 45,
        "minutes_watched": 1200.5,
        "courses_started": 3,
        "practice_exams_started": 5,
        "practice_exams_passed": 4,
        "minutes_spent_on_exams": 180.0
      },
      {
        "student_country": "CA",
        "days_on_platform": 30,
        "minutes_watched": 800.0,
        "courses_started": 2,
        "practice_exams_started": 3,
        "practice_exams_passed": 2,
        "minutes_spent_on_exams": 120.0
      }
    ]
  }'
Response Example:
{
  "predictions": [
    {
      "predicted_purchase_probability": 0.7234,
      "predicted_purchase": 1
    },
    {
      "predicted_purchase_probability": 0.4512,
      "predicted_purchase": 0
    }
  ]
}

GET /health

Health check endpoint. Response Schema:
class HealthResponse(BaseModel):
    ready: bool
    predictor_loaded: bool
    drift_baseline_loaded: bool
Example:
curl http://localhost:8000/health
{
  "ready": true,
  "predictor_loaded": true,
  "drift_baseline_loaded": true
}

Artifact Loading

The service loads artifacts during startup via the load_artifacts() function (src/api.py:195):
  1. Model: Loaded from artifacts/purchase_model.pkl using joblib
  2. Threshold: Read from artifacts/threshold.txt as float
  3. Feature Config: Built from config.yaml with epsilon and engagement weights
  4. Drift Baseline: Loaded from artifacts/drift_baseline.json (optional)
Artifacts must be generated by running training first:
python -m src.train

Feature Engineering

The API applies runtime feature engineering using add_engineered_features() from src/features.py. The FeatureConfig includes:
  • epsilon: Small constant to avoid division by zero
  • minutes_watched_weight: Weight for engagement score
  • days_on_platform_weight: Weight for engagement score
  • courses_started_weight: Weight for engagement score
Engineered features are computed from raw input before inference.
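A sketch of what such runtime feature engineering might look like. The derived feature names (`exam_pass_rate`, `engagement_score`) and formulas are assumptions for illustration; the actual `add_engineered_features()` in src/features.py may compute different features:

```python
from dataclasses import dataclass

@dataclass
class FeatureConfig:
    epsilon: float = 1e-6
    minutes_watched_weight: float = 1.0
    days_on_platform_weight: float = 1.0
    courses_started_weight: float = 1.0

def add_engineered_features(row: dict, cfg: FeatureConfig) -> dict:
    """Derive features from raw inputs; illustrative formulas only."""
    out = dict(row)
    # Pass rate guarded by epsilon to avoid division by zero
    # when no practice exams have been started.
    out["exam_pass_rate"] = row["practice_exams_passed"] / (
        row["practice_exams_started"] + cfg.epsilon
    )
    # Weighted engagement score over the three configured signals.
    out["engagement_score"] = (
        cfg.minutes_watched_weight * row["minutes_watched"]
        + cfg.days_on_platform_weight * row["days_on_platform"]
        + cfg.courses_started_weight * row["courses_started"]
    )
    return out
```

Applying the same transformation at training and serving time keeps the model's input schema consistent between the two paths.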

Validation

The service validates that practice_exams_passed cannot exceed practice_exams_started (src/api.py:243-248). Invalid requests return HTTP 422.
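The cross-field rule boils down to a simple consistency check. A standalone sketch of the logic (the actual implementation at src/api.py:243-248 converts the failure into an HTTP 422 response):

```python
def check_exam_counts(practice_exams_started: int, practice_exams_passed: int) -> None:
    """A student cannot pass more practice exams than they started;
    raise ValueError when the counts are inconsistent."""
    if practice_exams_passed > practice_exams_started:
        raise ValueError("practice_exams_passed cannot exceed practice_exams_started")
```

Rejecting impossible inputs at the boundary prevents the model from scoring feature combinations it never saw in training.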

Prediction Logging

All predictions are logged to artifacts/prediction_log.jsonl in JSON Lines format:
{
  "timestamp_utc": "2026-03-04T10:15:30.123456+00:00",
  "threshold": 0.482,
  "predicted_purchase_probability": 0.7234,
  "predicted_purchase": 1,
  "features": {
    "student_country": "US",
    "days_on_platform": 45,
    "minutes_watched": 1200.5,
    "courses_started": 3,
    "practice_exams_started": 5,
    "practice_exams_passed": 4,
    "minutes_spent_on_exams": 180.0
  }
}
Log path is configurable via config.yaml under monitoring.prediction_log_file.
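Appending one JSON object per line is all the JSON Lines format requires. A minimal sketch of such a logger (the function name and exact field ordering are assumptions):

```python
import json
from datetime import datetime, timezone

def log_prediction(path, threshold: float, probability: float,
                   label: int, features: dict) -> None:
    """Append one prediction record as a single line of JSON."""
    record = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "threshold": threshold,
        "predicted_purchase_probability": probability,
        "predicted_purchase": label,
        "features": features,
    }
    # Append mode keeps earlier records intact; one record per line.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

Because each line is a complete JSON document, the log can be tailed, grepped, or streamed into analysis tools without parsing the whole file.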

Starting the Service

Start the API using uvicorn:
uvicorn src.api:app --host 0.0.0.0 --port 8000
With auto-reload for development:
uvicorn src.api:app --reload --host 0.0.0.0 --port 8000

Error Handling

  • 503 Service Unavailable: Model artifacts not loaded
  • 422 Unprocessable Entity: Validation errors (e.g., invalid field values)
  • 500 Internal Server Error: Unexpected errors during prediction