Overview
The monitoring system tracks prediction patterns in real time to detect feature distribution drift and prediction rate shifts. When drift exceeds configured thresholds, the system recommends retraining the model.
Architecture
- In-Memory Tracking: Cumulative feature sums and prediction counts
- Thread-Safe: Lock-protected updates during concurrent requests
- Baseline Comparison: Z-score computation against training distribution
- Dual Triggers: Feature drift and prediction rate shift detection
Drift Detection Endpoint
GET /monitoring/drift
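An illustrative response body for this endpoint. All field names here are assumptions inferred from the states and thresholds described later in this document, not a confirmed schema:

```json
{
  "status": "feature_distribution_drift",
  "samples_observed": 250,
  "drifted_features": ["page_views"],
  "predicted_positive_rate": 0.31,
  "training_positive_rate": 0.18,
  "should_retrain": true
}
```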
Returns the current drift status and a retraining recommendation.
Drift Detection Logic
Implemented in _compute_drift_status() (src/api.py:91-172).
Step 1: Load Baseline
The drift baseline is loaded from artifacts/drift_baseline.json, which is generated by the training pipeline (python -m src.train).
Step 2: Compute Current Statistics
For each prediction, the API updates in-memory feature sums and prediction counts (src/api.py:256-260).
Step 3: Calculate Z-Scores
For each feature, the absolute Z-score of the observed mean against the training baseline is computed (src/api.py:130-141). Features with abs_z >= z_threshold are flagged as drifted.
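A minimal sketch of the per-feature check. Variable names are illustrative, not the actual src/api.py code:

```python
def feature_drifted(current_mean: float, baseline_mean: float,
                    baseline_std: float, z_threshold: float = 3.0) -> bool:
    """Flag a feature as drifted when its observed mean deviates from the
    training mean by at least z_threshold baseline standard deviations."""
    eps = 1e-12  # guard against zero variance in the baseline
    abs_z = abs(current_mean - baseline_mean) / max(baseline_std, eps)
    return abs_z >= z_threshold
```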
Step 4: Compute Prediction Rate Shift
The observed positive-prediction rate is compared with the training positive rate; the absolute difference (class_shift) measures the shift.
Step 5: Determine Retraining Need
Retraining is triggered if (src/api.py:146-161):
- Insufficient samples: wait until at least drift_min_samples predictions are observed before evaluating
- Feature drift: len(drifted_features) >= drift_min_features
- Prediction rate shift: class_shift >= class_rate_shift_threshold
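Putting the conditions above together, the decision can be sketched as follows. Function and parameter names are assumptions mirroring the parameters described in the Configuration section, not the actual src/api.py code:

```python
def should_retrain(samples_observed: int,
                   drifted_features: list,
                   class_shift: float,
                   drift_min_samples: int = 100,
                   drift_min_features: int = 2,
                   class_rate_shift_threshold: float = 0.10) -> bool:
    """Sketch of the retraining decision logic described above."""
    if samples_observed < drift_min_samples:
        return False  # insufficient samples: keep waiting
    feature_drift = len(drifted_features) >= drift_min_features
    rate_shift = class_shift >= class_rate_shift_threshold
    return feature_drift or rate_shift
```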
Configuration
Monitoring parameters are set in config.yaml:
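A hedged sketch of the relevant config.yaml section. The `monitoring:` nesting is an assumption; the key names and defaults match the parameters listed below:

```yaml
monitoring:
  drift_min_samples: 100
  drift_zscore_threshold: 3.0
  drift_min_features: 2
  class_rate_shift_threshold: 0.10
  prediction_log_file: artifacts/prediction_log.jsonl
```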
Parameters
- drift_min_samples (default: 100): Minimum predictions before drift detection
- drift_zscore_threshold (default: 3.0): Z-score threshold for feature drift
- drift_min_features (default: 2): Minimum drifted features to trigger retraining
- class_rate_shift_threshold (default: 0.10): Absolute prediction rate change threshold
- prediction_log_file: Path to JSON Lines prediction log
Drift Baseline Structure
artifacts/drift_baseline.json contains:
- Generated by: the training pipeline (python -m src.train)
- Statistics computed: mean, std, min, and max for all numeric features in the training set
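An illustrative shape for the baseline file. The exact keys and the feature names are assumptions; the per-feature statistics match those listed above:

```json
{
  "features": {
    "session_length": {"mean": 4.2, "std": 1.1, "min": 0.0, "max": 12.0},
    "page_views": {"mean": 8.7, "std": 3.4, "min": 1.0, "max": 40.0}
  },
  "training_positive_rate": 0.18
}
```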
Monitoring States
Baseline Not Loaded
drift_baseline.json missing or failed to load
Action: Run training to generate baseline
No Predictions Observed
No predictions have been received via /predict or /batch_predict
Insufficient Samples
samples_observed < drift_min_samples
Action: Continue sending predictions until minimum threshold reached
Below Threshold
Drift metrics computed, but no trigger condition met
Feature Distribution Drift
len(drifted_features) >= drift_min_features
Action: Trigger retraining pipeline
Prediction Rate Shift
abs(predicted_positive_rate - training_positive_rate) >= class_rate_shift_threshold
Action: Trigger retraining pipeline
Prediction Logging
All predictions are logged to artifacts/prediction_log.jsonl (src/api.py:175-192) with the following fields:
- timestamp_utc: ISO 8601 timestamp with timezone
- threshold: Model decision threshold
- predicted_purchase_probability: Raw probability score
- predicted_purchase: Binary prediction (0 or 1)
- features: Raw input features
Use cases:
- Offline drift analysis
- Model debugging
- Audit trails
- Retraining data collection
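For offline analysis, the JSON Lines log can be read line by line. A minimal helper, assuming the path and fields described above:

```python
import json

def load_predictions(path="artifacts/prediction_log.jsonl"):
    """Read the JSON Lines prediction log into a list of dicts,
    skipping blank lines."""
    records = []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records
```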
Retraining Workflow
- Monitor drift: Poll /monitoring/drift periodically
- Detect trigger: should_retrain: true in the response
- Collect samples: Read prediction_log.jsonl for recent predictions
- Retrain model: Run training pipeline with updated data
- Generate baseline: New drift_baseline.json created
- Reload API: Restart service or implement hot-reload
- Reset monitoring: In-memory stats cleared on restart
Alerting Integration
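One way to wire this up, sketched with only the Python standard library. The URL and the should_retrain field are assumptions based on this document, and poll_forever is a hypothetical helper:

```python
import json
import time
import urllib.request

# Assumed host/port; adjust to your deployment.
DRIFT_URL = "http://localhost:8000/monitoring/drift"

def should_alert(status: dict) -> bool:
    """Page when the endpoint recommends retraining."""
    return bool(status.get("should_retrain"))

def poll_forever(url: str = DRIFT_URL, interval_s: int = 1800) -> None:
    """Poll the drift endpoint every 30 minutes and alert on triggers."""
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if should_alert(json.load(resp)):
                    print("ALERT: drift detected, retraining recommended")
        except OSError as exc:
            print(f"drift endpoint unreachable: {exc}")
        time.sleep(interval_s)
```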
Integrate the drift endpoint with any external monitoring or alerting system that can poll HTTP endpoints.
Thread Safety
Monitoring state is protected by a thread lock (src/api.py:79).
Limitations
- In-memory state: Monitoring resets on API restart (not persistent)
- Global baseline: Single baseline for all models (no A/B test support)
- Cumulative tracking: No sliding window or time-based decay
- Simple Z-score: Assumes Gaussian distributions
- No covariate shift detection: Only marginal feature distributions tracked
Best Practices
- Set appropriate thresholds: Tune drift_zscore_threshold based on false positive rate
- Monitor continuously: Poll /monitoring/drift every 15-30 minutes
- Review drifted features: Investigate why specific features drift
- Collect ground truth: Track actual purchase outcomes for retraining labels
- Version baselines: Store drift_baseline.json with model artifacts
- Test retraining: Validate new models before production deployment
Related
- Prediction API - Prediction endpoints and artifact loading
- Streaming Pipelines - Real-time inference orchestration