System Architecture
Understand how the Predictive Maintenance System processes sensor data through a dual-model ML pipeline to predict equipment failures.
Deployment Stack
The system is built on a modern, cloud-native stack optimized for real-time data processing:
| Component | Technology | Hosting | URL / Region |
|---|---|---|---|
| Frontend | React 18 + Vite | Vercel | predictive-maintenance-ten.vercel.app |
| Backend | FastAPI + Docker | Render | predictive-maintenance-uhlb.onrender.com |
| Database | InfluxDB 2.x | InfluxDB Cloud | AWS us-east-1 |
High-Level Architecture
Frontend (React + Vercel)
Technology Stack
- React 18: Component-based UI with hooks for state management
- Recharts: Real-time data visualization with 60s sliding windows
- Vite: Lightning-fast build tool and dev server
- Vercel: Global CDN deployment with automatic HTTPS
Key Features
- Real-time Charts: Multi-signal streaming with Voltage (V), Current (A), Vibration (g)
- Fixed Y-Axis Domains: 60s right-anchored sliding window for temporal stability
- Anomaly Visualization: Red shaded regions when risk ≠ LOW
- Health Score Ring: Color-coded 0-100 gauge (Green → Yellow → Orange → Red)
- Glassmorphism UI: Dark theme with translucent cards and backdrop blur
- Keep-Alive Heartbeat: Pings `/ping` every 10 minutes to prevent Render free-tier cold starts
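The right-anchored 60s window from the feature list can be illustrated with a short sketch. It is written in Python for consistency with the backend examples rather than in the dashboard's React code, and the names `WINDOW_SECONDS` and `trim_window` are hypothetical.

```python
WINDOW_SECONDS = 60.0  # window length from the feature list above


def trim_window(points, window=WINDOW_SECONDS):
    """Keep only points inside the window that ends at the newest timestamp.

    `points` is a list of (timestamp_seconds, value) tuples sorted by time.
    Returns the retained points and the (left, right) x-axis domain.
    """
    if not points:
        return [], (0.0, window)
    right = points[-1][0]  # anchor the window at the latest sample
    left = right - window
    kept = [(t, v) for t, v in points if t >= left]
    return kept, (left, right)


# 91 samples, one per second from 0s to 90s.
samples = [(float(t), 230.0) for t in range(0, 91)]
kept, domain = trim_window(samples)
```

Because the right edge always follows the newest sample, older points scroll off the left edge instead of the chart rescaling as data arrives.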
Component Architecture
Backend (FastAPI + Render)
Technology Stack
- Python 3.11+: Core runtime with type hints
- FastAPI: Async REST API with OpenAPI docs
- Pydantic: Schema validation and settings
- scikit-learn: Isolation Forest ML models
- ReportLab: PDF report generation
- Docker: Containerized deployment
Data Processing Pipeline
The backend processes sensor data through six stages:
Ingestion & Validation
Endpoint: `POST /ingest`
All sensor data is validated against strict Pydantic schemas before processing.
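The kind of checks such a schema might enforce can be sketched without any dependencies. The example below uses a standard-library dataclass as a stand-in for a Pydantic model; the class name, field names, and limits are all assumptions chosen for illustration, not the backend's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorReading:
    """Hypothetical payload for POST /ingest; fields and limits are assumed."""
    asset_id: str
    voltage: float       # volts
    current: float       # amperes
    power_factor: float  # dimensionless, 0..1
    vibration: float     # g

    def __post_init__(self):
        if not self.asset_id:
            raise ValueError("asset_id must be non-empty")
        if not 0.0 <= self.power_factor <= 1.0:
            raise ValueError("power_factor must be within [0, 1]")
        for name in ("voltage", "current", "vibration"):
            if getattr(self, name) < 0.0:
                raise ValueError(f"{name} must be non-negative")


def parse_reading(payload: dict) -> SensorReading:
    """Reject missing or unexpected keys, then apply the range checks."""
    try:
        return SensorReading(**payload)
    except TypeError as exc:  # wrong keys or non-numeric values
        raise ValueError(str(exc)) from exc


reading = parse_reading({
    "asset_id": "motor-01",
    "voltage": 230.0,
    "current": 12.0,
    "power_factor": 0.92,
    "vibration": 0.15,
})
```

Rejecting bad payloads at the boundary keeps malformed readings out of the feature and inference stages that follow.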
Feature Engineering
Module: `backend/features/calculator.py`
The system computes two feature sets:
Legacy Features (1Hz, 6 dimensions):
- voltage_rolling_mean_1h: Mean voltage over 1 hour
- current_spike_count: Points > 3σ from local mean
- power_factor_efficiency_score: (PF - 0.8) / 0.2 × 100
- vibration_intensity_rms: √(mean(vibration²))
- voltage_stability: |V - 230.0|
- power_vibration_ratio: vibration / (PF + 0.01)
Batch Features (100-point windows, 16 dimensions):
- For each signal (voltage, current, power_factor, vibration): mean, std, peak_to_peak, rms
- 4 signals × 4 stats = 16 features
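The six legacy formulas can be sketched in plain Python. This is an illustration under one stated assumption: where a formula takes a scalar (V, PF, vibration), the sketch uses the window mean, since the text does not say whether the mean or the latest value is used. The function name `legacy_features` is hypothetical.

```python
import math
from statistics import fmean, pstdev


def legacy_features(voltage, current, power_factor, vibration):
    """Compute the six legacy features from equal-length lists of samples.

    Assumption: scalar inputs to the formulas are taken as window means.
    """
    v_mean = fmean(voltage)
    pf_mean = fmean(power_factor)
    vib_mean = fmean(vibration)

    # Count current samples more than 3 standard deviations from the mean.
    i_mean, i_std = fmean(current), pstdev(current)
    spikes = sum(1 for i in current if abs(i - i_mean) > 3 * i_std)

    return {
        "voltage_rolling_mean_1h": v_mean,  # caller supplies the 1-hour window
        "current_spike_count": spikes,
        "power_factor_efficiency_score": (pf_mean - 0.8) / 0.2 * 100,
        "vibration_intensity_rms": math.sqrt(fmean([x * x for x in vibration])),
        "voltage_stability": abs(v_mean - 230.0),
        "power_vibration_ratio": vib_mean / (pf_mean + 0.01),
    }


# 100 samples with a single current surge from 12A to 45A.
features = legacy_features(
    voltage=[230.0] * 100,
    current=[12.0] * 99 + [45.0],
    power_factor=[0.9] * 100,
    vibration=[0.1] * 100,
)
```

In this example the single 45A sample lies well outside three standard deviations of the window, so it is counted as one spike, while a power factor of 0.9 maps to an efficiency score of about 50.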
The batch model achieves 99.6% F1-score by explicitly capturing variance—critical for detecting “Jitter” faults where averages look normal but standard deviation spikes.
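A minimal sketch of the 16-D batch extraction shows why variance matters for Jitter. It uses only the standard library so it runs anywhere; the function names are hypothetical, and the population standard deviation is an assumption about how `std` is computed.

```python
import math
from statistics import fmean, pstdev

SIGNALS = ("voltage", "current", "power_factor", "vibration")


def window_stats(samples):
    """Return (mean, std, peak_to_peak, rms) for one signal window."""
    return (
        fmean(samples),
        pstdev(samples),  # population std (assumed convention)
        max(samples) - min(samples),
        math.sqrt(fmean([x * x for x in samples])),
    )


def batch_features(window):
    """Flatten 4 signals x 4 stats into one 16-element feature vector."""
    vector = []
    for name in SIGNALS:
        vector.extend(window_stats(window[name]))
    return vector


# Two 100-point vibration windows with the same mean but different spread.
steady = [0.15] * 100
jittery = [0.15 + (0.17 if i % 2 else -0.17) for i in range(100)]

window = {
    "voltage": [230.0] * 100,
    "current": [12.0] * 100,
    "power_factor": [0.92] * 100,
    "vibration": jittery,
}
vector = batch_features(window)
```

Both vibration windows have a mean of 0.15g, so a mean-only feature cannot tell them apart; the `std` entry differs by 0.17g and exposes the fault.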
ML Inference (Dual Models)
Modules:
- `backend/ml/detector.py` (Legacy)
- `backend/ml/batch_detector.py` (Batch)
Output: Anomaly score (0.0 = healthy, 1.0 = critical)
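As a rough sketch of how an Isolation Forest can yield a bounded anomaly score, the example below fits scikit-learn's `IsolationForest` on synthetic stand-in data and negates `score_samples`, which gives a value in (0, 1] where higher means more anomalous. The training data, the hyperparameters, and the `anomaly_score` helper are assumptions; any further rescaling the detectors apply so that healthy data sits near 0.0 is not shown.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Stand-in for healthy 16-D batch feature vectors (synthetic, illustration only).
healthy = rng.normal(loc=0.0, scale=1.0, size=(500, 16))

scaler = StandardScaler().fit(healthy)
model = IsolationForest(n_estimators=100, random_state=0)
model.fit(scaler.transform(healthy))


def anomaly_score(features):
    """Return a score in (0, 1]; higher means more anomalous.

    score_samples() returns the negated anomaly score from the original
    Isolation Forest paper, so negating it again restores that score.
    """
    x = scaler.transform(np.asarray(features, dtype=float).reshape(1, -1))
    return float(-model.score_samples(x)[0])


typical = anomaly_score(np.zeros(16))       # near the center of the training data
outlier = anomaly_score(np.full(16, 8.0))   # far from anything seen in training
```

Training on healthy data only means the model needs no labeled failures: points that are easy to isolate from the healthy cloud receive higher scores.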
Health Assessment & Degradation Tracking
Module: `backend/rules/assessor.py`
The system maintains a Cumulative Degradation Index (DI).
Risk Classification:
| Health Score | Risk Level | Color | Typical RUL |
|---|---|---|---|
| 75-100 | LOW | Green | 30-60 days |
| 50-74 | MODERATE | Yellow | 10-29 days |
| 25-49 | HIGH | Orange | 1-9 days |
| 0-24 | CRITICAL | Red | < 1 day |
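The table translates directly into a lookup. The sketch below is illustrative only: `RISK_BANDS` and `classify` are hypothetical names, and treating fractional scores with `>=` thresholds is an assumption, since the table lists integer bands.

```python
RISK_BANDS = [
    # (minimum health score, risk level, color, typical RUL)
    (75, "LOW", "Green", "30-60 days"),
    (50, "MODERATE", "Yellow", "10-29 days"),
    (25, "HIGH", "Orange", "1-9 days"),
    (0, "CRITICAL", "Red", "< 1 day"),
]


def classify(health_score):
    """Map a 0-100 health score to (risk level, color, typical RUL)."""
    if not 0 <= health_score <= 100:
        raise ValueError("health score must be within 0-100")
    # Bands are ordered from highest to lowest, so the first match wins.
    for minimum, level, color, rul in RISK_BANDS:
        if health_score >= minimum:
            return level, color, rul


level, color, rul = classify(62)
```

Here a health score of 62 falls in the 50-74 band, so it is reported as MODERATE with a Yellow indicator.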
Explainability Engine
Module: `backend/rules/explainer.py`
Generates human-readable explanations. Every alert includes a natural-language explanation so operators understand why the system flagged an issue.
Persistence & Reporting
Storage: InfluxDB Cloud
- `sensor_data`: Raw 100Hz measurements
- `features`: Computed 1Hz and batch features
- `health_reports`: DI, health scores, risk levels
Reports:
- Executive PDF (1-page): Health grade, DI%, RUL for plant managers
- Multi-sheet Excel: Summary, operator logs, raw sensor data for analysts
- Industrial Certificate (5-page): Feature contributions, ROI analysis, audit trail for engineers
Database (InfluxDB Cloud)
Why InfluxDB?
- Time-Series Optimized: Purpose-built for sensor data with millisecond precision
- Flux Query Language: Powerful aggregation and windowing functions
- Data Retention Policies: Automatic downsampling and archival
- Cloud-Native: Managed service with automatic backups
Data Model
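As a rough illustration of how a point in one of the measurements named under Persistence & Reporting (`sensor_data`, `features`, `health_reports`) might be laid out, the sketch below formats a reading as InfluxDB line protocol. The `asset_id` tag and the field names are assumptions, not the documented schema, and the helper handles float fields only.

```python
def _escape(value):
    """Escape commas, equals signs, and spaces in tag or field keys and tag values."""
    return str(value).replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as: measurement,tag=value field=value timestamp."""
    tag_part = "".join(
        f",{_escape(k)}={_escape(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(f"{_escape(k)}={float(v)}" for k, v in fields.items())
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


line = to_line_protocol(
    "sensor_data",
    {"asset_id": "motor-01"},               # hypothetical tag
    {"voltage": 230.5, "current": 12.0},    # hypothetical field names
    1700000000000000000,                    # nanosecond timestamp
)
print(line)
# sensor_data,asset_id=motor-01 voltage=230.5,current=12.0 1700000000000000000
```

Keeping identifiers such as the asset in tags and numeric readings in fields follows the usual InfluxDB convention, since tags are indexed and fields are not.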
ML Pipeline Deep Dive
Dual-Model Architecture
The system runs two Isolation Forest models in parallel:
- Legacy Model (v2)
- Batch Model (v3)
Legacy Model (v2)
Input: 1Hz aggregated features (6 dimensions)
Features: `voltage_rolling_mean_1h`, `current_spike_count`, `power_factor_efficiency_score`, `vibration_intensity_rms`, `voltage_stability`, `power_vibration_ratio`
Metrics:
- Precision: 64.1%
- Recall: 100.0%
- F1-Score: 78.1%
- Limitation: Cannot detect variance-only faults (Jitter)
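The reported F1-score follows from the precision and recall figures, as this short check shows. The helper name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


legacy_f1 = f1_score(0.641, 1.0)
print(f"{legacy_f1:.1%}")  # 78.1%
```

With recall at 100%, every fault in the benchmark is caught, but a precision of 64.1% means roughly 36% of alerts are false positives, which is what pulls the F1-score down.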
Training Workflow
Healthy Data Generation
The system generates synthetic sensor data matching real-world patterns:
- Voltage: 230V ± 5% (Indian grid)
- Current: 10-15A with power factor coupling
- Vibration: 0.05-0.20g with white noise
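A generator for healthy samples within these ranges might look like the sketch below. The distributions are assumptions: the text gives only ranges, and the linear link between current and power factor is a placeholder for the "power factor coupling" it mentions. Function names are hypothetical.

```python
import random


def healthy_sample(rng):
    """Draw one healthy reading within the documented ranges."""
    voltage = rng.uniform(230.0 * 0.95, 230.0 * 1.05)  # 230V +/- 5%
    current = rng.uniform(10.0, 15.0)                  # 10-15A
    # Assumed coupling: power factor rises linearly with load, 0.85 to 0.95.
    power_factor = 0.85 + 0.02 * (current - 10.0)
    # Gaussian (white) noise around mid-range, clipped to 0.05-0.20g.
    vibration = min(0.20, max(0.05, rng.gauss(0.125, 0.025)))
    return {
        "voltage": voltage,
        "current": current,
        "power_factor": power_factor,
        "vibration": vibration,
    }


def healthy_stream(n, seed=0):
    """Generate n reproducible healthy readings."""
    rng = random.Random(seed)
    return [healthy_sample(rng) for _ in range(n)]


training_data = healthy_stream(1000)
```

Seeding the generator makes training runs reproducible, which helps when comparing model versions on identical data.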
Feature Extraction
- 1Hz features: Computed from rolling windows
- Batch features: 100-point windows reduced to 16-D vectors
Fault Detection Types
The system detects four fault types:
SPIKE - Voltage/Current Surges
Pattern: Sharp transients in electrical signals
Example:
- Voltage: 230V → 280V (about 22% spike)
- Current: 12A → 45A (275% surge, 3.75x baseline)
Detection: peak_to_peak and spike count features
Real-World Causes: Grid instability, inrush current, capacitor switching
DRIFT - Gradual Degradation
Pattern: Slow trend away from baseline
Example:
- Power factor: 0.92 → 0.78 over 10 minutes
- Vibration: 0.15g → 0.35g gradual increase
JITTER - High Variance with Normal Mean
Pattern: Stable average, high standard deviation
Example:
- Vibration mean: 0.15g (normal)
- Vibration σ: 0.17g (more than 5x the healthy baseline of 0.03g)
Detection: std features (batch model)
Real-World Causes: Loose mounting bolts, rotor imbalance, electrical noise
MIXED - Combination Faults
Pattern: Multiple simultaneous anomalies
Example:
- Voltage drift + current spikes + vibration jitter
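For testing, the fault patterns above can be injected into a healthy signal. The helpers below are a hypothetical sketch, not the project's fault generator; a MIXED fault is simply several of them applied to different signals.

```python
import random
from statistics import fmean, pstdev


def inject_spike(signal, index, magnitude):
    """SPIKE: add a sharp transient to a single sample."""
    out = list(signal)
    out[index] += magnitude
    return out


def inject_drift(signal, total_change):
    """DRIFT: add a linear ramp reaching total_change at the last sample.

    Requires at least two samples.
    """
    n = len(signal)
    return [x + total_change * i / (n - 1) for i, x in enumerate(signal)]


def inject_jitter(signal, sigma, seed=0):
    """JITTER: add zero-mean Gaussian noise so only the spread changes."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, sigma) for _ in signal]
    offset = fmean(noise)  # re-center so the window mean is unchanged
    return [x + e - offset for x, e in zip(signal, noise)]


spiked = inject_spike([230.0] * 100, index=50, magnitude=50.0)  # 230V -> 280V
drifted = inject_drift([0.92] * 100, total_change=-0.14)        # 0.92 -> 0.78
jittered = inject_jitter([0.15] * 100, sigma=0.17)              # mean stays 0.15g
```

The jittered window keeps the 0.15g mean while its standard deviation rises to roughly the injected sigma, which is the pattern the `std` features are meant to catch.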
Data Flow
Performance Specifications
| Operation | Latency / Rate | Notes |
|---|---|---|
| Batch Feature Extraction | 0.1ms | 100-point window → 16-D vector (NumPy) |
| ML Inference (Batch) | 1ms | IsolationForest on 16-D scaled input |
| ML Inference (Legacy) | 50ms | 6-feature Isolation Forest |
| Data Ingestion | 100 Hz | 100 raw points/second to InfluxDB |
| Server-Side Aggregation | 5ms | aggregateWindow(1s, mean) Flux query |
| PDF Generation | ~1.2s | 5-page Industrial Certificate |
| Dashboard Update | 3s poll | aggregated data delivery |
| API Response (p99) | 100ms | All endpoints |
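The server-side aggregation row reduces 100 raw points per second to one. A pure-Python equivalent of a 1s mean window is sketched below to show the effect; it is not the Flux query itself, the function name is hypothetical, and it labels each window by its start time for simplicity, whereas Flux's `aggregateWindow` stamps results with the window stop time by default.

```python
from collections import defaultdict
from statistics import fmean


def aggregate_window_mean(points, every=1.0):
    """Average (timestamp_seconds, value) points into fixed-size windows.

    Returns a sorted list of (window_start, mean_value) pairs.
    """
    buckets = defaultdict(list)
    for t, v in points:
        buckets[int(t // every)].append(v)
    return [(k * every, fmean(vs)) for k, vs in sorted(buckets.items())]


# Two seconds of 100 Hz data: the first second at 230V, the second at 232V.
raw = [(i / 100, 230.0 if i < 100 else 232.0) for i in range(200)]
downsampled = aggregate_window_mean(raw)
print(downsampled)  # [(0.0, 230.0), (1.0, 232.0)]
```

Aggregating on the server means the dashboard's 3s poll transfers a few points per signal instead of hundreds of raw samples.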
Resilience Features
- DI Hydration: The Degradation Index is recovered from InfluxDB on restart, so state survives process crashes
- Keep-Alive Heartbeat: Frontend pings `/ping` every 10 minutes to prevent Render cold starts
- Docker Restart Policy: `restart: unless-stopped` ensures automatic recovery
- Health Checks: All containers have health probes for orchestrator monitoring
Next Steps
- API Reference: Explore REST endpoints for sensor ingestion and reporting
- Testing Guide: Run the 182-test suite and benchmark models
- Deployment Guide: Deploy to production on Render + Vercel + InfluxDB Cloud
- Feature Engineering: Deep dive into the 16-D batch feature extraction