The data generator is a CLI tool that simulates realistic sensor data from industrial assets like motors, pumps, and compressors. It supports both healthy and faulty operation modes for comprehensive testing.
Overview
The generator creates synthetic sensor readings and sends them to the /api/v1/data/simple ingest endpoint. This is useful for:
Testing anomaly detection models
Calibrating baseline thresholds
Demonstrating fault scenarios
Load testing the ingestion pipeline
Basic Usage
Start the backend server
Ensure the FastAPI backend is running before generating data: uvicorn backend.api.main:app --reload
The generator will POST to http://localhost:8000/api/v1/data/simple by default.
Run the generator
Execute the script from the project root: python scripts/generate_data.py --asset_id motor_01 --duration 60 --healthy
Verify data ingestion
Check the terminal output for successful events: [OK] Sent event 3f5a8b2c... | V=230.5V, I=15.2A, PF=0.92, Vib=0.150g
Command-Line Arguments
Asset Identification
Unique identifier for the asset being simulated.
Default: Motor-01
Examples: motor_01, pump_alpha, compressor_3
python scripts/generate_data.py --asset_id pump_alpha
Duration
How long to run the generator (in seconds).
Default: 60
Range: Any positive integer
python scripts/generate_data.py --duration 300 # Run for 5 minutes
Interval
Time between consecutive sensor readings (in seconds).
Default: 1.0 (1 second)
Range: Any positive float
python scripts/generate_data.py --interval 0.5 # 2 readings per second
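The way --duration and --interval interact can be pictured as a simple timed loop. The sketch below is illustrative only and is not taken from the generator's source: `run_for`, `emit`, `clock`, and `sleep` are hypothetical names, and the real script may schedule readings differently.

```python
import time
from typing import Callable


def run_for(duration: float, interval: float, emit: Callable[[], None],
            clock: Callable[[], float] = time.monotonic,
            sleep: Callable[[float], None] = time.sleep) -> int:
    """Call emit() once per interval until duration seconds have elapsed.

    Returns the number of readings emitted. Hypothetical helper.
    """
    if duration <= 0 or interval <= 0:
        raise ValueError("duration and interval must be positive")
    start = clock()
    count = 0
    while clock() - start < duration:
        emit()           # produce and send one reading
        count += 1
        sleep(interval)  # wait before the next reading
    return count
```

Under this model, a 60-second run at a 1.0-second interval emits roughly 60 readings. In practice the count also depends on how long each request takes.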
Operating Mode
Healthy Mode (Normal Operation)
Generates normal operating sensor values:
Voltage: 230V ± 3V (Indian Grid standard)
Current: 15A ± 1A
Power Factor: 0.88 - 0.95 (good efficiency)
Vibration: 0.15g ± 0.02g (low vibration)
python scripts/generate_data.py --healthy --duration 120
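As a rough illustration of the healthy ranges listed above, the sketch below draws one reading from each range. It assumes uniform sampling, which may not match the generator (it could use Gaussian noise, for example), and `sample_healthy_reading` is a hypothetical name.

```python
import random
from typing import Optional


def sample_healthy_reading(rng: Optional[random.Random] = None) -> dict:
    """Draw one reading from the documented healthy ranges (assumed uniform)."""
    if rng is None:
        rng = random.Random()
    return {
        "voltage_v": round(rng.uniform(227.0, 233.0), 1),    # 230V ± 3V
        "current_a": round(rng.uniform(14.0, 16.0), 1),      # 15A ± 1A
        "power_factor": round(rng.uniform(0.88, 0.95), 2),   # 0.88 - 0.95
        "vibration_g": round(rng.uniform(0.13, 0.17), 4),    # 0.15g ± 0.02g
    }
```

Passing a seeded `random.Random` makes the output reproducible, which helps when comparing calibration runs.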
Faulty Mode (Anomalous Operation)
Injects extreme anomalies to force CRITICAL risk states. Fault types include:
Voltage Spike: 320V ± 20V (+90V from normal)
Vibration Drift: 3.0g ± 0.5g (20x normal levels)
Power Factor Drop: 0.35 - 0.50 (severe degradation)
Catastrophic Failure: All signals fail simultaneously
python scripts/generate_data.py --faulty --duration 30
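The fault types above can be sketched as overrides applied on top of a healthy reading. This is an assumption about structure, not the generator's actual logic: the fault names, `FAULT_TYPES`, and `sample_faulty_reading` are hypothetical, sampling is assumed uniform, and current is left in its healthy range because no faulty current range is documented.

```python
import random
from typing import Optional

# Hypothetical identifiers for the four documented fault types.
FAULT_TYPES = ("voltage_spike", "vibration_drift", "power_factor_drop", "catastrophic")


def sample_faulty_reading(fault_type: str, rng: Optional[random.Random] = None) -> dict:
    """Return one reading with the named fault injected (illustrative only)."""
    if fault_type not in FAULT_TYPES:
        raise ValueError(f"unknown fault type: {fault_type}")
    if rng is None:
        rng = random.Random()
    # Start from the documented healthy ranges.
    reading = {
        "voltage_v": rng.uniform(227.0, 233.0),
        "current_a": rng.uniform(14.0, 16.0),
        "power_factor": rng.uniform(0.88, 0.95),
        "vibration_g": rng.uniform(0.13, 0.17),
    }
    everything = fault_type == "catastrophic"
    if everything or fault_type == "voltage_spike":
        reading["voltage_v"] = rng.uniform(300.0, 340.0)    # 320V ± 20V
    if everything or fault_type == "vibration_drift":
        reading["vibration_g"] = rng.uniform(2.5, 3.5)      # 3.0g ± 0.5g
    if everything or fault_type == "power_factor_drop":
        reading["power_factor"] = rng.uniform(0.35, 0.50)   # 0.35 - 0.50
    return reading
```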
The --healthy flag overrides --faulty if both are specified. This ensures explicit control over the operation mode.
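One way to express the arguments and the flag precedence with `argparse` is sketched below. The defaults are the documented ones; `build_parser` and `resolve_is_faulty` are hypothetical names, and treating "neither flag given" as healthy is an assumption, since the behavior in that case is not documented here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented flags and defaults (sketch)."""
    parser = argparse.ArgumentParser(description="Synthetic sensor data generator (sketch)")
    parser.add_argument("--asset_id", default="Motor-01")
    parser.add_argument("--duration", type=int, default=60)
    parser.add_argument("--interval", type=float, default=1.0)
    parser.add_argument("--healthy", action="store_true")
    parser.add_argument("--faulty", action="store_true")
    return parser


def resolve_is_faulty(args: argparse.Namespace) -> bool:
    """--healthy wins whenever it is present; otherwise --faulty decides."""
    return args.faulty and not args.healthy
```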
Payload Structure
Each event is sent as a JSON payload to the /api/v1/data/simple endpoint:
{
  "asset_id": "motor_01",
  "voltage_v": 230.5,
  "current_a": 15.2,
  "power_factor": 0.92,
  "vibration_g": 0.1523,
  "is_faulty": false
}
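For readers who want to send an event without the CLI, the following minimal sketch builds the payload shown above and POSTs it using only the standard library. `build_payload` and `post_event` are hypothetical helpers, not functions from the generator script, and the 5-second timeout is an arbitrary choice.

```python
import json
import urllib.request

DEFAULT_ENDPOINT = "http://localhost:8000/api/v1/data/simple"


def build_payload(asset_id: str, reading: dict, is_faulty: bool) -> dict:
    """Assemble the JSON body in the documented shape."""
    return {
        "asset_id": asset_id,
        "voltage_v": reading["voltage_v"],
        "current_a": reading["current_a"],
        "power_factor": reading["power_factor"],
        "vibration_g": reading["vibration_g"],
        "is_faulty": is_faulty,
    }


def post_event(payload: dict, endpoint: str = DEFAULT_ENDPOINT, timeout: float = 5.0) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A call such as `post_event(build_payload("motor_01", reading, False))` requires the backend to be running.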
Console Output
Successful events display real-time metrics:
============================================================
PREDICTIVE MAINTENANCE - DATA GENERATOR
============================================================
Asset ID: motor_01
Duration: 60 seconds
Interval: 1.0 seconds
Mode: HEALTHY (normal operation)
Endpoint: http://localhost:8000/api/v1/data/simple
============================================================
[OK] Sent event 3f5a8b2c... | V=230.5V, I=15.2A, PF=0.92, Vib=0.150g
[OK] Sent event 7a9c4d1e... | V=229.8V, I=14.9A, PF=0.91, Vib=0.148g
[OK] Sent event 2b8f3a5d... | V=231.2V, I=15.4A, PF=0.93, Vib=0.152g
============================================================
[COMPLETE] Sent 60 events, 0 failed
============================================================
Faulty events are marked with [FAULT]:
[OK] Sent event 4e7c9a2b... | V=315.2V, I=15.1A, PF=0.91, Vib=2.850g [FAULT]
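The line format above can be reproduced with a small formatter, which is handy when parsing or asserting on generator logs. This is a sketch inferred from the sample output only: `format_event_line` is a hypothetical name, and truncating the event ID to eight characters is an assumption based on the samples.

```python
def format_event_line(event_id: str, reading: dict, is_faulty: bool = False) -> str:
    """Render one console line in the style of the sample output."""
    line = (
        f"[OK] Sent event {event_id[:8]}... | "
        f"V={reading['voltage_v']:.1f}V, I={reading['current_a']:.1f}A, "
        f"PF={reading['power_factor']:.2f}, Vib={reading['vibration_g']:.3f}g"
    )
    return f"{line} [FAULT]" if is_faulty else line
```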
Common Recipes
Initial System Calibration
Generate 5 minutes of healthy data for baseline training:
python scripts/generate_data.py \
--asset_id motor_01 \
--duration 300 \
--interval 1.0 \
--healthy
Fault Scenario Testing
Simulate a brief fault condition:
python scripts/generate_data.py \
--asset_id motor_01 \
--duration 10 \
--interval 0.5 \
--faulty
High-Frequency Monitoring
Generate data at 10Hz (100ms intervals) for stress testing:
python scripts/generate_data.py \
--asset_id motor_01 \
--duration 60 \
--interval 0.1 \
--healthy
Multi-Asset Simulation
Run multiple generators in parallel (separate terminals):
# Terminal 1 - Motor
python scripts/generate_data.py --asset_id motor_01 --healthy
# Terminal 2 - Pump
python scripts/generate_data.py --asset_id pump_alpha --healthy
# Terminal 3 - Compressor
python scripts/generate_data.py --asset_id compressor_3 --faulty
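If opening several terminals is inconvenient, the same three runs can be launched from one Python script. The sketch below is one possible approach using `subprocess`; `build_commands` and `run_all` are hypothetical helpers, and it assumes the script is started from the project root.

```python
import subprocess
import sys

# (asset_id, mode flag) pairs matching the three terminals above.
ASSETS = [
    ("motor_01", "--healthy"),
    ("pump_alpha", "--healthy"),
    ("compressor_3", "--faulty"),
]


def build_commands(assets, script: str = "scripts/generate_data.py") -> list:
    """Build one command line per asset, using the current interpreter."""
    return [[sys.executable, script, "--asset_id", asset_id, mode]
            for asset_id, mode in assets]


def run_all(assets) -> list:
    """Start every generator in parallel and wait for all to finish."""
    processes = [subprocess.Popen(cmd) for cmd in build_commands(assets)]
    return [process.wait() for process in processes]
```

Calling `run_all(ASSETS)` returns the exit code of each generator once all of them have completed.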
Environment Configuration
Custom API Endpoint
Override the default API URL using an environment variable:
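Linux/Mac:
export API_URL=https://predictive-maintenance-uhlb.onrender.com
Windows (PowerShell):
$env:API_URL = "https://predictive-maintenance-uhlb.onrender.com"
Windows (CMD):
set API_URL=https://predictive-maintenance-uhlb.onrender.com
Then run the generator in the same shell: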
python scripts/generate_data.py --asset_id motor_01 --healthy
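How the override is resolved can be sketched as follows. This assumes that API_URL holds only the base URL and that the ingest path is appended by the script, which is inferred from the example value and the default endpoint rather than confirmed; `resolve_endpoint` is a hypothetical name.

```python
import os
from typing import Mapping, Optional

DEFAULT_API_URL = "http://localhost:8000"
INGEST_PATH = "/api/v1/data/simple"


def resolve_endpoint(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the full ingest URL, honoring API_URL when it is set."""
    if environ is None:
        environ = os.environ
    base = environ.get("API_URL", DEFAULT_API_URL).rstrip("/")
    return base + INGEST_PATH
```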
Connection Error Handling
If the backend is not running, you’ll see:
[ERROR] Cannot connect to http://localhost:8000/api/v1/data/simple
Make sure the backend is running: uvicorn backend.api.main:app
Use Ctrl+C to gracefully stop the generator. It will display a summary of events sent and failed.
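The pattern behind this behavior (count failures, stop cleanly on Ctrl+C, then print a summary) can be sketched as below. It is not the generator's code: `run_with_summary` and `send_once` are hypothetical, and whether the real script keeps going after a connection error is an assumption.

```python
import time
from typing import Callable, Tuple


def run_with_summary(send_once: Callable[[], None], interval: float, max_events: int,
                     sleep: Callable[[float], None] = time.sleep) -> Tuple[int, int]:
    """Send up to max_events events, tolerating connection errors and Ctrl+C."""
    sent = failed = 0
    try:
        for _ in range(max_events):
            try:
                send_once()
                sent += 1
            except OSError as exc:
                # ConnectionError and urllib.error.URLError are OSError subclasses.
                failed += 1
                print(f"[ERROR] {exc}")
            sleep(interval)
    except KeyboardInterrupt:
        pass  # Ctrl+C: fall through to the summary instead of a traceback
    print(f"[COMPLETE] Sent {sent} events, {failed} failed")
    return sent, failed
```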
Advanced Usage
Programmatic Integration
You can import and use the generator functions in your own scripts:
import os
import sys

# Add the project root to the import path so that `scripts` can be imported.
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from scripts.generate_data import generate_healthy_reading, send_event

# Generate a single healthy reading
data = generate_healthy_reading()
print(data)
# {'voltage_v': 230.5, 'current_a': 15.2, 'power_factor': 0.92, 'vibration_g': 0.15}

# Send to API
send_event(asset_id="motor_01", sensor_data=data, is_faulty=False)
Troubleshooting
HTTP 503 Service Unavailable
Cause: Backend is starting up (Render free-tier cold start).
Solution: Wait 30-60 seconds and retry. The frontend’s /ping heartbeat prevents this in production.
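When scripting against a cold-starting backend, the wait-and-retry advice can be automated. The sketch below retries on 503 with a fixed 30-second pause, so the default three attempts cover the suggested 30-60 second window. `send_with_retry` is hypothetical, and `send_once` stands for any callable that returns an HTTP status code.

```python
import time
from typing import Callable


def send_with_retry(send_once: Callable[[], int], max_attempts: int = 3,
                    delay: float = 30.0,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Retry while the backend answers 503; return the last status code seen."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status = 503
    for attempt in range(1, max_attempts + 1):
        status = send_once()
        if status != 503:
            return status
        if attempt < max_attempts:
            sleep(delay)  # give the service time to finish starting
    return status
```

Note that some HTTP clients, including `urllib`, raise an exception on 503 instead of returning the code, so `send_once` may need to catch that and return the status itself.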
HTTP 422 Unprocessable Entity
Cause: Invalid payload schema.
Solution: Ensure all required fields are present: asset_id, voltage_v, current_a, power_factor, vibration_g.
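A quick client-side check can catch missing fields before the request is sent. The sketch below only verifies presence of the required fields listed above. A 422 can also result from wrong value types, which this does not check, and `missing_fields` is a hypothetical helper.

```python
REQUIRED_FIELDS = ("asset_id", "voltage_v", "current_a", "power_factor", "vibration_g")


def missing_fields(payload: dict) -> list:
    """Return the required field names that are absent or None."""
    return [name for name in REQUIRED_FIELDS if payload.get(name) is None]
```

For example, a payload containing only `asset_id` and `voltage_v` is reported as missing `current_a`, `power_factor`, and `vibration_g`.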
Data Not Appearing in Dashboard
Check the InfluxDB connection: Visit the /health endpoint.
Verify the bucket name: Ensure INFLUX_BUCKET=sensor_data matches your configuration.
Inspect the API logs: Look for ingestion errors in the backend console.
Next Steps
Calibration Workflow: Use generated data to calibrate baseline models
API Reference: Learn about the /ingest endpoint schema