The system runs two Isolation Forest models trained on different feature representations of the same sensor data. Both models are trained during calibration, but the batch model is primary for inference.
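The dual-model arrangement can be illustrated with a minimal sketch, assuming scikit-learn's `IsolationForest` and purely synthetic calibration data. The legacy feature count (6), the sample counts, and all variable names are hypothetical and do not come from the project.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic "healthy" calibration data (assumed shapes, not real sensor data).
# Batch representation: 16 statistical features per window.
batch_train = rng.normal(loc=0.0, scale=1.0, size=(500, 16))
# Legacy representation: a hypothetical 6-feature vector.
legacy_train = rng.normal(loc=0.0, scale=1.0, size=(500, 6))

# Both models are fitted during calibration.
batch_model = IsolationForest(n_estimators=100, random_state=0).fit(batch_train)
legacy_model = IsolationForest(n_estimators=100, random_state=0).fit(legacy_train)

# Score one typical window and one far outside the training distribution.
normal_window = np.zeros((1, 16))
faulty_window = np.full((1, 16), 6.0)

# decision_function: lower values mean "more anomalous".
normal_score = batch_model.decision_function(normal_window)[0]
faulty_score = batch_model.decision_function(faulty_window)[0]
print(faulty_score < normal_score)
```

Each model only accepts its own feature representation, so the two cannot be swapped at inference time without also switching the feature extraction step.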
    # From batch_features.py:30-48
    # Signals we extract batch features from
    SIGNAL_COLUMNS = ["voltage_v", "current_a", "power_factor", "vibration_g"]

    # Statistics extracted per signal
    STAT_NAMES = ["mean", "std", "peak_to_peak", "rms"]


    def get_batch_feature_names() -> List[str]:
        """Return the ordered list of all batch feature column names."""
        names = []
        for signal in SIGNAL_COLUMNS:
            for stat in STAT_NAMES:
                names.append(f"{signal}_{stat}")
        return names


    BATCH_FEATURE_NAMES: List[str] = get_batch_feature_names()
    BATCH_FEATURE_COUNT: int = len(BATCH_FEATURE_NAMES)  # 16

    # Why this matters:
    # A "Jitter Fault" can have a NORMAL mean vibration (0.15g) but an
    # ABNORMAL variance (0.08 instead of 0.02). The old 1Hz average model
    # would completely miss this. The batch feature model catches it.
Statistical Feature Extraction (100:1 Reduction)
    # From batch_features.py:72-94
    for signal in SIGNAL_COLUMNS:
        # Extract signal values as NumPy array (vectorized)
        values = np.array(
            [p.get(signal, 0.0) for p in raw_points],
            dtype=np.float64,
        )

        # Mean
        mean_val = float(np.mean(values))
        features[f"{signal}_mean"] = mean_val

        # Standard Deviation (ddof=0 for population std, consistent with training)
        std_val = float(np.std(values, ddof=0))
        features[f"{signal}_std"] = std_val

        # Peak-to-Peak (Max - Min)
        p2p_val = float(np.max(values) - np.min(values))
        features[f"{signal}_peak_to_peak"] = p2p_val

        # RMS (Root Mean Square)
        rms_val = float(np.sqrt(np.mean(values ** 2)))
        features[f"{signal}_rms"] = rms_val
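For readers who want to run the extraction step in isolation, the loop can be wrapped in a small function. This is a sketch under stated assumptions, not the project's actual code: the function name `extract_batch_features`, the empty-window check, and the synthetic window are illustrative only, while the signal and statistic names follow the excerpts.

```python
from typing import Dict, List

import numpy as np

SIGNAL_COLUMNS = ["voltage_v", "current_a", "power_factor", "vibration_g"]


def extract_batch_features(raw_points: List[Dict[str, float]]) -> Dict[str, float]:
    """Collapse a window of raw points into per-signal statistics.

    Hypothetical helper: the name and signature are assumptions.
    """
    if not raw_points:
        raise ValueError("raw_points must contain at least one sample")

    features: Dict[str, float] = {}
    for signal in SIGNAL_COLUMNS:
        # Missing readings default to 0.0, as in the excerpt.
        values = np.array(
            [p.get(signal, 0.0) for p in raw_points], dtype=np.float64
        )
        features[f"{signal}_mean"] = float(np.mean(values))
        features[f"{signal}_std"] = float(np.std(values, ddof=0))
        features[f"{signal}_peak_to_peak"] = float(np.max(values) - np.min(values))
        features[f"{signal}_rms"] = float(np.sqrt(np.mean(values ** 2)))
    return features


# Synthetic 100-point window: vibration alternates between 0.1 g and 0.2 g.
window = [
    {
        "voltage_v": 230.0,
        "current_a": 10.0,
        "power_factor": 0.9,
        "vibration_g": 0.1 if i % 2 == 0 else 0.2,
    }
    for i in range(100)
]
features = extract_batch_features(window)
print(len(features))  # 16 features (4 signals x 4 statistics)
```

In this synthetic window the constant signals produce a standard deviation of zero, while the alternating vibration signal yields a mean of 0.15 g with a standard deviation of 0.05 g.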
Performance: ~0.05ms per 100-point window
    # From batch_features.py:62-64
    # Performance:
    #   ~0.05ms for 100 points on a single core. Pure NumPy, no Python loops
    #   over data points.
At roughly 0.05 ms per 100-point window, the vectorized NumPy statistics are fast enough for real-time processing.
Why it matters: in a “Jitter Fault”, the average vibration stays at a normal 0.15 g while the standard deviation rises to σ = 0.17 g, 8.5× the healthy level. A 1 Hz model cannot see this, but the batch model catches it because std and peak_to_peak are explicit features.
    vibration_g_mean = 0.15          # Still normal (legacy model sees this)
    vibration_g_std = 0.17           # 8.5× higher! (batch model catches this)
    vibration_g_peak_to_peak = 0.55  # Large transients
The legacy model only sees vibration_intensity_rms (a 1-hour rolling window), which smooths out the high-frequency jitter.
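The effect is easy to reproduce with synthetic numbers. The sketch below builds two 100-point windows with the same mean but different spread; the alternating pattern and the healthy baseline of 0.02 g are assumptions chosen for illustration, and real vibration data will look different.

```python
import numpy as np

# Zero-mean, unit-std pattern: alternating +1 / -1 over a 100-point window.
pattern = np.where(np.arange(100) % 2 == 0, 1.0, -1.0)

healthy = 0.15 + 0.02 * pattern  # std 0.02 g (assumed healthy baseline)
jitter = 0.15 + 0.17 * pattern   # std 0.17 g (jitter fault)

for name, window in (("healthy", healthy), ("jitter", jitter)):
    print(
        name,
        f"mean={np.mean(window):.2f}",
        f"std={np.std(window, ddof=0):.2f}",
        f"p2p={np.max(window) - np.min(window):.2f}",
    )

# An averaged view keeps only the mean, which is identical for both windows.
mean_gap = abs(np.mean(healthy) - np.mean(jitter))
std_ratio = np.std(jitter, ddof=0) / np.std(healthy, ddof=0)
print(f"mean gap={mean_gap:.3f}, std ratio={std_ratio:.1f}")
```

Any model fed only the averaged value receives the same input for both windows, whereas the std and peak_to_peak features differ by a factor of 8.5.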
    # From system_routes.py (simplified)
    batch_score = batch_detector.score_batch(batch_features)

    # Legacy model is fallback (or for comparison)
    legacy_score = detector.score_single(legacy_features)
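One way to express the primary/fallback relationship is sketched below. The excerpt does not say when the fallback triggers, so this assumes the legacy score is used only when the batch scorer raises or returns `None`. The helper `score_with_fallback` and the stand-in lambdas are hypothetical, not part of `system_routes.py`.

```python
from typing import Callable, Optional, Tuple


def score_with_fallback(
    score_batch: Callable[[], Optional[float]],
    score_legacy: Callable[[], float],
) -> Tuple[float, str]:
    """Return (score, source), preferring the batch model.

    Assumption: the legacy model is consulted only when the batch
    scorer raises or returns None.
    """
    try:
        batch_score = score_batch()
    except Exception:
        batch_score = None

    if batch_score is not None:
        return batch_score, "batch"
    return score_legacy(), "legacy"


# Stand-in scorers with made-up values, for illustration only.
primary = score_with_fallback(lambda: -0.12, lambda: 0.03)
fallback = score_with_fallback(lambda: None, lambda: 0.03)
print(primary)   # (-0.12, 'batch')
print(fallback)  # (0.03, 'legacy')
```

Returning the source alongside the score lets downstream code record which model produced a given health assessment.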
The batch model’s higher F1 score (99.6% vs. 78.1% for the legacy model) makes it the primary decision-maker for health assessment.