Calibration is the process of training the Isolation Forest models on healthy data only to establish a baseline for “normal” asset behavior. This happens once during system commissioning via the /system/calibrate endpoint.
```python
def _filter_healthy_data(self, data):
    """
    Filter to healthy data only (is_fault_injected == False).
    If the is_fault_injected column doesn't exist, assume all data is healthy.
    """
    if 'is_fault_injected' in data.columns:
        return data[data['is_fault_injected'] == False].copy()
    return data.copy()
```
Criteria:
is_fault_injected == False (fault injection flag OFF)
No operator-logged maintenance events during window
Sensor values within expected ranges (validated by baseline builder)
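The fault-injection filter above can be demonstrated on a toy frame (the `vibration_g` column name is illustrative; only `is_fault_injected` comes from the source):

```python
import pandas as pd

# Toy sensor frame; one row has the fault-injection flag set.
data = pd.DataFrame({
    "vibration_g": [0.02, 0.03, 0.35, 0.02],
    "is_fault_injected": [False, False, True, False],
})

# Same rule as _filter_healthy_data: keep only healthy rows.
healthy = data[data["is_fault_injected"] == False].copy()
print(len(healthy))  # 3 -- the fault-injected row is dropped
```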
Never train on faulty data! If the training window includes faults, the model will learn to accept failure modes as “normal,” rendering anomaly detection useless.
```python
# From baseline.py:26
MIN_COVERAGE_RATIO = 0.80  # 80% of samples must be valid (non-NaN)

# From detector.py:209
if base_features.shape[0] < 10:
    raise ValueError(
        f"Insufficient data for training: {base_features.shape[0]} samples (need >= 10)"
    )

# From batch_detector.py:128
if len(feature_rows) < 10:
    raise ValueError(f"Need >= 10 training windows, got {len(feature_rows)}")
```
Minimum thresholds:
10 samples minimum (hard floor)
80% coverage (max 20% NaN allowed)
60 minutes recommended (ensures 1-hour rolling windows are fully populated)
If more than 20% of data is NaN (missing sensors, cold-start windows, etc.), the statistical profile becomes unreliable. The system fails fast rather than training on bad data.
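A minimal sketch of the coverage rule (the `check_coverage` helper name is illustrative; only the 0.80 constant comes from `baseline.py:26`):

```python
import numpy as np
import pandas as pd

MIN_COVERAGE_RATIO = 0.80  # from baseline.py:26

def check_coverage(values: pd.Series) -> None:
    """Fail fast when more than 20% of samples are NaN."""
    coverage = values.notna().mean()
    if coverage < MIN_COVERAGE_RATIO:
        raise ValueError(
            f"Coverage {coverage:.0%} below required {MIN_COVERAGE_RATIO:.0%}"
        )

# 2 NaNs out of 8 samples -> 75% coverage, below the 80% floor
values = pd.Series([1.0, 2.0, np.nan, 4.0, 5.0, np.nan, 7.0, 8.0])
try:
    check_coverage(values)
except ValueError as exc:
    print(exc)  # rejected before any model training happens
```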
```python
# From baseline.py:29-35
class SignalProfile(BaseModel):
    """Statistical profile for a single signal or feature."""
    mean: float = Field(..., description="Mean value from healthy data")
    std: float = Field(..., ge=0, description="Standard deviation")
    min: float = Field(..., description="Minimum observed value (descriptive)")
    max: float = Field(..., description="Maximum observed value (descriptive)")
    sample_count: int = Field(..., ge=0, description="Number of valid samples used")
```
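To show how such a profile is derived from healthy samples, here is a sketch using a plain dataclass as a stand-in for the pydantic model (the NaN handling and helper name are assumptions, not from the source):

```python
import numpy as np
from dataclasses import dataclass

# Dataclass stand-in for the pydantic SignalProfile above.
@dataclass
class SignalProfileSketch:
    mean: float
    std: float
    min: float
    max: float
    sample_count: int

def build_profile(values: np.ndarray) -> SignalProfileSketch:
    # Drop NaNs before computing statistics (the real builder's exact
    # NaN handling is an assumption here).
    valid = values[~np.isnan(values)]
    return SignalProfileSketch(
        mean=float(valid.mean()),
        std=float(valid.std()),
        min=float(valid.min()),
        max=float(valid.max()),
        sample_count=int(valid.size),
    )

profile = build_profile(np.array([1.0, 2.0, np.nan, 3.0]))
print(profile.sample_count)  # 3 valid samples; the NaN is excluded
```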
```python
# From detector.py:239-245
training_decisions = self._model.decision_function(features_scaled)
# Decision function: higher = more normal
# We want the 99th percentile of healthy data as our threshold
# Invert the sign because we'll invert later for anomaly scores
self._threshold_score = float(np.percentile(-training_decisions, 99))
```
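A toy illustration of the threshold rule, using simulated decision-function values in place of a trained model (the distribution parameters are arbitrary):

```python
import numpy as np

# Simulated decision-function values for healthy training data
# (higher decision = more normal).
rng = np.random.default_rng(0)
training_decisions = rng.normal(loc=0.1, scale=0.05, size=1000)

# By construction, ~99% of healthy samples fall at or below this
# threshold after sign inversion; only the most unusual 1% exceed it.
threshold_score = float(np.percentile(-training_decisions, 99))
```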
Why 99th percentile?
Even healthy data has outliers (1% are “unusual but not faulty”)
This threshold marks the boundary: scores above it are treated as genuine anomalies rather than healthy-data outliers
Used in calibrated scoring formula (see _calibrated_score in detector.py:362-397)
Calibrated Scoring Formula
```python
# From detector.py:362-397
def _calibrated_score(self, decision_value: float) -> float:
    # Invert decision value (higher decision = more normal, so negate)
    raw_score = -decision_value

    # Calibrate against training threshold
    # threshold_score is the 99th percentile of healthy (-decision) values
    calibration_factor = self._threshold_score * 1.5
    if calibration_factor > 0:
        calibrated = raw_score / calibration_factor
    else:
        # Fallback if threshold is 0 (shouldn't happen)
        calibrated = raw_score + 0.5

    # Clip to [0, 1]
    return float(np.clip(calibrated, 0.0, 1.0))
```
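A worked example with a standalone version of the formula (the threshold value 0.10 is hypothetical) shows how scores map into [0, 1]:

```python
import numpy as np

def calibrated_score(decision_value: float, threshold_score: float) -> float:
    # Standalone copy of the _calibrated_score logic for illustration.
    raw_score = -decision_value
    calibration_factor = threshold_score * 1.5
    if calibration_factor > 0:
        calibrated = raw_score / calibration_factor
    else:
        calibrated = raw_score + 0.5
    return float(np.clip(calibrated, 0.0, 1.0))

# A sample exactly at the 99th-percentile threshold scores about 0.67,
# leaving headroom before the score saturates at 1.0.
print(calibrated_score(-0.10, 0.10))  # 0.10 / 0.15 ~= 0.667

# A clearly healthy sample (positive decision value) clips to 0.0.
print(calibrated_score(0.05, 0.10))  # 0.0
```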
```python
# From batch_detector.py:148-153
self._healthy_means = {
    col: float(feature_matrix[col].mean()) for col in BATCH_FEATURE_NAMES
}
self._healthy_stds = {
    col: float(feature_matrix[col].std()) for col in BATCH_FEATURE_NAMES
}
```
These are used later for z-score explainability:
```python
# From batch_detector.py:257
zscore = (val - h_mean) / h_std
```
Example output:
"vibration_g_std: 0.17 vs healthy 0.02 (30.0σ above normal)"
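A sketch of how that string could be produced from the z-score line above; the formatting helper and the healthy std of 0.005 are illustrative, chosen to reproduce the example output:

```python
def explain(feature: str, val: float, h_mean: float, h_std: float) -> str:
    # Same z-score as batch_detector.py:257; formatting is assumed.
    zscore = (val - h_mean) / h_std
    direction = "above" if zscore >= 0 else "below"
    return (
        f"{feature}: {val:.2f} vs healthy {h_mean:.2f} "
        f"({abs(zscore):.1f}\u03c3 {direction} normal)"
    )

print(explain("vibration_g_std", 0.17, 0.02, 0.005))
```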