OutlierDetector

Z-score based anomaly detector that identifies outliers using standardized distances from the mean.

Constructor

OutlierDetector(random_state: int = 42)
random_state
int
default:"42"
Random seed for reproducibility (currently unused but provided for consistency)

Attributes

center
np.ndarray | None
Mean values for each feature, computed during fit
scale
np.ndarray | None
Standard deviations for each feature (with 1e-6 added for numerical stability), computed during fit

Methods

fit

Fit the detector by computing feature means and standard deviations.
fit(X: pd.DataFrame) -> OutlierDetector
X
pd.DataFrame
required
Training data with samples as rows and features as columns
return
OutlierDetector
Returns self for method chaining
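The fitting step can be sketched in a few lines. This is an illustrative reconstruction of the behavior described above, not the library's actual implementation; in particular, whether the standard deviation uses ddof=0 or ddof=1 is not documented, so ddof=0 is an assumption here.

```python
import numpy as np
import pandas as pd

# Illustrative sketch of fit(): store per-feature means and standard
# deviations, adding 1e-6 to the scale for numerical stability
# (ddof=0 is an assumption; the docs do not specify).
X = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 10.0, 10.0]})

center = X.mean().to_numpy()              # per-feature mean
scale = X.std(ddof=0).to_numpy() + 1e-6   # per-feature std, stabilized
```

A constant feature (column "b") would otherwise have zero scale; the 1e-6 offset keeps the later z-score division well defined.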

score_samples

Compute anomaly scores for samples.
score_samples(X: pd.DataFrame) -> pd.Series
X
pd.DataFrame
required
Data to score. Must have same features as training data.
return
pd.Series
Series of anomaly scores (mean absolute z-score across features). Higher scores indicate more anomalous samples.
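The score described above (mean absolute z-score across features) can be sketched as follows. The center/scale computation mirrors the fit sketch and is an assumption, as is ddof=0.

```python
import numpy as np
import pandas as pd

# Illustrative sketch of score_samples(): mean absolute z-score
# across features, using the fitted center and scale.
X_train = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})
center = X_train.mean().to_numpy()
scale = X_train.std(ddof=0).to_numpy() + 1e-6

X_new = pd.DataFrame({"a": [2.0, 100.0], "b": [5.0, 5.0]})
z = (X_new.to_numpy() - center) / scale          # standardized distances
scores = pd.Series(np.abs(z).mean(axis=1), index=X_new.index)
```

The second row sits far from the training mean on feature "a", so its score is much higher than the first row's.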

detect

Detect anomalies based on quantile threshold.
detect(X: pd.DataFrame, threshold_quantile: float = 0.9) -> pd.DataFrame
X
pd.DataFrame
required
Data to analyze for anomalies
threshold_quantile
float
default:"0.9"
Quantile threshold for anomaly detection. Scores at or above this quantile are flagged as anomalies.
return
pd.DataFrame
DataFrame with columns:
  • anomaly_score: Numeric anomaly scores (float)
  • is_anomaly: Boolean flag, True when the score is at or above the quantile threshold (bool)
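A minimal sketch of the quantile-based flagging described above, assuming the threshold is the empirical quantile of the scores themselves (pandas' default linear interpolation):

```python
import pandas as pd

# Illustrative sketch of detect(): flag scores at or above the
# threshold_quantile of the score distribution.
scores = pd.Series([0.1, 0.2, 0.3, 5.0])
threshold = scores.quantile(0.9)

result = pd.DataFrame({
    "anomaly_score": scores,
    "is_anomaly": scores >= threshold,
})
```

With threshold_quantile=0.9 on four samples, only the clear outlier (5.0) lands at or above the threshold.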

simulate_early_warning

Simulate an early warning system by counting alerts and measuring latency.
simulate_early_warning(
    scores: pd.Series,
    timestamps: pd.DatetimeIndex,
    threshold: float
) -> dict[str, float]
scores
pd.Series
required
Time-ordered anomaly scores
timestamps
pd.DatetimeIndex
required
Timestamps corresponding to each score (must be same length as scores)
threshold
float
required
Score threshold for triggering alerts
return
dict[str, float]
Dictionary containing:
  • alert_count: Total number of alerts triggered (float)
  • first_alert_latency_s: Seconds from start to first alert (float, inf if no alerts)
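The alert simulation can be sketched as below. The alert rule (score >= threshold) is an assumption; the docs say only that the threshold triggers alerts.

```python
import pandas as pd

# Illustrative sketch of simulate_early_warning(): count threshold
# crossings and measure seconds from the start to the first alert.
scores = pd.Series([0.1, 0.2, 0.9, 0.95])
timestamps = pd.date_range("2024-01-01", periods=4, freq="min")
threshold = 0.8

alerts = scores.to_numpy() >= threshold
alert_count = float(alerts.sum())
if alerts.any():
    first_idx = int(alerts.argmax())  # index of first True
    first_alert_latency_s = (timestamps[first_idx] - timestamps[0]).total_seconds()
else:
    first_alert_latency_s = float("inf")  # no alerts fired

result = {"alert_count": alert_count,
          "first_alert_latency_s": first_alert_latency_s}
```

Here two scores cross the threshold, and the first crossing occurs two minutes (120 s) after the start of the series.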

evaluate_detection_latency

Evaluate detection latency by measuring time from ground truth event to first alert.
evaluate_detection_latency(
    scores: pd.Series,
    ground_truth_events: pd.Series,
    timestamps: pd.DatetimeIndex
) -> float
scores
pd.Series
required
Anomaly scores for each time point
ground_truth_events
pd.Series
required
Binary series (0 or 1) indicating when true events occurred
timestamps
pd.DatetimeIndex
required
Timestamps for each observation (must be same length as scores and events)
return
float
Latency in seconds from first ground truth event to first alert. Returns:
  • Positive float: Seconds between first event and first subsequent alert
  • nan: No ground truth events found
  • inf: No alerts triggered after the first event
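The latency evaluation can be sketched as follows, covering the three documented return cases. The alert rule (score >= threshold) and the threshold value are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative sketch of evaluate_detection_latency(): seconds from
# the first ground-truth event to the first alert at or after it.
scores = pd.Series([0.1, 0.2, 0.3, 0.9])
events = pd.Series([0, 1, 0, 0])
timestamps = pd.date_range("2024-01-01", periods=4, freq="min")
threshold = 0.8

event_idx = np.flatnonzero(events.to_numpy() == 1)
if event_idx.size == 0:
    latency = float("nan")  # no ground truth events
else:
    first_event = int(event_idx[0])
    after = np.flatnonzero(scores.to_numpy()[first_event:] >= threshold)
    if after.size == 0:
        latency = float("inf")  # no alert after the first event
    else:
        first_alert = first_event + int(after[0])
        latency = (timestamps[first_alert] - timestamps[first_event]).total_seconds()
```

The event occurs at the second timestamp and the first subsequent alert at the fourth, two minutes (120 s) later.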