run_cpu_inference

def run_cpu_inference(
    model,
    X: pd.DataFrame
) -> dict[str, float]
Executes inference on CPU and returns performance metrics along with prediction statistics.
Parameters

  • model (sklearn-compatible model, required): Trained model with a predict_proba method. Must support CPU-based inference.
  • X (pd.DataFrame, required): Input features DataFrame for inference.
Returns

dict[str, float] — Dictionary containing inference metrics:
  • inference_latency_ms: Time taken for inference in milliseconds
  • output_mean_probability: Mean of predicted probabilities (for class 1)
  • output_std_probability: Standard deviation of predicted probabilities

Example

from deployment.cpu_inference import run_cpu_inference
import pandas as pd

X_test = pd.DataFrame(...)  # Patient features
metrics = run_cpu_inference(model=trained_model, X=X_test)

print(metrics)
# Output:
# {
#     'inference_latency_ms': 45.23,
#     'output_mean_probability': 0.387,
#     'output_std_probability': 0.215
# }

Use Cases

  • Performance Benchmarking: Measure inference latency for deployment planning
  • Model Monitoring: Track prediction distribution through mean and standard deviation
  • CPU Deployment: Optimize and validate CPU-based inference performance
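
Implementation sketch

A minimal sketch of what run_cpu_inference could look like, based on the behavior documented above (timing predict_proba and summarizing class-1 probabilities). This is illustrative, not the actual source; DummyModel is a hypothetical stand-in for any sklearn-compatible model.

```python
import time

import numpy as np
import pandas as pd


def run_cpu_inference(model, X: pd.DataFrame) -> dict[str, float]:
    """Time a CPU predict_proba call and summarize class-1 probabilities."""
    start = time.perf_counter()
    proba = model.predict_proba(X)[:, 1]  # probability of class 1
    latency_ms = (time.perf_counter() - start) * 1000.0
    return {
        "inference_latency_ms": latency_ms,
        "output_mean_probability": float(np.mean(proba)),
        "output_std_probability": float(np.std(proba)),
    }


class DummyModel:
    """Hypothetical sklearn-like model exposing predict_proba."""

    def predict_proba(self, X: pd.DataFrame) -> np.ndarray:
        p = np.clip(X["age"].to_numpy() / 100.0, 0.0, 1.0)
        return np.column_stack([1.0 - p, p])  # columns: class 0, class 1


X_test = pd.DataFrame({"age": [25, 40, 60, 35]})
metrics = run_cpu_inference(model=DummyModel(), X=X_test)
```

Using time.perf_counter keeps the latency measurement monotonic and high-resolution; the real implementation may differ in how it times the call or aggregates probabilities.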