Overview
The metrics module provides comprehensive performance measurement capabilities for edge AI models, including accuracy evaluation, latency benchmarking, memory profiling, and energy estimation.

Data Classes
PerfMetrics
Dataclass that encapsulates all performance metrics for a model configuration.

Fields:
- Model accuracy on the evaluation dataset (range: 0.0 to 1.0)
- Mean inference latency in milliseconds
- Standard deviation of latency measurements in milliseconds
- 95th percentile latency in milliseconds
- Throughput in samples per second
- Model memory footprint in megabytes
- Estimated energy consumption in joules (latency × power)
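The fields above map naturally onto a Python dataclass. A minimal sketch follows; the field names are assumptions for illustration, since the source lists only the descriptions:

```python
from dataclasses import dataclass

@dataclass
class PerfMetrics:
    """Performance metrics for one model configuration (field names assumed)."""
    accuracy: float          # 0.0 to 1.0 on the evaluation dataset
    latency_mean_ms: float   # mean inference latency
    latency_std_ms: float    # standard deviation of latency measurements
    latency_p95_ms: float    # 95th percentile latency
    throughput_sps: float    # samples per second
    memory_mb: float         # model memory footprint
    energy_j: float          # estimated energy = latency (s) * power (W)
```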
Core Functions
collect_metrics
Collects comprehensive performance metrics for a PyTorch model.

Parameters:
- PyTorch model to benchmark
- DataLoader for the evaluation dataset
- Device to run inference on (CPU or CUDA)
- Power consumption of the target hardware in watts (used for energy estimation)
- Numeric precision for inference; supported values: "fp32", "fp16"
- Multiplier to adjust latency measurements for hardware differences
- Number of times to repeat latency measurements for statistical significance

Returns: Complete performance metrics including accuracy, latency statistics, throughput, memory, and energy.
The function automatically handles precision conversion for FP16 mode and includes warmup runs before latency measurement.
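The overall flow (accuracy pass, FP16 conversion, warmup, repeated timing, energy as latency × power) can be sketched as follows. The signature, default values, and returned dictionary keys are assumptions, not the module's actual API:

```python
import time
import torch

def collect_metrics(model, dataloader, device="cpu", power_w=5.0,
                    precision="fp32", latency_scale=1.0, repeats=3):
    """Sketch of the metric-collection flow described above (signature assumed)."""
    model = model.to(device).eval()
    if precision == "fp16":
        model = model.half()

    # Accuracy pass over the evaluation dataset.
    correct = total = 0
    sample = None
    with torch.no_grad():
        for inputs, targets in dataloader:
            inputs = inputs.to(device)
            if precision == "fp16":
                inputs = inputs.half()
            preds = model(inputs).argmax(dim=1)
            correct += (preds.cpu() == targets).sum().item()
            total += targets.numel()
            if sample is None:
                sample = inputs[:1]  # keep one sample for latency timing

    # Warmup runs, then repeated timing of a single-sample forward pass.
    with torch.no_grad():
        for _ in range(5):
            model(sample)
        times = []
        for _ in range(repeats):
            start = time.perf_counter()
            model(sample)
            times.append((time.perf_counter() - start) * 1000.0 * latency_scale)

    mean_ms = sum(times) / len(times)
    return {
        "accuracy": correct / total,
        "latency_ms": mean_ms,
        "throughput_sps": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
        "energy_j": (mean_ms / 1000.0) * power_w,  # latency (s) * power (W)
    }
```

On CUDA devices, `torch.cuda.synchronize()` would be needed around the timed region for accurate measurements; it is omitted here to keep the sketch CPU-runnable.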
memory_violations
Checks if the model memory footprint violates specified budget constraints.

Parameters:
- Model memory footprint in megabytes
- List of memory budget thresholds to check against

Returns: Dictionary with keys in the format "violates_{budget}mb" and boolean values indicating violations.

Utility Functions
evaluate_accuracy
Evaluates model accuracy on a dataset.

Parameters:
- Model to evaluate
- DataLoader containing evaluation data
- Device for inference
- Numeric precision: "fp32" or "fp16"

Returns: Accuracy as a float between 0.0 and 1.0.
measure_latency
Measures average inference latency for a single input.

Parameters:
- Model to benchmark
- Sample input tensor for inference
- Number of inference runs to average
- Number of warmup runs before measurement

Returns: Average latency in milliseconds.
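The warmup-then-average pattern described above could be sketched like this (signature and defaults assumed):

```python
import time
import torch

def measure_latency(model, sample_input, num_runs=100, warmup=10):
    """Sketch: average forward-pass latency in ms (signature assumed)."""
    model.eval()
    with torch.no_grad():
        for _ in range(warmup):           # warmup: stabilize caches/allocator
            model(sample_input)
        start = time.perf_counter()
        for _ in range(num_runs):
            model(sample_input)
        elapsed = time.perf_counter() - start
    return elapsed / num_runs * 1000.0    # milliseconds per inference
```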
measure_latency_distribution
Measures latency statistics including mean, standard deviation, and 95th percentile.

Parameters:
- Model to benchmark
- Sample input tensor
- Number of measurement repetitions for statistical analysis
- Number of runs per repetition
- Warmup runs before each measurement

Returns: Tuple of (mean_latency_ms, std_latency_ms, p95_latency_ms).
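One way to realize repeated timing with per-repetition warmup and the (mean, std, p95) tuple is sketched below; the signature and the simple sorted-index p95 are assumptions:

```python
import statistics
import time
import torch

def measure_latency_distribution(model, sample_input, repeats=10,
                                 runs_per_repeat=20, warmup=5):
    """Sketch: time several repetitions, summarize as (mean, std, p95) in ms."""
    model.eval()
    means = []
    with torch.no_grad():
        for _ in range(repeats):
            for _ in range(warmup):       # warmup before each repetition
                model(sample_input)
            start = time.perf_counter()
            for _ in range(runs_per_repeat):
                model(sample_input)
            means.append((time.perf_counter() - start) / runs_per_repeat * 1000.0)
    means.sort()
    mean_ms = statistics.mean(means)
    std_ms = statistics.stdev(means) if len(means) > 1 else 0.0
    p95_ms = means[min(len(means) - 1, int(0.95 * len(means)))]
    return mean_ms, std_ms, p95_ms
```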
model_memory_mb
Calculates the model memory footprint from its state dict.

Parameters:
- PyTorch model to analyze

Returns: Memory footprint in megabytes.
Memory calculation includes all model parameters (weights and biases) based on their actual tensor sizes and data types.
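A minimal sketch of this calculation, paired with the memory_violations budget check described earlier (both signatures assumed; the key format comes from the docs above):

```python
import torch

def model_memory_mb(model):
    """Sum byte sizes of all state_dict tensors (numel * element size)."""
    total_bytes = sum(t.numel() * t.element_size()
                      for t in model.state_dict().values())
    return total_bytes / (1024 ** 2)

def memory_violations(memory_mb, budgets_mb):
    """Flag each budget the footprint exceeds, e.g. {"violates_8mb": False}."""
    return {f"violates_{b}mb": memory_mb > b for b in budgets_mb}
```

For example, an FP32 `nn.Linear(1024, 1024)` holds 1024×1024 weights plus 1024 biases at 4 bytes each, roughly 4.004 MB, so it violates a 1 MB budget but not an 8 MB one.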