The benchmarking system collects a wide range of metrics across different aspects of the vital signs monitoring pipeline. This reference explains each metric and how to interpret the results.

ROI Performance Metrics

These metrics measure the face detection component’s speed and efficiency.

Detection Time Metrics

avg_detection_time_ms
float
Average time to detect a face in a single frame, measured in milliseconds.
Typical values:
  • MediaPipe: 5-10 ms
  • Haar Cascade: 2-5 ms
  • MTCNN: 15-30 ms
  • YOLOv11: 8-15 ms
Lower is better.
std_detection_time_ms
float
Standard deviation of detection times. Measures consistency. Lower values indicate more predictable performance.
median_detection_time_ms
float
Median detection time. More robust to outliers than the average.
max_detection_time_ms
float
Worst-case detection time. Important for real-time requirements.
min_detection_time_ms
float
Best-case detection time.

Throughput Metrics

detection_fps
float
Frames per second that can be processed by the face detector.
Calculated as: 1000 / avg_detection_time_ms
Typical values:
  • Desktop: 100-400 FPS
  • Raspberry Pi 4: 20-80 FPS
Higher is better.
total_frames
int
Total number of frames processed during the benchmark.
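
As a rough illustration of how the detection time and throughput fields relate, the sketch below derives them from a list of per-frame timings. The function name summarize_detection_times and the use of the population standard deviation are assumptions made for this example; the benchmark's own implementation may differ.

```python
import statistics


def summarize_detection_times(times_ms):
    """Summarize per-frame face detection times given in milliseconds."""
    if not times_ms:
        raise ValueError("times_ms must contain at least one sample")
    avg = statistics.fmean(times_ms)
    return {
        "total_frames": len(times_ms),
        "avg_detection_time_ms": avg,
        # Assumption: population standard deviation (ddof=0).
        "std_detection_time_ms": statistics.pstdev(times_ms),
        "median_detection_time_ms": statistics.median(times_ms),
        "max_detection_time_ms": max(times_ms),
        "min_detection_time_ms": min(times_ms),
        # Same formula as documented: 1000 / avg_detection_time_ms.
        "detection_fps": 1000.0 / avg,
    }


# Four ordinary frames and one slow outlier.
summary = summarize_detection_times([4.0, 5.0, 5.0, 6.0, 20.0])
print(summary["avg_detection_time_ms"], summary["median_detection_time_ms"])
```

In this sample the single 20 ms frame pulls the average up to 8 ms while the median stays at 5 ms, which is why the median is the more robust of the two.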

EVM Performance Metrics

These metrics measure the Eulerian Video Magnification processing efficiency.

Chunk Processing Metrics

total_chunks
int
Number of frame chunks processed. Each chunk contains BUFFER_SIZE frames (typically 200).
avg_chunk_time_s
float
Average time to process one chunk of frames, in seconds.
Typical values:
  • Desktop: 0.5-1.5 seconds
  • Raspberry Pi 4: 2-5 seconds
This is the time for the EVM algorithm to process all frames in the buffer.
std_chunk_time_s
float
Standard deviation of chunk processing times.
median_chunk_time_s
float
Median chunk processing time.

Per-Frame EVM Metrics

avg_time_per_frame_ms
float
Average EVM processing time per frame within a chunk.
Calculated as: (avg_chunk_time_s * 1000) / buffer_size
Typical values:
  • Desktop: 2-7 ms per frame
  • Raspberry Pi 4: 10-25 ms per frame
evm_fps
float
Equivalent FPS if processing frames sequentially through EVM.
Calculated as: 1000 / avg_time_per_frame_ms
Note: This is a theoretical figure, since EVM processes chunks, not individual frames.
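
The two per-frame formulas can be checked with a few lines of arithmetic. This is a minimal sketch; the helper name evm_per_frame_metrics is hypothetical, and the inputs are the avg_chunk_time_s and buffer_size values from the example results later on this page.

```python
def evm_per_frame_metrics(avg_chunk_time_s, buffer_size):
    """Return (avg_time_per_frame_ms, evm_fps) for one benchmark run."""
    # (avg_chunk_time_s * 1000) / buffer_size
    avg_time_per_frame_ms = (avg_chunk_time_s * 1000.0) / buffer_size
    # 1000 / avg_time_per_frame_ms
    evm_fps = 1000.0 / avg_time_per_frame_ms
    return avg_time_per_frame_ms, evm_fps


per_frame_ms, fps = evm_per_frame_metrics(avg_chunk_time_s=0.787, buffer_size=200)
print(f"{per_frame_ms:.3f} ms per frame, {fps:.1f} FPS")
```

With a 0.787 s chunk of 200 frames this gives roughly 3.9 ms per frame and about 254 FPS.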

End-to-End Performance Metrics

These metrics measure the complete pipeline from video input to heart rate output.
end_to_end_fps
float
Overall system throughput in frames per second. Includes both ROI detection and EVM processing time.
Typical values:
  • Desktop (MediaPipe): 100-150 FPS
  • Desktop (Haar): 150-250 FPS
  • Raspberry Pi 4: 15-40 FPS
detection_time_percentage
float
Percentage of total time spent on ROI detection.
Typical values: 40-60%
Higher percentages indicate that ROI detection is the bottleneck.
evm_time_percentage
float
Percentage of total time spent on EVM processing.
Typical values: 40-60%
Complement of detection_time_percentage (the two values sum to 100%).

Example Time Breakdown

{
  "end_to_end_performance": {
    "avg_chunk_time_s": 1.777,
    "end_to_end_fps": 123.7,
    "detection_time_percentage": 58.9,
    "evm_time_percentage": 41.1
  }
}
Interpretation: ROI detection takes ~59% of processing time, making it the primary bottleneck.
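
The page does not spell out how the two percentages are computed. One plausible reading, sketched below, treats them as shares of the combined detection and EVM time for a chunk. The function name time_breakdown and its inputs are assumptions for illustration only.

```python
def time_breakdown(detection_time_s, evm_time_s):
    """Split total processing time into detection and EVM percentages.

    Assumption: total time is simply detection time plus EVM time.
    """
    total = detection_time_s + evm_time_s
    if total <= 0:
        raise ValueError("total processing time must be positive")
    detection_pct = 100.0 * detection_time_s / total
    return {
        "detection_time_percentage": detection_pct,
        # The EVM share is the complement of the detection share.
        "evm_time_percentage": 100.0 - detection_pct,
    }


breakdown = time_breakdown(detection_time_s=1.5, evm_time_s=0.5)
print(breakdown)
```

Under this reading a chunk that spends 1.5 s in detection and 0.5 s in EVM reports a 75% / 25% split.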

ROI Quality Metrics

These metrics assess the reliability and stability of face detection.

Detection Reliability

detection_rate_percent
float
Percentage of frames where a face was successfully detected.
Target: >95% for reliable operation
Lower values indicate the detector struggles with the video conditions.
successful_detections
int
Number of frames with successful face detection.
failed_detections
int
Number of frames where face detection failed.

Stability Metrics

jitter_avg_pixels
float
Average spatial movement of ROI center between consecutive frames. Measured in pixels. Lower is better.
Typical values:
  • Stable detectors: 0-2 pixels
  • Jittery detectors: 5+ pixels
High jitter can affect EVM quality.
roi_width_std
float
Standard deviation of ROI width across frames. Lower values indicate more consistent sizing.
roi_height_std
float
Standard deviation of ROI height across frames.
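
To make the stability metrics concrete, here is a minimal sketch that assumes each ROI is an (x, y, w, h) box in pixels and that jitter is the mean Euclidean distance between consecutive ROI centers. Both the box format and the function name roi_stability are assumptions; the benchmark may define jitter differently.

```python
import math
import statistics


def roi_stability(rois):
    """Compute jitter and size spread for a list of (x, y, w, h) boxes."""
    if not rois:
        raise ValueError("rois must contain at least one box")
    centers = [(x + w / 2.0, y + h / 2.0) for x, y, w, h in rois]
    # Distance moved by the ROI center between each pair of consecutive frames.
    steps = [math.dist(a, b) for a, b in zip(centers, centers[1:])]
    return {
        "jitter_avg_pixels": statistics.fmean(steps) if steps else 0.0,
        # Assumption: population standard deviation.
        "roi_width_std": statistics.pstdev([w for _, _, w, _ in rois]),
        "roi_height_std": statistics.pstdev([h for _, _, _, h in rois]),
    }


# The box shifts by (3, 4) pixels once, then stays still.
stability = roi_stability([(100, 100, 50, 50), (103, 104, 50, 50), (103, 104, 50, 50)])
print(stability)
```

Here the center moves 5 pixels in the first step and 0 in the second, so the average jitter is 2.5 pixels and both size deviations are zero.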

Failure Pattern Metrics

max_consecutive_failures
int
Maximum number of consecutive frames where detection failed.
Concern threshold: >10 frames
Long failure sequences interrupt measurements.
avg_consecutive_failures
float
Average length of failure sequences.
num_failure_episodes
int
Number of separate failure sequences. Many short failures and a few long failures point to different issues.
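
The failure pattern fields amount to run-length statistics over the per-frame detection outcome. The sketch below assumes the outcome is available as a sequence of booleans; the function name failure_patterns is hypothetical.

```python
from itertools import groupby


def failure_patterns(detected):
    """Summarize runs of failed detections.

    detected: iterable of booleans, True where a face was found.
    """
    # Length of every maximal run of consecutive False values.
    runs = [sum(1 for _ in group) for ok, group in groupby(detected) if not ok]
    return {
        "num_failure_episodes": len(runs),
        "max_consecutive_failures": max(runs, default=0),
        "avg_consecutive_failures": sum(runs) / len(runs) if runs else 0.0,
    }


flags = [True, False, False, True, True, False, True, False, False, False]
patterns = failure_patterns(flags)
print(patterns)
```

This sequence contains three failure episodes of lengths 2, 1 and 3, so the longest run is 3 frames and the average run is 2 frames.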

Heart Rate Accuracy Metrics

These metrics compare predicted heart rates against ground truth from the pulse oximeter.

Error Metrics

mae_bpm
float
Mean Absolute Error in beats per minute. Average of |predicted_hr - true_hr| across all measurements.
Performance levels:
  • Excellent: Less than 5 BPM
  • Good: 5-10 BPM
  • Acceptable: 10-15 BPM
  • Poor: Greater than 15 BPM
me_bpm
float
Mean Error (signed) in BPM.
Shows systematic bias:
  • Positive: Tends to over-predict
  • Negative: Tends to under-predict
  • Near zero: Unbiased
rmse_bpm
float
Root Mean Square Error in BPM.
Calculated as: sqrt(mean(errors^2))
Penalizes large errors more than MAE.
std_error_bpm
float
Standard deviation of errors. Measures prediction consistency.
max_error_bpm
float
Maximum absolute error encountered.
min_error_bpm
float
Minimum absolute error encountered.
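
The error metrics above can be reproduced from paired predictions and ground truth values. This is a sketch under two assumptions: the signed error is predicted_hr - true_hr (so a positive mean error means over-prediction, as described for me_bpm), and std_error_bpm is the population standard deviation of the signed errors. The function name error_metrics is hypothetical.

```python
import math
import statistics


def error_metrics(predicted_hr, true_hr):
    """Compute heart rate error metrics in BPM from paired measurements."""
    if not predicted_hr or len(predicted_hr) != len(true_hr):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - t for p, t in zip(predicted_hr, true_hr)]
    abs_errors = [abs(e) for e in errors]
    return {
        "mae_bpm": statistics.fmean(abs_errors),
        "me_bpm": statistics.fmean(errors),
        # sqrt(mean(errors^2))
        "rmse_bpm": math.sqrt(statistics.fmean([e * e for e in errors])),
        "std_error_bpm": statistics.pstdev(errors),
        "max_error_bpm": max(abs_errors),
        "min_error_bpm": min(abs_errors),
    }


metrics = error_metrics([70.0, 80.0, 90.0], [72.0, 78.0, 96.0])
print(metrics)
```

The errors here are -2, +2 and -6 BPM. The mean error of -2 BPM shows a slight tendency to under-predict, and the RMSE (about 3.83) exceeds the MAE (about 3.33) because the single 6 BPM miss is weighted more heavily.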

Accuracy Within Thresholds

within_5bpm_percent
float
Percentage of predictions within 5 BPM of ground truth.
Target: >70% for clinical applications
within_10bpm_percent
float
Percentage of predictions within 10 BPM of ground truth.
Target: >85%
within_15bpm_percent
float
Percentage of predictions within 15 BPM of ground truth.
Target: >95%
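
A threshold metric is the share of measurements whose absolute error falls inside the band. The sketch below assumes the boundary is inclusive (an error of exactly 5 BPM counts as within 5 BPM); the page does not say which convention the benchmark uses, and the function name within_threshold_percent is hypothetical.

```python
def within_threshold_percent(predicted_hr, true_hr, threshold_bpm):
    """Percentage of predictions whose absolute error is <= threshold_bpm."""
    pairs = list(zip(predicted_hr, true_hr))
    if not pairs:
        raise ValueError("at least one measurement is required")
    hits = sum(1 for p, t in pairs if abs(p - t) <= threshold_bpm)
    return 100.0 * hits / len(pairs)


predicted = [70.0, 80.0, 90.0, 100.0]
truth = [72.0, 78.0, 96.0, 112.0]  # absolute errors: 2, 2, 6, 12
for threshold in (5, 10, 15):
    print(threshold, within_threshold_percent(predicted, truth, threshold))
```

The three values are cumulative by construction, so within_5bpm_percent can never exceed within_10bpm_percent, which in turn can never exceed within_15bpm_percent.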

Statistical Metrics

correlation
float
Pearson correlation coefficient between predictions and ground truth.
Range: -1 to 1
  • 1.0: Perfect positive correlation
  • 0.0: No correlation
  • -1.0: Perfect negative correlation
Target: >0.8
hr_pred_min
float
Minimum predicted heart rate value.
hr_pred_max
float
Maximum predicted heart rate value.
hr_true_min
float
Minimum ground truth heart rate value.
hr_true_max
float
Maximum ground truth heart rate value.
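
For reference, the correlation field is the standard Pearson coefficient, which can be computed by hand as below. The function name pearson_correlation is hypothetical, and returning NaN for a constant series is a choice made for this sketch, not documented behavior of the benchmark.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined when either series is constant.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


# A constant +2 BPM offset still correlates perfectly with ground truth.
r = pearson_correlation([62.0, 72.0, 82.0], [60.0, 70.0, 80.0])
print(r)
```

Note that a constant bias leaves the correlation at 1.0, so correlation should be read together with me_bpm and mae_bpm rather than on its own.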

System Resource Metrics

These metrics monitor hardware utilization, critical for embedded deployment on Raspberry Pi 4.

CPU Metrics

cpu_usage_avg_percent
float
Average CPU utilization during processing.
Raspberry Pi 4: 40-80% typical
cpu_usage_max_percent
float
Peak CPU utilization.
cpu_freq_avg_mhz
float
Average CPU frequency in MHz.
Raspberry Pi 4: 1500 MHz is normal; less than 1500 MHz indicates throttling.
cpu_freq_min_mhz
float
Minimum CPU frequency observed. A value significantly below the maximum indicates that throttling occurred.
potential_throttling
boolean
True if the variation in CPU frequency suggests throttling.
Triggered when cpu_freq_std_mhz > 100.
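
The throttling flag can be sketched from a series of CPU frequency samples. The 100 MHz threshold is the one stated above; how the samples are collected, the use of the population standard deviation, and the function name cpu_frequency_summary are assumptions for this example.

```python
import statistics

THROTTLE_STD_THRESHOLD_MHZ = 100.0  # threshold quoted for potential_throttling


def cpu_frequency_summary(freq_samples_mhz):
    """Summarize CPU frequency samples and flag likely throttling."""
    if not freq_samples_mhz:
        raise ValueError("freq_samples_mhz must contain at least one sample")
    std = statistics.pstdev(freq_samples_mhz)
    return {
        "cpu_freq_avg_mhz": statistics.fmean(freq_samples_mhz),
        "cpu_freq_min_mhz": min(freq_samples_mhz),
        "cpu_freq_max_mhz": max(freq_samples_mhz),
        "cpu_freq_std_mhz": std,
        "potential_throttling": std > THROTTLE_STD_THRESHOLD_MHZ,
    }


steady = cpu_frequency_summary([1500.0, 1500.0, 1500.0, 1500.0])
throttled = cpu_frequency_summary([1500.0, 1500.0, 1000.0, 1000.0])
print(steady["potential_throttling"], throttled["potential_throttling"])
```

A clock that stays at 1500 MHz has zero spread and is not flagged, while one that alternates between 1500 and 1000 MHz has a 250 MHz standard deviation and is.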

Memory Metrics

memory_usage_avg_percent
float
Average memory utilization.
memory_usage_max_percent
float
Peak memory utilization.
Watch for: >85% on Raspberry Pi 4

Temperature Metrics

temperature_avg_celsius
float
Average CPU temperature in Celsius.
Raspberry Pi 4:
  • Normal: Below 70°C
  • Warm: 70-75°C
  • Hot: 75-80°C
  • Throttling: Above 80°C
temperature_max_celsius
float
Peak temperature observed.
temperature_warning
boolean
True if temperature exceeded 75°C.
temperature_critical
boolean
True if temperature exceeded 80°C. At this point, Raspberry Pi 4 throttles CPU to protect hardware.
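
The two temperature flags follow directly from the peak reading. In this sketch the 75°C and 80°C limits come from the descriptions above, while the function name temperature_summary and the assumption that the flags are based on the maximum sample are illustrative.

```python
TEMPERATURE_WARNING_C = 75.0   # limit quoted for temperature_warning
TEMPERATURE_CRITICAL_C = 80.0  # limit quoted for temperature_critical


def temperature_summary(samples_celsius):
    """Summarize CPU temperature samples and derive the warning flags."""
    if not samples_celsius:
        raise ValueError("samples_celsius must contain at least one sample")
    peak = max(samples_celsius)
    return {
        "temperature_avg_celsius": sum(samples_celsius) / len(samples_celsius),
        "temperature_max_celsius": peak,
        "temperature_warning": peak > TEMPERATURE_WARNING_C,
        "temperature_critical": peak > TEMPERATURE_CRITICAL_C,
    }


temps = temperature_summary([60.0, 72.0, 78.0])
print(temps)
```

A run that peaks at 78°C raises the warning flag but not the critical one.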

Example Results Structure

Here’s a complete example of metrics from a single video:
{
  "subject_id": "subject1",
  "subject_num": 1,
  "video_path": "/dataset/subject1/vid.mp4",
  "video_fps": 30.0,
  "buffer_size": 200,
  "model_type": "mediapipe",
  
  "roi_performance": {
    "total_frames": 3240,
    "avg_detection_time_ms": 5.15,
    "std_detection_time_ms": 0.82,
    "median_detection_time_ms": 5.02,
    "max_detection_time_ms": 12.3,
    "min_detection_time_ms": 3.8,
    "detection_fps": 194.2
  },
  
  "evm_performance": {
    "total_chunks": 16,
    "avg_chunk_time_s": 0.787,
    "std_chunk_time_s": 0.053,
    "median_chunk_time_s": 0.781,
    "max_chunk_time_s": 0.892,
    "min_chunk_time_s": 0.723,
    "avg_time_per_frame_ms": 3.93,
    "evm_fps": 254.1
  },
  
  "end_to_end_performance": {
    "avg_chunk_time_s": 1.774,
    "end_to_end_fps": 112.6,
    "detection_time_percentage": 58.2,
    "evm_time_percentage": 41.8
  },
  
  "roi_quality": {
    "detection_rate_percent": 100.0,
    "successful_detections": 3240,
    "failed_detections": 0,
    "jitter_avg_pixels": 0.006,
    "roi_width_std": 0.47,
    "roi_height_std": 0.47,
    "max_consecutive_failures": 0,
    "avg_consecutive_failures": 0,
    "num_failure_episodes": 0
  },
  
  "hr_accuracy": {
    "num_measurements": 16,
    "mae_bpm": 8.42,
    "me_bpm": -3.21,
    "rmse_bpm": 10.15,
    "std_error_bpm": 6.73,
    "max_error_bpm": 18.5,
    "min_error_bpm": 1.2,
    "within_5bpm_percent": 56.25,
    "within_10bpm_percent": 75.0,
    "within_15bpm_percent": 93.75,
    "correlation": 0.847,
    "hr_pred_min": 62.3,
    "hr_pred_max": 89.7,
    "hr_true_min": 64.8,
    "hr_true_max": 92.1
  },
  
  "system_resources": {
    "cpu_usage_avg_percent": 1.42,
    "cpu_usage_max_percent": 15.8,
    "memory_usage_avg_percent": 59.3,
    "memory_usage_max_percent": 60.2,
    "cpu_freq_avg_mhz": 2496.0,
    "cpu_freq_min_mhz": 2496.0,
    "cpu_freq_max_mhz": 2496.0,
    "potential_throttling": false,
    "temperature_avg_celsius": 42.5,
    "temperature_max_celsius": 46.8,
    "temperature_warning": false,
    "temperature_critical": false
  }
}

Interpreting Results

Signs of a good run:
  • detection_rate_percent > 95%
  • mae_bpm < 10
  • within_10bpm_percent > 85%
  • correlation > 0.8
  • jitter_avg_pixels < 2.0
  • No temperature warnings

Performance issues:
  • detection_time_percentage > 70% - ROI detection is the bottleneck, try a faster detector
  • evm_time_percentage > 70% - EVM processing is the bottleneck, reduce the buffer size
  • potential_throttling = true - CPU is likely overheating, improve cooling

Accuracy issues:
  • High mae_bpm but low jitter - EVM parameters may need tuning
  • High mae_bpm and high jitter - ROI instability is affecting accuracy
  • Negative me_bpm - Systematic under-prediction, check the frequency range
  • Positive me_bpm - Systematic over-prediction, check noise filtering
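
The quality thresholds listed in this section (detection rate, MAE, within-10-BPM share, correlation, jitter, temperature warning) can be applied to a result dictionary automatically. The sketch below is one way to do that; the function name check_result is hypothetical, and the sample dictionary reuses a few values from the example results on this page.

```python
def check_result(result):
    """Return a list of messages for every quality threshold that is missed."""
    quality = result["roi_quality"]
    accuracy = result["hr_accuracy"]
    resources = result["system_resources"]
    checks = [
        (quality["detection_rate_percent"] > 95.0, "detection_rate_percent is not above 95%"),
        (accuracy["mae_bpm"] < 10.0, "mae_bpm is not below 10"),
        (accuracy["within_10bpm_percent"] > 85.0, "within_10bpm_percent is not above 85%"),
        (accuracy["correlation"] > 0.8, "correlation is not above 0.8"),
        (quality["jitter_avg_pixels"] < 2.0, "jitter_avg_pixels is not below 2.0"),
        (not resources["temperature_warning"], "temperature warning was raised"),
    ]
    return [message for passed, message in checks if not passed]


sample = {
    "roi_quality": {"detection_rate_percent": 100.0, "jitter_avg_pixels": 0.006},
    "hr_accuracy": {"mae_bpm": 8.42, "within_10bpm_percent": 75.0, "correlation": 0.847},
    "system_resources": {"temperature_warning": False},
}
issues = check_result(sample)
print(issues)
```

For these values only the within_10bpm_percent check fails (75.0% against a target above 85%), so the run looks healthy apart from accuracy at the 10 BPM level.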

Next Steps

Learn About the Dataset

Understand the UBFC-RPPG dataset structure and ground truth data
