The EVM Vital Signs Monitor includes a sophisticated benchmarking system designed to measure both performance and accuracy across different hardware platforms and face detection models.

What the Benchmarking System Measures

The benchmarking framework evaluates three critical aspects of the vital signs monitoring pipeline:

1. ROI Detection Performance

Measures the face detection component’s speed and reliability:
  • Detection time per frame
  • Detection success rate
  • ROI stability (jitter and size variance)
  • Consecutive failure tracking
  • Real-time FPS capability
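As a sketch of how the stability metrics above could be derived from per-frame ROI boxes (the helper below is hypothetical — the project's actual implementation may differ), jitter can be taken as the spread of the ROI centre and size variance as the spread of the ROI area:

```python
from statistics import pstdev

def roi_stability(rois):
    """Jitter = spread of ROI centre coordinates; size variance = spread of ROI area.

    `rois` is a list of (x, y, w, h) boxes, one per successfully detected frame.
    Hypothetical helper -- the real benchmark may compute stability differently.
    """
    centers_x = [x + w / 2 for x, y, w, h in rois]
    centers_y = [y + h / 2 for x, y, w, h in rois]
    areas = [w * h for x, y, w, h in rois]
    jitter = (pstdev(centers_x) + pstdev(centers_y)) / 2  # mean per-axis std-dev, in pixels
    return {"jitter_px": jitter, "size_std": pstdev(areas)}
```

A perfectly stable detector yields zero for both values; a detector that tracks the face but fluctuates in position shows non-zero jitter with low size variance.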

2. EVM Processing Performance

Evaluates the Eulerian Video Magnification processing efficiency:
  • Chunk processing time
  • Time per frame within chunks
  • Processing throughput (EVM FPS)
  • Buffer size impact on performance
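The throughput metrics above follow directly from measured chunk times. A minimal sketch (function name and return shape are assumptions, not the project's API):

```python
def evm_throughput(chunk_seconds, frames_per_chunk):
    """Per-frame time and effective EVM FPS from measured chunk processing times.

    `chunk_seconds` is a list of wall-clock times, one per processed chunk;
    `frames_per_chunk` is the buffer size used for each chunk.
    Hypothetical helper -- illustrative of the metric definitions only.
    """
    total_frames = frames_per_chunk * len(chunk_seconds)
    total_time = sum(chunk_seconds)
    return {
        "time_per_frame_s": total_time / total_frames,  # time per frame within chunks
        "evm_fps": total_frames / total_time,           # processing throughput
    }
```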

3. Heart Rate Accuracy

Compares predictions against ground truth from the UBFC-RPPG dataset:
  • Mean Absolute Error (MAE)
  • Root Mean Square Error (RMSE)
  • Correlation coefficients
  • Accuracy within thresholds (5, 10, 15 BPM)
  • Error distribution statistics
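These accuracy metrics have standard definitions. A self-contained sketch of how they could be computed over paired predictions and ground-truth values (the function is illustrative, not the project's code):

```python
import math

def hr_accuracy(preds, truths, thresholds=(5, 10, 15)):
    """MAE, RMSE, Pearson correlation, and within-threshold rates (in BPM).

    Hypothetical helper showing the metric definitions used by the benchmarks.
    """
    errors = [p - t for p, t in zip(preds, truths)]
    abs_errors = [abs(e) for e in errors]
    n = len(errors)
    mae = sum(abs_errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mp, mt = sum(preds) / n, sum(truths) / n
    cov = sum((p - mp) * (t - mt) for p, t in zip(preds, truths))
    var_p = sum((p - mp) ** 2 for p in preds)
    var_t = sum((t - mt) ** 2 for t in truths)
    corr = cov / math.sqrt(var_p * var_t) if var_p and var_t else float("nan")
    # Fraction of predictions within each BPM threshold of ground truth
    within = {th: sum(e <= th for e in abs_errors) / n for th in thresholds}
    return {"mae": mae, "rmse": rmse, "corr": corr, "within_bpm": within}
```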

Three Benchmark Types

The system provides three specialized benchmarking scripts, each focusing on different aspects of the pipeline:

ROI Benchmark

Tests the face detection component in isolation

EVM Benchmark

Tests EVM processing with pre-detected ROIs

Complete Benchmark

Tests the entire end-to-end pipeline

ROI-Only Benchmark (advance_run_ROI.py)

Focuses exclusively on face detection performance across different models:
  • Supports: Haar Cascade, MediaPipe, MTCNN, YOLOv11
  • Measures detection speed, stability, and reliability
  • No EVM processing overhead
  • Ideal for comparing detector models

EVM-Only Benchmark (advance_run_EVM.py)

Isolates EVM processing performance:
  • Assumes successful ROI detection
  • Measures pure EVM computation time
  • Evaluates heart rate prediction accuracy
  • Tests different buffer sizes
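Testing different buffer sizes can be pictured as a simple sweep: split the frame sequence into chunks of each candidate size and time the processing step for each. The sketch below passes the EVM step in as a callable because the real processing API in `advance_run_EVM.py` is not shown here:

```python
import time

def sweep_buffer_sizes(process_chunk, frames, buffer_sizes=(30, 60, 90)):
    """Time a chunk-processing callable over different buffer sizes.

    `process_chunk` is a stand-in for the real EVM step; `frames` is the full
    frame sequence. Returns mean seconds per chunk for each buffer size.
    Hypothetical sketch -- buffer sizes shown are examples, not the project's defaults.
    """
    results = {}
    for size in buffer_sizes:
        chunks = [frames[i:i + size] for i in range(0, len(frames) - size + 1, size)]
        start = time.perf_counter()
        for chunk in chunks:
            process_chunk(chunk)
        elapsed = time.perf_counter() - start
        results[size] = elapsed / max(len(chunks), 1)
    return results
```

Larger buffers amortize per-chunk overhead but delay the first heart-rate estimate, which is the trade-off this benchmark quantifies.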

Complete Benchmark (advance_run_complete.py)

Provides comprehensive end-to-end metrics:
  • Measures full pipeline from video input to HR output
  • Tracks both ROI detection and EVM processing
  • Calculates resource usage (CPU, memory, temperature)
  • Provides time breakdown percentages
  • Most realistic performance assessment
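The time-breakdown percentages mentioned above can be computed from the stage timings the complete benchmark already collects. A minimal sketch (the "other" bucket for unattributed overhead is an assumption about how the breakdown is split):

```python
def time_breakdown(detection_s, evm_s, end_to_end_s):
    """Percentage of end-to-end time spent in each pipeline stage.

    `other` covers overhead not attributed to ROI detection or EVM processing.
    Hypothetical helper -- the real report's categories may differ.
    """
    other = max(end_to_end_s - detection_s - evm_s, 0.0)
    total = detection_s + evm_s + other
    return {
        "roi_pct": 100 * detection_s / total,
        "evm_pct": 100 * evm_s / total,
        "other_pct": 100 * other / total,
    }
```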

The CompleteMetrics Class

At the core of the benchmarking system is the CompleteMetrics class (source/Python/experiments/advance_run_complete.py:27), which aggregates metrics across all pipeline stages:
class CompleteMetrics:
    """Combined metrics for ROI, EVM, and system resources"""
    def __init__(self):
        # Stage timing metrics
        self.detection_times = []      # ROI detection time per frame
        self.processing_times = []     # EVM processing time per chunk
        self.end_to_end_times = []     # Total time per chunk
        
        # ROI metrics
        self.successful_roi_detections = 0
        self.roi_positions = []
        self.roi_sizes = []
        
        # HR accuracy metrics
        self.hr_errors = []
        self.hr_absolute_errors = []
        self.hr_predictions = []
        self.hr_ground_truths = []
        
        # System resource metrics (critical for Raspberry Pi 4)
        self.cpu_usage_samples = []
        self.memory_usage_samples = []
        self.temperature_samples = []
This unified approach allows for:
  • Granular performance analysis - Separate metrics for each pipeline stage
  • Quality assessment - Stability and reliability measurements
  • Accuracy tracking - Multiple error metrics and statistical measures
  • Resource monitoring - Essential for embedded deployment
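Once the lists above are populated, a summary report reduces each one to an aggregate. The sketch below takes the three sample lists directly; the report shape is hypothetical, not the exact output of `advance_run_complete.py`:

```python
from statistics import mean

def summarize(detection_times, hr_absolute_errors, cpu_usage_samples):
    """Reduce CompleteMetrics-style sample lists to summary statistics.

    Arguments mirror three of the class's attributes; the report keys and
    format here are illustrative only.
    """
    return {
        "mean_detection_ms": 1000 * mean(detection_times),
        "mae_bpm": mean(hr_absolute_errors),
        "mean_cpu_pct": mean(cpu_usage_samples),
    }
```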

Dataset and Ground Truth

All benchmarks use the UBFC-RPPG Dataset 2, which provides:
  • 42 subjects with synchronized video and pulse oximeter data
  • CMS50E pulse oximeter ground truth (clinical-grade accuracy)
  • 30 FPS video recordings under controlled lighting
  • Frame-by-frame heart rate measurements
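Ground-truth files in UBFC-RPPG Dataset 2 are commonly distributed as a small text file per subject with three whitespace-separated rows (PPG trace, per-sample heart rate, timestamps). That layout is an assumption here — verify it against your copy of the dataset before relying on this parser:

```python
def parse_ground_truth(text):
    """Parse a UBFC-RPPG DATASET_2 ground-truth payload.

    Assumed layout (check your dataset copy): three whitespace-separated
    rows -- PPG trace, per-sample heart rate in BPM, timestamps in seconds.
    """
    rows = [list(map(float, line.split())) for line in text.strip().splitlines()]
    ppg, hr, timestamps = rows
    return {"ppg": ppg, "hr_bpm": hr, "t_s": timestamps}
```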

Learn More About the Dataset

Detailed information about UBFC-RPPG Dataset 2 structure and usage

Platform Comparison

The benchmarking system is designed to evaluate performance across different hardware:
PC results (located in results/results_pc/) — typical performance:
  • ROI Detection: ~200 FPS (MediaPipe)
  • EVM Processing: ~320 FPS (per-frame throughput)
  • End-to-End: ~124 FPS
  • MAE: ~13.3 BPM
Raspberry Pi 4 results (located in results/results_rp4/) — includes temperature and throttling monitoring:
  • CPU frequency tracking
  • Temperature alerts (>75°C warning, >80°C critical)
  • Throttling detection
  • Memory pressure monitoring
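The temperature alert levels listed above (>75°C warning, >80°C critical) map to a simple threshold check; the helper below is an illustrative sketch, not the monitoring code itself:

```python
def thermal_status(temp_c, warn_c=75.0, critical_c=80.0):
    """Classify a Raspberry Pi 4 SoC temperature against the alert thresholds
    documented on this page (>75 C warning, >80 C critical)."""
    if temp_c > critical_c:
        return "critical"
    if temp_c > warn_c:
        return "warning"
    return "ok"
```

On the Pi 4, sustained operation above the critical threshold triggers firmware throttling, which is why the benchmark pairs temperature samples with CPU frequency tracking.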

Why Multiple Benchmark Types?

Each benchmark serves a specific purpose:
  1. ROI Benchmark - When selecting the optimal face detector for your hardware
  2. EVM Benchmark - When tuning buffer sizes and EVM parameters
  3. Complete Benchmark - When validating full system performance for deployment
By separating concerns, you can:
  • Identify bottlenecks in specific pipeline stages
  • Optimize components independently
  • Make data-driven decisions about model selection
  • Validate performance on target hardware before deployment

Next Steps

Run Benchmarks

Learn how to execute benchmarks on your system

Metrics Reference

Detailed explanation of all collected metrics

Dataset Info

Understand the UBFC-RPPG dataset structure
