Introduction
The EVM Vital Signs Monitor is a contactless vital signs monitoring system that uses Eulerian Video Magnification (EVM) to extract heart rate (HR) and respiratory rate (RR) from standard video input. The system processes video in real time, detecting subtle color changes in the face region that correspond to physiological signals.
The system runs 50-60% faster than traditional dual-pass EVM implementations through a single-pass dual-band processing optimization.
System Workflow
The complete processing pipeline consists of five main stages.
Processing Stages
Stage 1: Face Detection & ROI Extraction
The system first detects faces in the video stream and extracts a stable Region of Interest (ROI):
- Multiple detector backends: Haar Cascade, MTCNN, YOLO, MediaPipe
- Temporal stabilization: Weighted averaging over last 5 frames reduces jitter
- ROI weights: [0.1, 0.15, 0.2, 0.25, 0.3] (recent frames weighted higher)
- Adaptive boundaries: Ensures ROI stays within frame limits
src/face_detector/manager.py:95-130
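The weighted smoothing described above can be sketched as follows. This is a minimal illustration and not the code in src/face_detector/manager.py: the function name `smooth_roi`, the `(x, y, w, h)` box format, the handling of short histories, and the clamping details are all assumptions. Only the five weights come from the list above.

```python
import numpy as np

# Weights from the list above, ordered oldest to newest frame.
ROI_WEIGHTS = np.array([0.1, 0.15, 0.2, 0.25, 0.3])


def smooth_roi(history, frame_w, frame_h):
    """Return a stabilized (x, y, w, h) box from a non-empty detection history.

    `history` holds (x, y, w, h) boxes, oldest first (hypothetical format).
    """
    recent = np.asarray(history[-len(ROI_WEIGHTS):], dtype=float)
    # With fewer than 5 detections, use the newest weights and renormalize.
    weights = ROI_WEIGHTS[-len(recent):]
    weights = weights / weights.sum()
    x, y, w, h = weights @ recent

    # Clamp so the box stays inside the frame.
    x = int(round(min(max(x, 0), frame_w - 1)))
    y = int(round(min(max(y, 0), frame_h - 1)))
    w = int(round(min(w, frame_w - x)))
    h = int(round(min(h, frame_h - y)))
    return x, y, w, h
```

Because the newest detection carries the largest weight, the box follows real head movement while single-frame detector jitter is damped.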
Stage 2: Pyramid Decomposition
Each ROI frame is decomposed into a Laplacian pyramid for multi-resolution analysis:
- Gaussian pyramid: Successive downsampling creates multiple resolution levels
- Laplacian pyramid: Difference between Gaussian levels captures spatial frequencies
- Default levels: 3 levels optimized for Raspberry Pi performance
- Dual-band optimization: Single pyramid construction supports both HR and RR extraction
  - Level 3: Higher spatial frequency (optimal for heart rate)
  - Level 2: Lower spatial frequency (optimal for respiration)
src/evm/pyramid_processing.py:98-126
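For readers unfamiliar with the technique, the sketch below builds a Gaussian and Laplacian pyramid for a single-channel image using only NumPy. It is an illustration under stated assumptions, not the project's implementation: the 5-tap binomial kernel, the nearest-neighbour upsampling, and all function names are choices made here, and the list indices do not necessarily match the level numbering used above.

```python
import numpy as np

# Classic 5-tap binomial approximation of a Gaussian kernel.
KERNEL = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0


def blur(img):
    """Separable blur of a 2-D array with reflected borders."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(KERNEL))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(KERNEL))


def downsample(img):
    """Blur, then keep every second row and column."""
    return blur(img)[::2, ::2]


def upsample(img, shape):
    """Repeat pixels to double the size, crop to `shape`, then blur."""
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return blur(up[:shape[0], :shape[1]])


def build_laplacian_pyramid(img, levels=3):
    """Return (laplacian_levels, low_pass_residual), finest level first."""
    gaussian = [np.asarray(img, dtype=float)]
    for _ in range(levels):
        gaussian.append(downsample(gaussian[-1]))
    laplacian = [g - upsample(g_next, g.shape)
                 for g, g_next in zip(gaussian[:-1], gaussian[1:])]
    return laplacian, gaussian[-1]


def reconstruct(laplacian, residual):
    """Invert the decomposition by adding levels back, coarsest first."""
    img = residual
    for lap in reversed(laplacian):
        img = upsample(img, lap.shape) + lap
    return img
```

Each Laplacian level stores what was lost by one downsampling step, so adding the levels back onto the residual recovers the input up to floating-point error.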
Stage 3: Temporal Filtering
Separate bandpass filters isolate physiological frequency bands:
Heart Rate Band (0.8-3 Hz):
- Frequency range: 48-180 BPM
- Applied to pyramid level 3
- Butterworth bandpass filter (order 2)
Respiratory Rate Band (0.2-0.8 Hz):
- Frequency range: 12-48 breaths per minute
- Applied to pyramid level 2
- Independent filtering allows simultaneous extraction
src/evm/temporal_filtering.py:30-60
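A temporal bandpass of this kind can be sketched with SciPy. The following is an assumption-laden example rather than the contents of src/evm/temporal_filtering.py: the `bandpass` helper is hypothetical, zero-phase filtering with `filtfilt` is a choice made here, and order 2 is assumed for the respiratory band as well. The 0.2-0.8 Hz limits are simply 12-48 breaths per minute divided by 60.

```python
import numpy as np
from scipy.signal import butter, filtfilt

HR_BAND_HZ = (0.8, 3.0)  # 48-180 BPM
RR_BAND_HZ = (0.2, 0.8)  # 12-48 breaths per minute


def bandpass(series, low_hz, high_hz, fs, order=2):
    """Butterworth bandpass along the time axis (axis 0).

    `series` may be a 1-D signal or a (T, H, W, C) stack of pyramid frames.
    """
    b, a = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs)
    # filtfilt runs the filter forward and backward, so no phase shift is added.
    return filtfilt(b, a, series, axis=0)
```

Calling `bandpass(level_frames, *HR_BAND_HZ, fs=30.0)` and `bandpass(other_level_frames, *RR_BAND_HZ, fs=30.0)` on two different tensors keeps the bands independent, which is what allows both rates to be extracted from the same buffer.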
Stage 4: Signal Amplification
Filtered signals are amplified to enhance subtle variations:
- Heart rate amplification: α = 30
- Respiratory rate amplification: α = 50
- Spatial averaging: Green channel for HR, all channels for RR
src/evm/evm_core.py:98-108
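As a rough sketch of this stage, the functions below scale a filtered tensor by α and collapse it to one value per frame. The function names and the `(T, H, W, 3)` tensor layout are assumptions for illustration. Green is taken to be channel index 1, which holds for both RGB and BGR ordering.

```python
import numpy as np

ALPHA_HR = 30.0  # heart rate amplification from the list above
ALPHA_RR = 50.0  # respiratory rate amplification from the list above


def hr_signal(filtered):
    """Amplify and spatially average the green channel: (T, H, W, 3) -> (T,)."""
    return (ALPHA_HR * filtered[..., 1]).mean(axis=(1, 2))


def rr_signal(filtered):
    """Amplify and average over all pixels and channels: (T, H, W, 3) -> (T,)."""
    return (ALPHA_RR * filtered).mean(axis=(1, 2, 3))
```

Since scaling by a constant does not move spectral peaks, α affects signal magnitude rather than the frequency that the later analysis stage reports.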
Stage 5: Frequency Analysis
FFT-based spectral analysis extracts dominant frequencies:
- Signal preprocessing: Detrending and normalization
- Hamming window: Reduces spectral leakage
- FFT computation: Converts to frequency domain
- Peak detection: Finds dominant frequency in physiological range
- Validation: Ensures result falls within plausible bounds
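The five steps above can be sketched in NumPy as shown below. This is a minimal example under assumptions, not the code referenced for this stage: the function name is hypothetical, detrending is done with a linear fit, and validation is reduced to rejecting flat signals and searching only inside the requested band.

```python
import numpy as np


def estimate_rate_per_minute(signal, fs, f_lo, f_hi):
    """Return the dominant rate (per minute) in [f_lo, f_hi] Hz, or None."""
    x = np.asarray(signal, dtype=float)
    n = len(x)

    # Preprocessing: remove a linear trend, then normalize to unit variance.
    t = np.arange(n)
    slope, intercept = np.polyfit(t, x, 1)
    x = x - (slope * t + intercept)
    std = x.std()
    if std < 1e-12:
        return None  # flat signal, nothing to measure
    x = x / std

    # Hamming window, then magnitude spectrum of the real-valued signal.
    spectrum = np.abs(np.fft.rfft(x * np.hamming(n)))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    # Peak detection restricted to the physiological band.
    band = (freqs >= f_lo) & (freqs <= f_hi)
    if not band.any():
        return None  # buffer too short to resolve this band
    peak_hz = freqs[band][np.argmax(spectrum[band])]
    return 60.0 * peak_hz
```

Frequency resolution is `fs / n`, so a 10-second buffer resolves 0.1 Hz, or 6 per minute. Longer buffers give finer estimates at the cost of latency.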
src/evm/signal_analysis.py:127-158
Key Architectural Components
Core Modules
| Module | Location | Responsibility |
|---|---|---|
| EVMProcessor | src/evm/evm_core.py | Orchestrates dual-band EVM processing |
| FaceDetector | src/face_detector/manager.py | Manages face detection and ROI stabilization |
| Pyramid Processing | src/evm/pyramid_processing.py | Builds and manipulates Laplacian pyramids |
| Temporal Filtering | src/evm/temporal_filtering.py | Applies bandpass filters to isolate frequencies |
| Signal Analysis | src/evm/signal_analysis.py | FFT-based frequency extraction |
| Configuration | src/config.py | Central parameter definitions |
Data Flow
Video Frame Acquisition
Input frames are captured at 30 FPS (configurable) and resized to 320x240 for processing efficiency.
ROI Extraction
The face detector identifies the face region and applies temporal smoothing to keep the ROI stable across frames.
Single-Pass Pyramid Construction
One Laplacian pyramid stack is built for the entire video buffer (typically 30+ frames).
Dual-Band Processing
Two pyramid levels are extracted and filtered independently:
- Level 3 tensor → HR bandpass → α=30 amplification → Green channel signal
- Level 2 tensor → RR bandpass → α=50 amplification → All-channel signal
Performance Optimization
Key Innovation: The dual-band approach builds the Laplacian pyramid only once, then applies a different temporal filter to each of two pyramid levels. This removes the separate pyramid construction for HR and RR that a two-pass implementation needs, achieving a 50-60% performance improvement.
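Traditional vs. Optimized Approach
Traditional (two-pass): a pyramid is built once for HR and again for RR. Optimized (single-pass): one pyramid is built and both bands are filtered from it.
src/evm/evm_core.py:40-109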
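The structural difference between the two approaches can be shown with a toy sketch. Everything here is hypothetical: `build_pyramid` is a stand-in that returns placeholder arrays rather than real Laplacian levels, and the call counter exists only to make the saved work visible. It says nothing about actual timings.

```python
import numpy as np

calls = {"build": 0}  # counts how often the expensive step runs


def build_pyramid(frames):
    """Stand-in for pyramid construction over a (T, H, W, C) buffer."""
    calls["build"] += 1
    # Placeholder tensors keyed by level number; not real pyramid levels.
    return {2: frames[:, ::2, ::2], 3: frames[:, ::4, ::4]}


def two_pass(frames):
    """Traditional: one full pyramid construction per band."""
    hr_level = build_pyramid(frames)[3]
    rr_level = build_pyramid(frames)[2]
    return hr_level, rr_level


def single_pass(frames):
    """Optimized: one construction, both levels read from it."""
    pyramid = build_pyramid(frames)
    return pyramid[3], pyramid[2]
```

Both functions hand the same two tensors to the temporal filters. The single-pass version simply avoids repeating the construction step.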
Configuration Parameters
Key system parameters are defined in src/config.py:1-33.
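The contents of src/config.py are not reproduced on this page. As a rough illustration only, the values quoted in the sections above could be gathered as follows. The class name and every field name are hypothetical, and the respiratory band in Hz is derived from the stated 12-48 breaths per minute.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EVMConfig:
    """Hypothetical container for the parameters quoted on this page."""
    fps: int = 30                        # capture rate (configurable)
    frame_size: tuple = (320, 240)       # processing resolution, width x height
    pyramid_levels: int = 3              # default Laplacian pyramid depth
    hr_band_hz: tuple = (0.8, 3.0)       # 48-180 BPM
    rr_band_hz: tuple = (0.2, 0.8)       # 12-48 breaths per minute
    hr_alpha: float = 30.0               # heart rate amplification
    rr_alpha: float = 50.0               # respiratory rate amplification
    roi_weights: tuple = (0.1, 0.15, 0.2, 0.25, 0.3)  # oldest to newest frame
```

A frozen dataclass keeps the defaults in one place and prevents accidental mutation at runtime. Overrides can be expressed as `EVMConfig(fps=15)`.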
Next Steps
- Eulerian Video Magnification: Deep dive into the EVM technique and dual-band implementation
- Signal Processing: Learn about temporal filtering and FFT-based frequency analysis
- Face Detection: Explore ROI extraction and stabilization techniques
- API Reference: Browse the complete API documentation