
What is Eulerian Video Magnification?

Eulerian Video Magnification (EVM) is a computational technique that reveals temporal variations in videos that are impossible to see with the naked eye. Unlike Lagrangian methods that track individual pixels, EVM analyzes temporal variations at fixed spatial positions (Eulerian perspective).
EVM can amplify color changes as small as 0.1% of the original intensity, making subtle physiological signals visible.

Key Principles

Spatial Decomposition

Uses Laplacian pyramids to separate different spatial frequency bands

Temporal Filtering

Isolates specific frequency ranges corresponding to physiological processes

Signal Amplification

Multiplies filtered signals by amplification factor (α) to enhance visibility

Reconstruction

Combines amplified signals back into video or extracts temporal signatures

EVM for Vital Signs Monitoring

In the context of vital signs monitoring, EVM exploits two physiological phenomena:
  1. Cardiac pulse: Blood volume changes cause subtle color variations in skin (photoplethysmography)
  2. Respiration: Chest and facial movements create subtle motion and color changes

Frequency Characteristics

| Signal | Frequency Range (Hz) | Frequency Range (BPM) | Typical Value |
|---|---|---|---|
| Heart Rate | 0.8 - 3.0 | 48 - 180 | 60-100 BPM |
| Respiratory Rate | 0.2 - 0.8 | 12 - 48 | 12-20 breaths/min |
These non-overlapping frequency ranges allow simultaneous extraction of both vital signs from the same video stream.
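The Hz and BPM columns differ only by a factor of 60 (cycles per second vs. cycles per minute); a trivial sketch of the conversion (helper names are illustrative, not from the codebase):

```python
def hz_to_bpm(hz):
    """Convert a frequency in Hz (cycles/second) to cycles/minute (BPM)."""
    return hz * 60.0

def bpm_to_hz(bpm):
    """Inverse conversion: cycles/minute back to Hz."""
    return bpm / 60.0
```

For example, the 0.8-3.0 Hz heart-rate band maps to 48-180 BPM, and the 0.2-0.8 Hz respiratory band to 12-48 breaths/min.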

Dual-Band Processing Architecture

The EVM Vital Signs Monitor implements an optimized dual-band processing approach that extracts both HR and RR in a single pass through the video data.

Single-Pass Optimization

class EVMProcessor:
    """
    Processor for Eulerian Video Magnification (EVM) with optimized configuration.
    
    Performs single-pass dual-band processing:
    - Builds Laplacian pyramids once
    - Applies two separate temporal filters (HR and RR frequency bands)
    - Amplifies each band with its corresponding alpha factor
    - Extracts both heart rate (HR) and respiratory rate (RR) temporal signals
    """
    
    def __init__(self, levels=LEVELS_RPI, alpha_hr=ALPHA_HR, alpha_rr=ALPHA_RR):
        self.levels = levels
        self.alpha_hr = alpha_hr  # Default: 30
        self.alpha_rr = alpha_rr  # Default: 50
Source: src/evm/evm_core.py:16-38

Processing Pipeline

The process_dual_band() method implements the complete EVM pipeline:
Step 1: Pyramid Construction

Build Laplacian pyramid stack from all video frames once:
# STEP 1: Build Laplacian pyramids (SINGLE PASS)
laplacian_pyramids = build_video_pyramid_stack(
    video_frames, levels=self.levels
)
Source: src/evm/evm_core.py:74-77
Step 2: Level Selection

Select optimal pyramid levels for each signal type:
# STEP 2: Select optimal pyramid level for each signal
level_hr = min(3, num_levels - 1)  # HR: level 3 (more downsampled)
level_rr = min(2, num_levels - 1)  # RR: level 2 (less downsampled)
Why different levels?
  • Deeper pyramid levels (more downsampled, lower spatial frequency) pool color over larger regions, improving the signal-to-noise ratio of the subtle pulse color changes
  • Shallower levels (less downsampled) retain the spatial detail needed to capture the slower motion from breathing
Source: src/evm/evm_core.py:82-83
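For intuition, the spatial size each level works with follows directly from the pyramid construction: each cv2.pyrDown halves both dimensions. A sketch for a 320x240 ROI (the real pipeline reads these shapes from the pyramid itself):

```python
def level_shape(width, height, level):
    """Spatial size (width, height) of a pyramid level; each pyrDown
    halves both dimensions (exact here since 320x240 divides evenly)."""
    return (width // 2**level, height // 2**level)

# Level 2 (RR) of a 320x240 ROI is 80x60; level 3 (HR) is 40x30.
```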
Step 3: Tensor Extraction

Extract 4D tensors (Time × Height × Width × Channels) from selected levels:
# STEP 3: Extract tensor data from each pyramid level
tensor_hr = extract_pyramid_level(laplacian_pyramids, level_hr)
tensor_rr = extract_pyramid_level(laplacian_pyramids, level_rr)
The tensors contain the temporal evolution of each spatial location across all frames.
Source: src/evm/evm_core.py:86-87
Step 4: Temporal Filtering

Apply bandpass filters to isolate physiological frequency ranges:
# STEP 4: Separate temporal filtering per frequency band
filtered_tensor_hr = apply_temporal_bandpass(
    tensor_hr, LOW_HEART, HIGH_HEART, FPS, axis=0  # HR band: 0.8-3 Hz
)

filtered_tensor_rr = apply_temporal_bandpass(
    tensor_rr, LOW_RESP, HIGH_RESP, FPS, axis=0   # RR band: 0.2-0.8 Hz
)
Butterworth bandpass filters remove DC components and high-frequency noise.
Source: src/evm/evm_core.py:90-96
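apply_temporal_bandpass itself is not reproduced in this excerpt; a typical SciPy-based implementation might look like the following sketch (filter order and normalization choices are assumptions, not the project's actual code):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def apply_temporal_bandpass(tensor, low, high, fps, axis=0, order=2):
    """Zero-phase Butterworth bandpass along the time axis of a
    (T x H x W x C) tensor. Cutoffs are normalized to Nyquist (fps / 2)."""
    nyquist = fps / 2.0
    b, a = butter(order, [low / nyquist, high / nyquist], btype="band")
    # filtfilt runs the filter forward and backward, so no phase shift
    # distorts the timing of the physiological signal
    return filtfilt(b, a, tensor, axis=axis)
```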
Step 5: Amplification

Multiply filtered signals by amplification factors:
# STEP 5: Signal amplification
filtered_tensor_hr *= self.alpha_hr  # α = 30
filtered_tensor_rr *= self.alpha_rr  # α = 50
Higher α for respiratory signals compensates for their lower natural amplitude.
Source: src/evm/evm_core.py:99-100
Step 6: Signal Extraction

Spatially average each frame to extract 1D temporal signals:
# STEP 6: Extract temporal signals
# HR: Green channel (best SNR for pulse)
signal_hr = extract_temporal_signal(filtered_tensor_hr, use_green_channel=True)

# RR: All channels average
signal_rr = extract_temporal_signal(filtered_tensor_rr, use_green_channel=False)
The green channel provides the best signal-to-noise ratio for photoplethysmography.
Source: src/evm/evm_core.py:103-107
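extract_temporal_signal is also straightforward to sketch: spatially average each frame, optionally restricted to the green channel (frames are BGR, so green is index 1). This is an illustration, not the project's code:

```python
import numpy as np

def extract_temporal_signal(tensor, use_green_channel=True):
    """Collapse a (T x H x W x C) tensor into a 1D signal of length T
    by spatially averaging each frame."""
    if use_green_channel:
        tensor = tensor[..., 1]  # BGR order: index 1 is green
    # flatten everything except the time axis, then average
    return tensor.reshape(tensor.shape[0], -1).mean(axis=1)
```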

Laplacian Pyramid Construction

Gaussian Pyramid

The Gaussian pyramid progressively downsamples the image:
def build_gaussian_pyramid(frame, levels=LEVELS_RPI):
    pyramid = []
    current = frame.astype(np.float32)
    pyramid.append(current)
    
    for _ in range(levels):
        current = cv2.pyrDown(current)  # Gaussian blur + 2x downsample
        pyramid.append(current)
    
    return pyramid
Source: src/evm/pyramid_processing.py:8-34

Laplacian Pyramid

The Laplacian pyramid captures details lost in downsampling:
def build_laplacian_pyramid(gaussian_pyramid):
    laplacian_pyramid = []
    
    for i in range(len(gaussian_pyramid) - 1):
        size = (gaussian_pyramid[i].shape[1], gaussian_pyramid[i].shape[0])
        expanded = cv2.pyrUp(gaussian_pyramid[i + 1], dstsize=size)
        laplacian = cv2.subtract(gaussian_pyramid[i], expanded)
        laplacian_pyramid.append(laplacian)
    
    # Last level is the same as Gaussian
    laplacian_pyramid.append(gaussian_pyramid[-1])
    
    return laplacian_pyramid
Source: src/evm/pyramid_processing.py:37-66
Laplacian pyramids provide several advantages:
  • Bandpass filtering: Each level captures a specific range of spatial frequencies
  • Computational efficiency: Smaller pyramid levels process faster
  • Reduced noise: High-frequency noise is separated into upper levels
  • Better amplification: Can apply different α values to different levels

Pyramid Level Extraction

The extract_pyramid_level() function collects a specific pyramid level across all frames:
def extract_pyramid_level(pyramid_stack, level):
    """
    Extract specific level from all pyramids and normalize dimensions.
    
    Returns:
        np.ndarray: Tensor (T x H x W x C) of specified pyramid level
    """
    level_frames = []
    
    for pyr in pyramid_stack:
        if level < len(pyr):
            level_frames.append(pyr[level])
        else:
            level_frames.append(pyr[-1])  # Fallback to last level
    
    # Determine target shape (most common shape)
    shapes = [frame.shape for frame in level_frames]
    target_shape = most_common(shapes)
    
    # Resize frames to target shape if needed
    level_resized = []
    for frame in level_frames:
        if frame.shape != target_shape:
            frame = cv2.resize(frame, (target_shape[1], target_shape[0]))
        level_resized.append(frame)
    
    return np.array(level_resized, dtype=np.float32)
Source: src/evm/pyramid_processing.py:129-174
The function includes dimension normalization to handle edge cases where pyramid levels might have slightly different dimensions due to rounding in downsampling.
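The most_common helper is not shown in the excerpt; a minimal standard-library version consistent with its use above (an assumption about its behavior):

```python
from collections import Counter

def most_common(items):
    """Return the most frequent item, e.g. the modal frame shape."""
    return Counter(items).most_common(1)[0][0]
```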

Amplification Factor Selection

The amplification factor (α) is critical for balancing signal enhancement and noise:

Heart Rate (α = 30)

ALPHA_HR = 30  # Amplification factor for heart rate band
  • Moderate amplification suitable for subtle color changes
  • Too high → motion artifacts become visible
  • Too low → insufficient signal for reliable frequency detection

Respiratory Rate (α = 50)

ALPHA_RR = 50  # Amplification factor for respiratory rate band
  • Higher amplification needed for lower-frequency signals
  • Respiratory signals have naturally lower amplitude
  • Lower frequency range is less susceptible to motion artifacts
Source: src/config.py:3-10

Performance Characteristics

Computational Complexity

Naive dual-pass approach:
  • Pyramid construction: O(N × M × L) per pass × 2 passes
  • Temporal filtering: O(N × M × L) per band × 2 bands
  • Total: ~2× the work of a single pass
Optimized single-pass approach:
  • Pyramid construction: O(N × M × L) once
  • Level extraction: O(N × M/4^l) per level
  • Temporal filtering: O(N × M/4^l) per band
  • Total: ~50-60% faster
Where:
  • N = number of frames
  • M = pixels per frame
  • L = pyramid levels
  • l = selected pyramid level
The key savings come from:
  1. Single pyramid construction
  2. Processing smaller tensors (downsampled levels)
  3. Parallel filtering of independent bands
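The savings can be estimated with a back-of-envelope model (per-pixel unit costs are an assumption; the real speedup depends on constants this model ignores):

```python
def dual_pass_work(n_frames, n_pixels, levels):
    """Naive baseline: pyramid construction and filtering repeated
    once per band (two full passes over the video)."""
    return 2 * n_frames * n_pixels * levels

def single_pass_work(n_frames, n_pixels, levels, level_hr=3, level_rr=2):
    """Optimized: one pyramid build, then per-band filtering only on
    the selected levels (level l holds n_pixels / 4**l pixels)."""
    build = n_frames * n_pixels * levels
    filtering = sum(n_frames * n_pixels / 4**l for l in (level_hr, level_rr))
    return build + filtering
```

For 30 frames at 320x240 with 4 pyramid levels, this model predicts roughly half the work of the dual-pass baseline, in line with the ~50-60% figure above.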

Memory Usage

# Typical memory footprint for 30 frames @ 320x240 resolution:
frames = 30 × 320 × 240 × 3 bytes = 6.9 MB (input)
pyramids = frames × (1 + 0.25 + 0.0625 + ...) ≈ 9.2 MB
level_3 = 30 × 40 × 30 × 3 × 4 bytes = 432 KB
level_2 = 30 × 80 × 60 × 3 × 4 bytes = 1.7 MB
The system maintains only one pyramid stack in memory, significantly reducing memory footprint compared to dual-pass approaches.
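The per-level figures above follow from simple arithmetic; a sketch reproducing the tensor sizes (float32, 4 bytes per value):

```python
def tensor_bytes(frames, width, height, channels=3, bytes_per_value=4):
    """Memory of a (T x H x W x C) float32 tensor in bytes."""
    return frames * width * height * channels * bytes_per_value

# 30 frames: level 3 is 40x30 -> 432,000 bytes (432 KB),
# level 2 is 80x60 -> 1,728,000 bytes (~1.7 MB)
```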

Usage Example

from src.evm.evm_manager import process_video_evm_vital_signs

# Process video frames
video_frames = [...]  # List of BGR frames from ROI

results = process_video_evm_vital_signs(video_frames, verbose=True)

if results['heart_rate']:
    print(f"Heart Rate: {results['heart_rate']:.1f} BPM")

if results['respiratory_rate']:
    print(f"Respiratory Rate: {results['respiratory_rate']:.1f} RPM")
Source: src/evm/evm_manager.py:12-103

Signal Processing

Learn about temporal filtering and FFT analysis

Face Detection

Understand ROI extraction and stabilization

System Overview

See how EVM fits into the overall architecture

API Reference

Explore the EVMProcessor API
