Overview

The EVM Vital Signs Monitor supports four face detection models, each with different trade-offs between accuracy, speed, and resource requirements. This guide helps you choose the right detector for your use case.

Available Detectors

MediaPipe

Fast and accurate - recommended for Raspberry Pi

YOLOv8/v12

Highest accuracy - best for challenging conditions

Haar Cascade

Lightweight - minimal resource usage

MTCNN

Robust to pose variations

Performance Comparison

Performance metrics measured on Raspberry Pi 4 (4GB RAM):
| Detector | FPS | Accuracy | Memory | CPU Usage | Recommended Use |
|---|---|---|---|---|---|
| MediaPipe | 30-40 | High | 150 MB | 40-50% | Real-time, embedded |
| YOLOv8n | 15-25 | Very High | 300 MB | 70-80% | Challenging lighting |
| YOLOv12n | 12-20 | Very High | 320 MB | 75-85% | Maximum accuracy |
| Haar Cascade | 45-60 | Medium | 50 MB | 20-30% | Low-power devices |
| MTCNN | 8-15 | High | 400 MB | 80-90% | Offline processing |
FPS measurements are for detection only. End-to-end system performance (including EVM) is ~3-4 seconds per measurement regardless of detector.
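The FPS figures above translate directly into per-frame latency budgets, which is usually how real-time requirements are stated. A small helper (hypothetical, not part of the project) to check whether a detector's worst-case FPS fits a budget:

```python
def fits_budget(fps_low: float, budget_ms: float) -> bool:
    """True if the detector's worst-case FPS still meets a per-frame latency budget (ms)."""
    worst_case_latency_ms = 1000.0 / fps_low
    return worst_case_latency_ms <= budget_ms

# MediaPipe at its lower bound (30 FPS) against a 100 ms budget: 33.3 ms/frame
print(fits_budget(30, 100))  # True
# MTCNN at its lower bound (8 FPS): 125 ms/frame
print(fits_budget(8, 100))   # False
```

This is why the table's MTCNN row is marked for offline processing only.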

Detailed Comparison

MediaPipe Face Detection

Best for: Real-time applications, Raspberry Pi deployment, production use
Google’s MediaPipe provides a lightweight neural network optimized for mobile and edge devices.
Strengths:
  • Excellent speed-accuracy balance
  • Low memory footprint
  • Works well in varied lighting
  • Official TensorFlow Lite optimization
Weaknesses:
  • May miss faces at extreme angles (>45°)
  • Struggles with partial occlusions

YOLOv8n / YOLOv12n

Best for: Maximum accuracy, challenging lighting, professional applications
Ultralytics YOLO models provide state-of-the-art face detection with high accuracy.
Strengths:
  • Highest detection accuracy
  • Robust to difficult lighting conditions
  • Excellent with partial occlusions
  • Handles multiple faces well
Weaknesses:
  • Higher computational cost
  • Larger memory footprint
  • Requires model weight files (~6 MB per model)

Haar Cascade

Best for: Ultra-low-power devices, battery-powered applications
OpenCV’s classic face detector, based on the Viola-Jones algorithm.
Strengths:
  • Extremely fast
  • Minimal memory usage
  • No external model files needed
  • Well-tested and stable
Weaknesses:
  • Lower accuracy than neural models
  • Sensitive to face angle and lighting
  • More false positives
  • Requires frontal faces

MTCNN

Best for: Offline analysis, research applications, high pose variation
Multi-task Cascaded Convolutional Networks (MTCNN): a three-stage deep learning detector.
Strengths:
  • Excellent with pose variations
  • Returns facial landmarks (eyes, nose, mouth)
  • High accuracy in challenging conditions
  • Robust to scale variations
Weaknesses:
  • Slowest detector
  • Highest memory usage
  • Requires TensorFlow
  • Not suitable for real-time on RPi

Decision Tree

Use this decision tree to choose your detector:
1. Identify your constraints

  • Platform: Raspberry Pi, desktop, or embedded?
  • Real-time requirement: < 100ms latency?
  • Power budget: Unlimited or battery-powered?
2. Consider your environment

  • Lighting: Controlled or variable?
  • Subject movement: Stationary or moving?
  • Face angles: Frontal or varied poses?
3. Choose based on priority

  • Speed first: Haar Cascade or MediaPipe
  • Accuracy first: YOLO or MTCNN
  • Balance: MediaPipe (recommended)
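The decision tree above can be sketched as a small helper that maps deployment constraints to a `model_type` string. The function name and argument names are illustrative; the branch logic follows the tree:

```python
def choose_detector(realtime: bool, battery_powered: bool,
                    varied_poses: bool, harsh_lighting: bool) -> str:
    """Map deployment constraints to a detector name, following the decision tree."""
    if battery_powered:
        return "haar"        # speed and power first
    if varied_poses and not realtime:
        return "mtcnn"       # offline, pose-robust
    if harsh_lighting:
        return "yolo"        # accuracy first
    return "mediapipe"       # recommended balance

print(choose_detector(realtime=True, battery_powered=False,
                      varied_poses=False, harsh_lighting=False))  # mediapipe
```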

Recommendation by Use Case

Production Deployment on Raspberry Pi

detector = FaceDetector(model_type="mediapipe")
Why: Best speed-accuracy balance, proven reliability, low resource usage.

Clinical/Medical Application

detector = FaceDetector(
    model_type="yolo",
    preset="yolov8n",
    confidence=0.6
)
Why: High accuracy critical, controlled environment allows higher compute cost.

Battery-Powered IoT Device

detector = FaceDetector(
    model_type="haar",
    scale_factor=1.2,
    min_neighbors=4
)
Why: Minimal power consumption, acceptable accuracy for stationary subjects.

Research/Offline Analysis

detector = FaceDetector(
    model_type="mtcnn",
    min_confidence=0.9
)
Why: Maximum accuracy, facial landmarks available, no real-time constraints.

Challenging Lighting Conditions

detector = FaceDetector(
    model_type="yolo",
    preset="yolov12n",
    confidence=0.5
)
Why: YOLO models excel in low-light and high-contrast scenarios.

Benchmarking Your Detector

Test detector performance on your hardware:
import time
import cv2
from src.face_detector.manager import FaceDetector

def benchmark_detector(model_type, num_frames=100):
    detector = FaceDetector(model_type=model_type)
    cap = cv2.VideoCapture(0)
    
    times = []
    detections = 0
    
    for _ in range(num_frames):
        ret, frame = cap.read()
        if not ret:
            break
        
        start = time.time()
        roi = detector.detect_face(frame)
        elapsed = time.time() - start
        
        times.append(elapsed)
        if roi:
            detections += 1
    
    cap.release()
    detector.close()
    
    if not times:
        print(f"\n{model_type.upper()}: no frames captured")
        return
    
    avg_time = sum(times) / len(times)
    fps = 1 / avg_time
    detection_rate = detections / num_frames * 100
    
    print(f"\n{model_type.upper()} Results:")
    print(f"  Average FPS: {fps:.1f}")
    print(f"  Avg time per frame: {avg_time*1000:.1f} ms")
    print(f"  Detection rate: {detection_rate:.1f}%")

# Run benchmarks
for detector in ["mediapipe", "yolo", "haar", "mtcnn"]:
    try:
        benchmark_detector(detector)
    except Exception as e:
        print(f"\n{detector} failed: {e}")
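If you return the numbers instead of printing them, the same loop supports programmatic comparison. A sketch of the ranking step, with illustrative result dicts standing in for live camera runs:

```python
def rank_by_fps(results):
    """Sort benchmark results (name -> stats dict) by average FPS, fastest first."""
    return sorted(results.items(), key=lambda kv: kv[1]["fps"], reverse=True)

# Illustrative numbers, shaped like the Raspberry Pi 4 table above:
results = {
    "mediapipe": {"fps": 35.0, "detection_rate": 97.0},
    "haar":      {"fps": 52.0, "detection_rate": 88.0},
    "mtcnn":     {"fps": 11.0, "detection_rate": 96.0},
}
for name, stats in rank_by_fps(results):
    print(f"{name}: {stats['fps']:.1f} FPS, {stats['detection_rate']:.0f}% detected")
```

Ranking on FPS alone can mislead; in practice you would weigh detection rate alongside speed, as the Haar row illustrates.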

Switching Detectors

The unified FaceDetector interface makes switching detectors trivial:
from src.face_detector.manager import FaceDetector

# Configuration-based selection
CONFIG = {
    "development": "mediapipe",
    "production": "yolo",
    "testing": "haar"
}

ENV = "production"  # Change based on deployment

detector = FaceDetector(model_type=CONFIG[ENV])

# Rest of code remains identical
roi = detector.detect_face(frame)
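In real deployments, the environment name itself usually comes from an environment variable rather than a hard-coded constant. A hedged sketch of that lookup (the variable name `EVM_ENV` and the fallback behavior are assumptions, not project conventions):

```python
import os

CONFIG = {
    "development": "mediapipe",
    "production": "yolo",
    "testing": "haar",
}

def model_for_env(env=None):
    """Resolve the detector name from EVM_ENV, defaulting to production."""
    env = env or os.environ.get("EVM_ENV", "production")
    # Unknown environments fall back to the recommended default:
    return CONFIG.get(env, "mediapipe")

print(model_for_env("testing"))      # haar
print(model_for_env("unknown-env"))  # mediapipe
```

The resolved string can then be passed straight to `FaceDetector(model_type=...)` as in the snippet above.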