Overview
EmoChat uses computer vision techniques to detect and classify human emotions in real time through facial analysis. The system captures facial expressions from a webcam feed, extracts facial landmarks, normalizes the features, and classifies them into emotion categories.

How It Works
The emotion recognition pipeline consists of four main stages:

- Face Detection - Detect faces in the video frame
- Landmark Extraction - Extract 68 facial landmark points
- Feature Normalization - Normalize coordinates for consistent analysis
- Emotion Classification - Predict emotion using the trained ML model
Facial Landmark Detection
OpenCV Implementation
EmoChat uses OpenCV’s face detection and landmark extraction capabilities, specifically:

- Haar Cascade Classifier for face detection
- LBF (Local Binary Features) Model for 68-point facial landmark detection
See utils.py:59 for the implementation.
The system automatically downloads the required model files (haarcascade_frontalface_default.xml and lbfmodel.yaml) from OpenCV’s repository if they don’t exist locally.

68 Facial Points
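A minimal sketch of this download-on-demand behavior; the URLs shown are the commonly used public mirrors and may differ from the ones utils.py actually uses:

```python
import os
import urllib.request

# Hypothetical source URLs -- the authoritative ones are defined in utils.py.
MODEL_URLS = {
    "haarcascade_frontalface_default.xml": (
        "https://raw.githubusercontent.com/opencv/opencv/master/"
        "data/haarcascades/haarcascade_frontalface_default.xml"
    ),
    "lbfmodel.yaml": "https://github.com/kurnianggoro/GSOC2017/raw/master/data/lbfmodel.yaml",
}

def ensure_model(filename, url, directory="."):
    """Download a model file only if it is not already present locally."""
    path = os.path.join(directory, filename)
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    return path
```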
The LBF model detects 68 specific facial landmark points that capture:

- Jawline contour (17 points)
- Eyebrow shapes (10 points)
- Nose bridge and tip (9 points)
- Eye contours (12 points)
- Mouth outline (20 points)
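In the standard 68-point annotation scheme, each region above occupies a fixed index range in the landmark array; a sketch of that partition:

```python
# Index ranges in the standard 68-point facial landmark annotation scheme.
LANDMARK_GROUPS = {
    "jawline": range(0, 17),    # 17 points
    "eyebrows": range(17, 27),  # 10 points (5 per brow)
    "nose": range(27, 36),      # 9 points (bridge + tip)
    "eyes": range(36, 48),      # 12 points (6 per eye)
    "mouth": range(48, 68),     # 20 points (outer + inner lip)
}

# The five groups tile the full set of 68 landmarks without overlap.
total = sum(len(r) for r in LANDMARK_GROUPS.values())
```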
Feature Extraction Process
Step 1: Face Detection
The Haar Cascade classifier scans the grayscale image to detect faces:

- `scaleFactor=1.1` - Image pyramid scaling factor
- `minNeighbors=5` - Minimum neighbors required for face detection (reduces false positives)
Step 2: Landmark Fitting
Once a face is detected, the LBF model fits 68 landmarks to the facial region.

Step 3: Feature Normalization
Raw landmark coordinates vary based on face position and size. EmoChat normalizes these coordinates to make them position- and scale-invariant.

Why Normalization?
Normalizing coordinates makes the model invariant to:
- Face size (distance from camera)
- Face position (location in frame)
- Image resolution
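A plausible normalization sketch, assuming the bounding box of the landmarks themselves is used as the reference frame (the exact scheme lives in get_face_landmarks()):

```python
import numpy as np

def normalize_landmarks(points):
    """Map 68 (x, y) landmark points into [0, 1] relative to their bounding box.

    `points` is a (68, 2) array of pixel coordinates; the result is a flat
    136-value feature vector invariant to face position and size.
    """
    points = np.asarray(points, dtype=np.float64)
    mins = points.min(axis=0)
    span = points.max(axis=0) - mins
    span[span == 0] = 1.0  # guard against a degenerate (zero-width) box
    return ((points - mins) / span).ravel()
```

Because only relative positions survive, translating or uniformly scaling the face leaves the feature vector unchanged.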
Feature Vector Output
The final feature vector contains 136 values (68 points × 2 coordinates). Each value lies in the range [0, 1], representing a normalized position within the facial bounding box.
Currently Supported Emotions
EmoChat currently recognizes 2 core emotions:

Happy
Detected when facial features show:
- Raised cheek muscles
- Mouth corners elevated
- Crow’s feet around eyes
Sad
Detected when facial features show:
- Downturned mouth corners
- Lowered eyebrows
- Relaxed facial muscles
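The index-to-label mapping for these two emotions might look like the sketch below; the ordering here is hypothetical, and the authoritative list is defined in app.py:19:

```python
# Hypothetical ordering -- the real list is defined in app.py:19.
EMOTIONS = ["happy", "sad"]

def label_for(index):
    """Map the model's integer output (0 or 1) to an emotion label."""
    return EMOTIONS[index]
```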
The emotion labels are defined in app.py:19. The model outputs an integer index (0 or 1) which maps to these labels.

Real-time Processing
Webcam Integration
The JavaScript frontend (main.js) captures a frame from the webcam every second.
Frame Processing Flow
- Capture - JavaScript captures a frame from the webcam video element
- Encode - The frame is converted to JPEG and Base64 encoded
- Send - The data is sent to the Flask `/predict` endpoint via HTTP POST
- Decode - The backend decodes the Base64 data into an image array
- Extract - `get_face_landmarks()` extracts normalized features
- Predict - The model classifies the emotion
- Return - The emotion label is sent back to the frontend
- Display - The UI updates with the detected emotion
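The backend half of this flow can be sketched as a plain function; `extract_features` stands in for get_face_landmarks() and `model` for the trained classifier, and the label order is hypothetical (the real one is in app.py:19):

```python
import base64
import numpy as np

EMOTIONS = ["happy", "sad"]  # hypothetical order; see app.py:19

def handle_predict(b64_frame, extract_features, model):
    """Sketch of the /predict backend flow: decode -> extract -> predict.

    The feature extractor and classifier are injected so the sketch stays
    self-contained; the real endpoint wires in its own implementations.
    """
    try:
        jpeg_bytes = base64.b64decode(b64_frame, validate=True)
    except Exception:
        return {"emotion": "Invalid input"}
    # In the real backend, cv2.imdecode turns the JPEG bytes into a pixel array.
    image = np.frombuffer(jpeg_bytes, dtype=np.uint8)
    features = extract_features(image)
    if features is None:
        return {"emotion": "No face detected"}
    index = model.predict([features])[0]
    return {"emotion": EMOTIONS[index]}
```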
Error Handling
No Face Detected
When no face is found in the frame, the endpoint returns `{"emotion": "No face detected"}`.
Invalid Input
The function validates the image format before processing.

Visualization (Optional)
For debugging, landmarks can be drawn on the image. See test_model.py for real-time visualization during development.
Next Steps
ML Model
Learn how the Random Forest classifier is trained and makes predictions
Architecture
Understand the complete system architecture and data flow

