
Overview

EmoChat’s frontend provides an elegant, user-friendly interface for real-time emotion recognition. The interface combines HTML5 video capture, JavaScript event handling, and REST API communication to create a seamless emotion tracking experience.

Interface Components

The web application consists of several key sections defined in index.html:

Hero Section

A welcoming landing area with call-to-action buttons and emotional support messaging:
<h1 class="hero-title reveal">
  "No tienes que hacerlo perfecto,<br />
  solo tienes que empezar."
</h1>

Emotion Cards

Educational cards explaining the four base emotions (Alegría, Tristeza, Enojo, Sorpresa).

Interactive Webcam Section

The core feature where users activate their camera and receive real-time emotion analysis.

Webcam Integration

HTML Structure

The webcam interface consists of:
<div class="webcam-container" id="webcam-container">
  <video id="webcam" autoplay playsinline></video>
  <canvas id="canvas" style="display: none;"></canvas>
  <button id="fullscreen-btn" class="fullscreen-btn" aria-label="Pantalla completa">
    <!-- icono SVG omitido en este extracto -->
  </button>
  <div class="emotion-overlay">
    <span id="emotion-result">Iniciando cámara...</span>
  </div>
</div>
Key Elements:
  • Video element: Displays the live webcam feed
  • Canvas element: Hidden, used to capture frames for processing
  • Emotion overlay: Displays the detected emotion in real-time
  • Fullscreen button: Allows users to expand the webcam view
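The fullscreen handler itself is not shown in the `main.js` excerpts below. A minimal sketch of how `#fullscreen-btn` could be wired up with the Fullscreen API might look like this (the helper name and the injected `doc` parameter are our assumptions for testability, not part of the actual code):

```javascript
// Hypothetical helper (not from main.js): toggle fullscreen on the
// webcam container. `doc` is injected so the logic can be exercised
// outside a browser; in the page you would pass `document`.
function toggleFullscreen(container, doc) {
  if (!doc.fullscreenElement) {
    // Nothing is fullscreen yet: request it on the container.
    return container.requestFullscreen();
  }
  // Something is already fullscreen: exit.
  return doc.exitFullscreen();
}
```

In the page this would typically be attached with `fullscreenBtn.addEventListener('click', () => toggleFullscreen(webcamContainer, document))`.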

Control Buttons

<div class="webcam-controls">
  <button id="start-btn" class="btn btn-outline" aria-label="Activar cámara">
    Activar Cámara
  </button>
  <button id="stop-btn" class="btn btn-outline btn-stop" aria-label="Pausar cámara" disabled>
    Pausar
  </button>
  <button id="record-btn" class="btn btn-primary" aria-label="Grabar Análisis de 30s" disabled>
    Grabar Análisis (30s)
  </button>
</div>

JavaScript Interaction Flow

Initialization

The main JavaScript file (main.js) initializes on page load:
document.addEventListener('DOMContentLoaded', () => {
    const video = document.getElementById('webcam');
    const canvas = document.getElementById('canvas');
    const emotionResult = document.getElementById('emotion-result');
    const startBtn = document.getElementById('start-btn');
    const stopBtn = document.getElementById('stop-btn');
    const recordBtn = document.getElementById('record-btn');
    
    let stream = null;
    let predictionInterval = null;
    // ... initialization continues
});

Starting the Webcam

When the user clicks “Activar Cámara”:
startBtn.addEventListener('click', async () => {
    try {
        stream = await navigator.mediaDevices.getUserMedia({ video: true });
        video.srcObject = stream;
        
        startBtn.disabled = true;
        stopBtn.disabled = false;
        recordBtn.disabled = false;
        emotionResult.textContent = 'Analizando...';

        // { once: true } prevents listeners (and their intervals)
        // from stacking across repeated start/stop cycles
        video.addEventListener('loadeddata', () => {
            canvas.width = video.videoWidth;
            canvas.height = video.videoHeight;
            
            // Start sending frames every 1000ms (1 second)
            predictionInterval = setInterval(sendFrameForPrediction, 1000);
        }, { once: true });

    } catch (error) {
        console.error('Error accessing webcam:', error);
        emotionResult.textContent = 'Error: No se pudo acceder a la cámara.';
        // ... error handling
    }
});
Process:
  1. Request camera access via navigator.mediaDevices.getUserMedia()
  2. Attach video stream to <video> element
  3. Update button states (disable start, enable stop/record)
  4. Wait for video metadata to load
  5. Start frame capture interval (1 frame per second)

Frame Capture and Prediction

Every second, a frame is captured and sent to the Flask backend:
async function sendFrameForPrediction() {
    if (!video.videoWidth) return;

    const context = canvas.getContext('2d');
    context.drawImage(video, 0, 0, canvas.width, canvas.height);
    
    // Convert to base64 jpeg
    const dataUrl = canvas.toDataURL('image/jpeg', 0.8);

    try {
        const response = await fetch('/predict', {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json'
            },
            body: JSON.stringify({ image: dataUrl })
        });

        const data = await response.json();
        if (data.emotion) {
            emotionResult.textContent = `Emoción detectada: ${data.emotion}`;
            
            // Update UI based on emotion
            const overlay = document.querySelector('.emotion-overlay');
            overlay.className = 'emotion-overlay';
            overlay.classList.add(`emotion-${data.emotion.toLowerCase()}`);
        }

    } catch (error) {
        console.error('Prediction error:', error);
    }
}
Steps:
  1. Draw current video frame to hidden canvas
  2. Convert canvas to base64-encoded JPEG (80% quality)
  3. Send to /predict endpoint via POST request
  4. Update emotion display with result
  5. Apply CSS class for visual feedback
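The class swap in step 5 can be viewed as a small pure function (the helper name is illustrative, not from `main.js`):

```javascript
// Illustrative helper: derive the CSS classes applied to the overlay
// from the emotion string returned by /predict. Mirrors:
//   overlay.className = 'emotion-overlay';
//   overlay.classList.add(`emotion-${data.emotion.toLowerCase()}`);
function overlayClassFor(emotion) {
  return `emotion-overlay emotion-${emotion.toLowerCase()}`;
}
```

For example, a backend response of `"HAPPY"` yields the class string `emotion-overlay emotion-happy`, which the stylesheet can target for color or icon changes.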

Stopping the Webcam

stopBtn.addEventListener('click', () => {
    if (stream) {
        stream.getTracks().forEach(track => track.stop());
        video.srcObject = null;
    }
    if (predictionInterval) {
        clearInterval(predictionInterval);
        predictionInterval = null;
    }
    
    startBtn.disabled = false;
    stopBtn.disabled = true;
    recordBtn.disabled = true;
    emotionResult.textContent = 'Cámara pausada.';
});

Session Recording Feature

Context Input

Users provide context about what they’ll discuss:
<div class="webcam-context">
  <label for="session-context">Contexto (¿De qué vas a hablar?):</label>
  <textarea id="session-context" rows="2" 
            placeholder="Ej: Me gustaría hablar sobre cómo me sentí hoy en el trabajo...">
  </textarea>
</div>

Recording State Management

The application tracks recording state with several variables:
let isRecordingSession = false;
let recordedEmotions = [];
let recordingCountdown = 30;
let timerInterval = null;

Starting a Recording Session

function startRecording() {
    const contextText = sessionContextInput.value.trim();
    if (!contextText) {
        alert('Por favor, escribe de qué vas a hablar en el campo de contexto antes de grabar.');
        sessionContextInput.focus();
        return;
    }

    isRecordingSession = true;
    recordedEmotions = [];
    recordingCountdown = 30;
    
    recordBtn.textContent = 'Detener Grabación';
    sessionContextInput.disabled = true;
    
    recordingTimer.style.display = 'block';
    recordingTimer.textContent = `00:${recordingCountdown}`;
    
    timerInterval = setInterval(() => {
        recordingCountdown--;
        const formattedTime = recordingCountdown < 10 ? `0${recordingCountdown}` : recordingCountdown;
        recordingTimer.textContent = `00:${formattedTime}`;
        
        if (recordingCountdown <= 0) {
            stopRecordingAndAnalyze();
        }
    }, 1000);
}
Process:
  1. Validate that context is provided
  2. Initialize recording state
  3. Update UI (change button text, disable context input)
  4. Display countdown timer
  5. Start 30-second countdown
  6. Auto-stop and analyze when timer reaches zero
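The timer formatting inside `startRecording` can be factored into a pure helper (the function name is ours; sessions are capped at 30 seconds, so the minutes field is always `00`):

```javascript
// Illustrative helper: format remaining seconds as MM:SS, matching
// the `00:${formattedTime}` logic in startRecording.
function formatCountdown(seconds) {
  const padded = seconds < 10 ? `0${seconds}` : `${seconds}`;
  return `00:${padded}`;
}
```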

Tracking Emotions During Recording

As frames are processed during recording, emotions are stored:
if (isRecordingSession && emotionKey !== 'no face detected') {
    const emMap = {
        'angry': 'Enojo',
        'happy': 'Alegría',
        'sad': 'Tristeza',
        'surprised': 'Sorpresa'
    };
    recordedEmotions.push(emMap[emotionKey] || data.emotion);
}
Currently, the trained model only returns “HAPPY” or “SAD”; the mapping includes angry and surprised for future extensibility, but those emotions are not yet supported.
This produces an array like ["Tristeza", "Alegría", "Tristeza", ...] (limited to Alegría/Tristeza for now).
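An array like this could also be summarized client-side, for instance to show the user a quick tally of the session. This helper is hypothetical (the real summarization is delegated to Gemini, as shown in the next section):

```javascript
// Hypothetical helper: tally recorded emotions into counts, e.g.
// ["Tristeza", "Alegría", "Tristeza"] -> { Tristeza: 2, Alegría: 1 }
function tallyEmotions(recorded) {
  return recorded.reduce((counts, emotion) => {
    counts[emotion] = (counts[emotion] || 0) + 1;
    return counts;
  }, {});
}
```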

Analyzing the Session

After 30 seconds, the recorded data is sent to Gemini AI:
async function stopRecordingAndAnalyze() {
    stopRecording();
    
    const contextText = sessionContextInput.value.trim();
    if (recordedEmotions.length === 0) {
        alert('No se detectaron emociones durante la sesión.');
        return;
    }

    geminiResultText.textContent = '⏱️ Gemini está analizando tu discurso y tus expresiones faciales...';
    
    try {
        const response = await fetch('/analyze_session', {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json'
            },
            body: JSON.stringify({
                context: contextText,
                emotions: recordedEmotions
            })
        });
        
        const data = await response.json();
        
        if (data.error) {
            geminiResultText.textContent = `❌ Error: ${data.error}`;
        } else if (data.analysis) {
            geminiResultText.textContent = data.analysis;
        }
    } catch (error) {
        console.error('Error in analyze_session:', error);
        geminiResultText.textContent = '❌ Ocurrió un error al contactar con la IA.';
    }
}
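Note that `stopRecordingAndAnalyze` calls a `stopRecording()` whose body is not shown in this excerpt. A plausible sketch, assuming it simply reverses the state changes made in `startRecording` (the `state` and `ui` parameters are injected here for illustration; the real function mutates the module-level variables and DOM elements directly):

```javascript
// Plausible sketch of stopRecording(): undo startRecording's changes.
// `state` holds the recording flags, `ui` stands in for the DOM bits.
function stopRecording(state, ui) {
  state.isRecordingSession = false;
  if (state.timerInterval) {
    clearInterval(state.timerInterval);  // stop the countdown
    state.timerInterval = null;
  }
  ui.recordBtnText = 'Grabar Análisis (30s)';  // restore button label
  ui.contextDisabled = false;                  // re-enable the textarea
  ui.timerVisible = false;                     // hide the countdown
  return state;
}
```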

Error Handling

The frontend provides helpful error messages for common camera issues:
let errMsg = `No se pudo acceder a la cámara.\n\nError: ${error.name}\n`;
if (error.name === 'NotReadableError' || error.name === 'TrackStartError') {
    errMsg += "👉 POSIBLE CAUSA: Otra aplicación está usando la cámara.";
} else if (error.name === 'NotAllowedError') {
    errMsg += "👉 POSIBLE CAUSA: Denegaste el permiso de cámara.";
} else {
    errMsg += "👉 POSIBLE CAUSA: Debes acceder vía http://127.0.0.1:5000/";
}
alert(errMsg);
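The branching above is easy to isolate as a pure mapping from the `DOMException` name to the user-facing hint (helper name is illustrative, not from `main.js`):

```javascript
// Illustrative helper: map a getUserMedia error name to the hint
// appended to the alert message shown above.
function cameraErrorHint(errorName) {
  if (errorName === 'NotReadableError' || errorName === 'TrackStartError') {
    return '👉 POSIBLE CAUSA: Otra aplicación está usando la cámara.';
  }
  if (errorName === 'NotAllowedError') {
    return '👉 POSIBLE CAUSA: Denegaste el permiso de cámara.';
  }
  return '👉 POSIBLE CAUSA: Debes acceder vía http://127.0.0.1:5000/';
}
```

The fallback branch reflects that `getUserMedia` is only available in secure contexts, so loading the page from an arbitrary origin instead of localhost fails before permissions are even requested.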

Privacy Notice

The interface emphasizes user privacy:
<p class="fineprint">
  Tus datos faciales <strong>no</strong> se guardan. 
  Solo se analizan en tiempo real para mostrarte la emoción.
</p>
All processing happens in real-time with no server-side storage of images or video.
