Overview
The Unmute backend is a FastAPI application that orchestrates real-time voice conversations by coordinating between the frontend, STT, LLM, and TTS services.
Technology Stack:
- Framework: FastAPI (async Python web framework)
- WebSocket: Native FastAPI WebSocket support
- Async: Python asyncio for concurrent operations
- Serialization: Pydantic for message validation, MessagePack for STT/TTS
- Monitoring: Prometheus metrics
- Audio: sphn (Opus codec), NumPy (processing)
Application Structure
Main File:unmute/main_websocket.py
Request Lifecycle
HTTP Endpoints
WebSocket Endpoint
Concurrency Control
File:main_websocket.py:69
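Rationale for limiting the number of concurrent sessions per instance:
- The Python GIL limits true parallelism within a single process
- Scaling horizontally (running more backend instances) works better than packing more sessions into one instance
- A limit prevents resource exhaustion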
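One common way to enforce a per-instance limit of this kind is an asyncio semaphore that refuses new sessions once every slot is taken. The sketch below is illustrative only and is not taken from main_websocket.py: `MAX_CLIENTS`, `handle_session`, and the limit of 2 are hypothetical.

```python
import asyncio

MAX_CLIENTS = 2  # hypothetical limit, chosen for illustration only


async def handle_session(
    sem: asyncio.Semaphore, active: list[int], peak: list[int]
) -> bool:
    """Run one simulated session, or refuse it if the instance is full."""
    if sem.locked():
        # Every slot is taken: refuse the client instead of queueing it.
        return False
    async with sem:
        active[0] += 1
        peak[0] = max(peak[0], active[0])
        await asyncio.sleep(0.01)  # stand-in for the real conversation
        active[0] -= 1
    return True


async def demo_sessions() -> tuple[list[bool], int]:
    sem = asyncio.Semaphore(MAX_CLIENTS)
    active, peak = [0], [0]
    results = await asyncio.gather(
        *(handle_session(sem, active, peak) for _ in range(5))
    )
    return list(results), peak[0]


if __name__ == "__main__":
    accepted, peak = asyncio.run(demo_sessions())
    print(accepted, peak)
```

Refusing immediately, rather than letting clients wait on the semaphore, keeps latency predictable for the sessions already running and lets a load balancer route the refused client to another instance.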
WebSocket Protocol
Subprotocol Negotiation
File:main_websocket.py:314
Message Validation
File:main_websocket.py:79
Validating messages with Pydantic provides:
- Type safety
- Automatic validation
- Clear error messages
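As a rough illustration of these benefits, the sketch below validates incoming JSON against a tagged union of message models. It assumes Pydantic v2. The message classes, field names, and `parse_client_event` are hypothetical and are not the actual Unmute protocol.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter, ValidationError


# Hypothetical message types; the real protocol defines its own events.
class AudioAppend(BaseModel):
    type: Literal["audio.append"] = "audio.append"
    audio: str  # base64-encoded audio payload


class SessionUpdate(BaseModel):
    type: Literal["session.update"] = "session.update"
    voice: str


# The "type" field selects which model is used for validation.
ClientEvent = Annotated[
    Union[AudioAppend, SessionUpdate], Field(discriminator="type")
]
client_event_adapter = TypeAdapter(ClientEvent)


def parse_client_event(raw: str) -> Union[BaseModel, str]:
    """Return a typed event, or a readable error string if validation fails."""
    try:
        return client_event_adapter.validate_json(raw)
    except ValidationError as exc:
        return f"invalid message: {exc.error_count()} error(s)"
```

A well-formed payload comes back as a typed model instance, while a payload with a missing field or an unknown `type` yields an error description that can be reported to the client instead of crashing the handler.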
Two-Loop Architecture
File:main_websocket.py:380
The backend uses two concurrent loops:
- Receive Loop: handles incoming messages from the client
- Emit Loop: sends messages to the client
Running both loops as tasks in one group means:
- All tasks are cancelled if one fails
- Exceptions propagate automatically
- Shutdown is clean
Receive Loop
File:main_websocket.py:406
Opus Decoding
File:main_websocket.py:461
Decoding goes through asyncio.to_thread:
- Opus decoding is CPU-bound
- Running it in a thread pool avoids blocking the event loop
- Other connections can keep processing concurrently
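The effect of offloading can be seen in a small experiment. In this sketch `decode_frame` is a hypothetical stand-in for the Opus decoder and uses `time.sleep` to simulate blocking work. It assumes the real decoder releases the GIL while it runs, which is typical for native codecs but is not confirmed here.

```python
import asyncio
import time


def decode_frame(payload: bytes) -> list[int]:
    """Hypothetical blocking decoder standing in for Opus decoding."""
    time.sleep(0.05)  # simulate work that would otherwise stall the loop
    return list(payload)


async def heartbeat(ticks: list[float]) -> None:
    """Stand-in for other connections that need the event loop."""
    for _ in range(5):
        ticks.append(time.monotonic())
        await asyncio.sleep(0.005)


async def demo_decode() -> tuple[list[int], int]:
    ticks: list[float] = []
    other = asyncio.create_task(heartbeat(ticks))
    # The blocking call runs in a worker thread; the loop keeps serving `other`.
    samples = await asyncio.to_thread(decode_frame, b"\x01\x02\x03")
    ticks_during_decode = len(ticks)
    await other
    return samples, ticks_during_decode


if __name__ == "__main__":
    print(asyncio.run(demo_decode()))
```

If `decode_frame` were called directly inside the coroutine, the heartbeat would record no further ticks until decoding finished.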
Reconnection Handling
File:main_websocket.py:462
Emit Loop
File:main_websocket.py:512
Opus Encoding
File:main_websocket.py:550
Error Handling
Exception Reporter
File:main_websocket.py:334
CORS Error Handling
File:main_websocket.py:594
Health Checks
File:main_websocket.py:137
- Parallel health checks (TaskGroup)
- Cached for 0.5s (avoid hammering services)
- Used before accepting WebSocket connections
CORS Configuration
File:main_websocket.py:84
Configuration
File:unmute/kyutai_constants.py
Deployment
Docker Compose
File:docker-compose.yml:32
Dockerfile
File:Dockerfile
Running Locally
Monitoring
Prometheus Metrics
File:main_websocket.py:74
- /metrics endpoint
- HTTP request duration, status codes, etc.
- Custom metrics from unmute/metrics.py
Grafana Dashboard
File:services/grafana/dashboards/unmute-monitoring-*.json
Pre-configured dashboard for:
- Active sessions
- Latency percentiles (STT TTFT, TTS TTFT, VLLM TTFT)
- Error rates
- Throughput (words/sec)
Next Steps
- Frontend - Client-side implementation
- Data Flow - End-to-end message flow
- Components - Component details