Overview
The LangShazam backend is built with FastAPI, leveraging Python’s async capabilities for high-performance real-time audio processing. The architecture is modular, with clear separation of concerns across different components.

Application Structure
Main Application Entry Point
The application is initialized in backend/src/main.py:24-39 with FastAPI and CORS middleware:
The application uses a dependency injection pattern, passing shared instances of AudioProcessor and ServerMetrics to the WebSocketManager.

Core Components
AudioProcessor
Location: backend/src/audio_processor.py
Purpose
Handles all audio processing and OpenAI API interactions with rate limiting and connection tracing.
Key Features
Rate Limiting with Semaphores
The AudioProcessor implements concurrent call limiting using asyncio semaphores. This prevents overloading the OpenAI API and ensures fair resource distribution across connections.

Reference: audio_processor.py:15-17
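As a sketch of how such a limit can look (class and attribute names are assumptions, not the project’s actual code; the limit of 3 is the one cited under Performance Optimizations below):

```python
import asyncio

MAX_CONCURRENT_CALLS = 3  # cap cited on this page

class AudioProcessor:
    def __init__(self) -> None:
        self._api_semaphore = asyncio.Semaphore(MAX_CONCURRENT_CALLS)

    async def process(self, audio: bytes) -> str:
        # At most MAX_CONCURRENT_CALLS coroutines enter this block at once;
        # the rest wait, which keeps the OpenAI API from being flooded.
        async with self._api_semaphore:
            return await self._transcribe(audio)

    async def _transcribe(self, audio: bytes) -> str:
        await asyncio.sleep(0)  # placeholder for the real API call
        return "transcript"
```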
OpenAI API Integration
Audio is sent to OpenAI’s Whisper-1 model for transcription. The method uses asyncio.to_thread() to prevent blocking the event loop.

Reference: audio_processor.py:19-42
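A sketch of this pattern, assuming the OpenAI Python SDK’s `client.audio.transcriptions.create` call and an in-memory buffer (the function name and file name are illustrative):

```python
import asyncio
import io

async def transcribe(client, audio_bytes: bytes) -> str:
    """Run the blocking OpenAI SDK call in a worker thread."""
    buffer = io.BytesIO(audio_bytes)
    buffer.name = "chunk.wav"  # the SDK infers the audio format from the name

    def _call():
        # Blocking HTTP request; must not run on the event loop.
        return client.audio.transcriptions.create(
            model="whisper-1",
            file=buffer,
        )

    response = await asyncio.to_thread(_call)
    return response.text
```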
Connection Tracing
Every request includes a unique connection ID for tracking. This enables debugging and performance analysis for individual requests.

Reference: audio_processor.py:53, 41

Processing Flow
WebSocketManager
Location: backend/src/websocket_manager.py
Purpose
Manages WebSocket connections, coordinates audio buffering, and handles the request-response lifecycle.
Connection Lifecycle
Connection Establishment
When a client connects, a unique 8-character connection ID is generated.

Reference: websocket_manager.py:19-22

Audio Buffering
Incoming audio chunks are buffered until a minimum size is reached.

Reference: websocket_manager.py:29-49

Processing & Response
Once enough audio is buffered, it is processed and the results are sent to the client.

Reference: websocket_manager.py:65-70

Error Handling
Reference: websocket_manager.py:78-89
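The connect → buffer → process → respond lifecycle above can be sketched as follows (class names and the `send_json` payload are assumptions; the real manager also records metrics and distinguishes disconnects from processing errors):

```python
import asyncio
import uuid

MIN_BUFFER_SIZE = 20 * 1024  # the page cites a 20 KB minimum

class WebSocketManager:
    """Sketch of the connect/buffer/process/respond loop described above."""

    def __init__(self, processor) -> None:
        self._processor = processor

    async def handle(self, websocket) -> None:
        connection_id = uuid.uuid4().hex[:8]  # unique 8-character connection ID
        buffer = bytearray()
        try:
            while True:
                chunk = await websocket.receive_bytes()
                buffer.extend(chunk)
                if len(buffer) < MIN_BUFFER_SIZE:
                    continue  # keep buffering until the minimum size is reached
                text = await self._processor.process(bytes(buffer), connection_id)
                buffer.clear()
                await websocket.send_json({"language": text})
        except Exception:
            # The real code separates WebSocketDisconnect (clean close) from
            # processing errors; see websocket_manager.py:78-89.
            raise
```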
ServerMetrics
Location: backend/src/metrics.py
Purpose
Collects and reports server performance metrics including connections, processing times, CPU, and memory usage.
Tracked Metrics
Reference: metrics.py:15-21
Metric Categories
Connection Metrics
- Active connections
- Total requests
- Error count
Performance Metrics
- Average processing time
- Per-request timing (last 100)
System Metrics
- Memory usage (MB)
- CPU usage (total & per-core)
- Effective cores utilized
Process Metrics
- Active processes
- CPU core count
Metrics Logging
Detailed Metrics Output
Metrics are logged with connection context.

Reference: metrics.py:23-44

Request Flow & Lifecycle
Configuration System
Location: backend/src/config/settings.py
The backend uses a centralized configuration system with environment variable support:
- Server Config
- Audio Config
- OpenAI Config
- Logging Config
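A minimal sketch of such an environment-driven settings object (field names, environment variable names, and defaults are illustrative, not the project’s actual keys):

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    """Centralized configuration with environment variable overrides."""
    host: str = os.getenv("HOST", "0.0.0.0")                               # server config
    port: int = int(os.getenv("PORT", "8000"))                             # server config
    min_buffer_size: int = int(os.getenv("MIN_BUFFER_SIZE", str(20 * 1024)))  # audio config
    openai_api_key: str = os.getenv("OPENAI_API_KEY", "")                  # OpenAI config
    log_level: str = os.getenv("LOG_LEVEL", "INFO")                        # logging config

settings = Settings()
```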
Error Handling and Logging
Logging Strategy
Structured Logging
The backend implements structured logging with connection tracing throughout the request lifecycle.
Reference: main.py:16-22
Error Categories
WebSocket Errors
Handled in websocket_manager.py:78-89:
- WebSocketDisconnect: Clean client disconnection
- Processing Errors: Errors during audio processing
- General Exceptions: Unexpected errors with full error reporting
API Errors
Handled in audio_processor.py:43-45:
- OpenAI API failures
- Network timeouts
- Rate limit exceeded
Memory Management
The AudioProcessor explicitly calls garbage collection after processing to manage memory.

Reference: audio_processor.py:72
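In sketch form (the function name is illustrative):

```python
import gc

def finish_request(buffer: bytearray) -> None:
    # Audio buffers can be large; clearing the reference and forcing a
    # collection pass returns memory sooner than waiting for the next
    # automatic GC cycle.
    buffer.clear()
    gc.collect()
```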
API Endpoints
WebSocket Endpoint
/ws
Protocol: WebSocket
Purpose: Real-time language detection
Handler: websocket_endpoint (main.py:51-57)
REST Endpoints
GET /
Health check endpoint.

Reference: main.py:41-44

GET /metrics
Server metrics endpoint.

Reference: main.py:46-49

Performance Optimizations
Async Processing
All I/O operations use async/await for non-blocking execution
Semaphore Rate Limiting
Prevents API overload with max 3 concurrent OpenAI calls
Audio Buffering
Waits for minimum 20KB before processing to reduce API calls
Memory Management
Explicit garbage collection after audio processing
Connection Pooling
Reuses HTTP connections through OpenAI SDK
Metrics Deque
Rolling window of last 100 processing times for efficiency
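The rolling window is a natural fit for `collections.deque` with `maxlen` (the class name here is illustrative):

```python
from collections import deque

class ProcessingTimes:
    """Rolling window of the last N processing times (the page cites 100)."""

    def __init__(self, window: int = 100) -> None:
        # deque(maxlen=...) evicts the oldest entry in O(1) on append,
        # so the window never grows beyond `window` items.
        self._times: deque[float] = deque(maxlen=window)

    def record(self, seconds: float) -> None:
        self._times.append(seconds)

    @property
    def average(self) -> float:
        return sum(self._times) / len(self._times) if self._times else 0.0
```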
Related Documentation
Architecture Overview
High-level system architecture and component interaction
Frontend Architecture
React frontend implementation details

