Combined Audio/Video Streaming
This advanced example demonstrates how to stream video from a USB camera (UVC) while simultaneously capturing and playing audio (UAC), essentially creating a complete USB multimedia system on ESP32.
What This Example Demonstrates
- Simultaneous UVC camera and UAC audio streaming
- Concurrent microphone capture and speaker output
- Efficient resource management for multiple streams
- Coordinating callbacks for different streams
- Performance optimization techniques
- Real-world application patterns
Hardware Setup
Required Components:
- ESP32-S3 development board (recommended for better performance)
- USB camera with UVC support
- USB microphone (UAC-compatible)
- USB speaker or audio output (UAC-compatible)
- Powered USB hub (recommended for multiple devices)
- Adequate power supply (5V 2A minimum)
- Each device draws power from USB:
  - Camera: 200-500mA
  - Microphone: 50-100mA
  - Speaker: 100-500mA
  - Total: use a powered USB hub for stability
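The per-device figures above can be summed to see why a powered hub is recommended. The sketch below is only arithmetic on the upper ends of the listed ranges; the constant and function names are hypothetical, and the 500 mA figure is the USB 2.0 limit for a single bus-powered port.

```cpp
#include <cstdint>

// Peak current draw per device in mA, taken from the upper end of each range.
constexpr uint32_t kCameraPeakMa  = 500;
constexpr uint32_t kMicPeakMa     = 100;
constexpr uint32_t kSpeakerPeakMa = 500;

// A single USB 2.0 port supplies at most 500 mA to a bus-powered device.
constexpr uint32_t kUsb2PortLimitMa = 500;

// Worst-case combined draw of camera, microphone, and speaker.
constexpr uint32_t totalPeakMa() {
    return kCameraPeakMa + kMicPeakMa + kSpeakerPeakMa;
}

// True when the combined peak draw exceeds what one port can supply.
constexpr bool needsPoweredHub() {
    return totalPeakMa() > kUsb2PortLimitMa;
}

static_assert(totalPeakMa() == 1100, "500 + 100 + 500 mA");
static_assert(needsPoweredHub(), "three devices exceed one bus-powered port");
```

At roughly 1.1 A peak for the peripherals alone, the 5V 2A minimum supply leaves headroom for the board itself.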
Complete Code
Code Explanation
1. Global Statistics Tracking
- Monitor stream health (are all callbacks firing?)
- Detect performance issues (frame rate drops)
- Debug resource usage
- Verify synchronization
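As a rough illustration of the tracking described above, the following sketch keeps one counter per stream and derives a frame rate from two readings. The struct, variable, and function names are hypothetical and not taken from the example's code; atomics are used on the assumption that the callbacks run on different threads.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical per-stream counters, incremented from the stream callbacks.
struct StreamStats {
    std::atomic<uint32_t> cameraFrames{0};
    std::atomic<uint32_t> cameraBytes{0};
    std::atomic<uint32_t> micCallbacks{0};
    std::atomic<uint32_t> speakerCallbacks{0};
};

StreamStats g_stats;

// Call from the camera callback: count the frame and its size.
void onCameraFrame(uint32_t frameBytes) {
    g_stats.cameraFrames.fetch_add(1, std::memory_order_relaxed);
    g_stats.cameraBytes.fetch_add(frameBytes, std::memory_order_relaxed);
}

// Call from the microphone and speaker callbacks respectively.
void onMicData()     { g_stats.micCallbacks.fetch_add(1, std::memory_order_relaxed); }
void onSpeakerData() { g_stats.speakerCallbacks.fetch_add(1, std::memory_order_relaxed); }

// Frame rate over an interval, computed from two readings of cameraFrames.
// Unsigned subtraction keeps the result correct across counter wraparound.
float framesPerSecond(uint32_t framesBefore, uint32_t framesNow, uint32_t elapsedMs) {
    if (elapsedMs == 0) return 0.0f;
    return (framesNow - framesBefore) * 1000.0f / elapsedMs;
}
```

A counter that stops increasing between two readings indicates that the corresponding callback is no longer firing.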
2. Coordinated Callbacks
Each stream has its own callback running independently:
- Callbacks run on different threads
- Keep each callback fast (< 10ms)
- Avoid blocking operations
- Use thread-safe data structures for inter-callback communication
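One possible thread-safe structure for passing data between two callbacks is a single-producer/single-consumer ring buffer. The sketch below is a generic implementation, not the example's own code; the class name `SpscRing` is hypothetical, and it is only safe when exactly one thread writes and exactly one thread reads.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Lock-free ring buffer for exactly one writer thread and one reader thread.
// Capacity must be a power of two so indices can be masked instead of divided.
template <size_t Capacity>
class SpscRing {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");
public:
    // Producer side: copies up to len bytes and returns how many were stored.
    size_t write(const uint8_t* data, size_t len) {
        const size_t head = head_.load(std::memory_order_relaxed);
        const size_t tail = tail_.load(std::memory_order_acquire);
        const size_t freeSpace = Capacity - (head - tail);
        if (len > freeSpace) len = freeSpace;  // drop what does not fit
        for (size_t i = 0; i < len; ++i) {
            buf_[(head + i) & (Capacity - 1)] = data[i];
        }
        head_.store(head + len, std::memory_order_release);
        return len;
    }

    // Consumer side: copies up to len bytes out and returns how many were read.
    size_t read(uint8_t* out, size_t len) {
        const size_t tail = tail_.load(std::memory_order_relaxed);
        const size_t head = head_.load(std::memory_order_acquire);
        const size_t avail = head - tail;
        if (len > avail) len = avail;
        for (size_t i = 0; i < len; ++i) {
            out[i] = buf_[(tail + i) & (Capacity - 1)];
        }
        tail_.store(tail + len, std::memory_order_release);
        return len;
    }

    // Number of bytes currently stored.
    size_t available() const {
        return head_.load(std::memory_order_acquire) -
               tail_.load(std::memory_order_acquire);
    }

private:
    uint8_t buf_[Capacity];
    std::atomic<size_t> head_{0};  // total bytes ever written
    std::atomic<size_t> tail_{0};  // total bytes ever read
};
```

Neither `write` nor `read` blocks, so both stay cheap enough to call from a callback. A microphone callback could write samples into such a buffer while a speaker callback reads them back out.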
3. Memory Allocation Strategy
- Total RAM: ~400KB
- After buffers: ~220KB free
- Monitor with ESP.getFreeHeap()
- Use PSRAM for larger buffers if available
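The figures above imply that the stream buffers take roughly 180KB (about 400KB total minus about 220KB left free). The sketch below checks a set of planned buffer sizes against the free heap before allocating anything. The helper name `buffersFit` and the buffer sizes in the usage note are hypothetical; on the device, the first argument could be the value returned by `ESP.getFreeHeap()`.

```cpp
#include <cstddef>

// Approximate figures from the list above.
constexpr size_t kTotalRamBytes         = 400 * 1024;
constexpr size_t kFreeAfterBuffersBytes = 220 * 1024;

// What the buffers consume according to those figures (about 180KB).
constexpr size_t kBufferBudgetBytes = kTotalRamBytes - kFreeAfterBuffersBytes;

// Hypothetical helper: returns true if all planned buffers fit into the
// current free heap while still leaving reserveBytes unallocated.
bool buffersFit(size_t freeHeapBytes, const size_t* bufferSizes, size_t count,
                size_t reserveBytes) {
    size_t total = 0;
    for (size_t i = 0; i < count; ++i) {
        total += bufferSizes[i];
    }
    return total + reserveBytes <= freeHeapBytes;
}
```

For instance, three planned buffers of 128KB, 16KB, and 16KB fit into 400KB with a 220KB reserve, while raising the first to 200KB does not.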
4. Configuration for Multiple Streams
- Configure UVC first
- Configure UAC second
- Register all callbacks
- Call start() once (starts everything)
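The ordering above can be sketched with a stand-in object. `MockUsbStream` and all of its method names and parameter values are hypothetical placeholders, not the library's real API; the mock only checks that both configurations and all callbacks are in place before `start()` succeeds, and that `start()` is called once.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical stand-in for the USB stream object (NOT the real library API).
class MockUsbStream {
public:
    using DataCallback = std::function<void(const uint8_t*, size_t)>;

    void configureUvc(uint16_t width, uint16_t height, uint8_t fps) {
        (void)width; (void)height; (void)fps;
        uvcConfigured_ = true;
    }
    void configureUac(uint32_t sampleRateHz, uint8_t bitsPerSample) {
        (void)sampleRateHz; (void)bitsPerSample;
        uacConfigured_ = true;
    }
    void onCameraFrame(DataCallback cb) { cameraCb_ = std::move(cb); }
    void onMicData(DataCallback cb)     { micCb_ = std::move(cb); }
    void onSpeakerData(DataCallback cb) { speakerCb_ = std::move(cb); }

    // Starts everything. Fails if setup is incomplete or start() was
    // already called.
    bool start() {
        const bool ready = uvcConfigured_ && uacConfigured_ &&
                           cameraCb_ && micCb_ && speakerCb_;
        if (started_ || !ready) return false;
        started_ = true;
        return true;
    }

private:
    bool uvcConfigured_ = false;
    bool uacConfigured_ = false;
    bool started_ = false;
    DataCallback cameraCb_, micCb_, speakerCb_;
};

// The four steps in order. Resolution, frame rate, and audio format are
// placeholder values.
bool setupStreams(MockUsbStream& usb) {
    usb.configureUvc(640, 480, 15);                     // 1. UVC first
    usb.configureUac(16000, 16);                        // 2. UAC second
    usb.onCameraFrame([](const uint8_t*, size_t) {});   // 3. all callbacks
    usb.onMicData([](const uint8_t*, size_t) {});
    usb.onSpeakerData([](const uint8_t*, size_t) {});
    return usb.start();                                 // 4. start once
}
```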
5. Resource Monitoring
- Decreasing free heap (memory leak)
- Callbacks not incrementing (stream stopped)
- Uneven frame rates (performance issue)
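The first two warning signs above can be checked by comparing two periodic snapshots. This is a minimal sketch with hypothetical type and function names; it assumes the counters are sampled at a fixed interval and that the free heap value comes from something like `ESP.getFreeHeap()`. It does not cover uneven frame rates.

```cpp
#include <cstdint>

// One periodic snapshot of the monitored values.
struct HealthSample {
    uint32_t freeHeap;
    uint32_t cameraFrames;
    uint32_t micCallbacks;
    uint32_t speakerCallbacks;
};

// Bit flags describing what looks wrong between two snapshots.
enum HealthFlag : uint8_t {
    kHealthy        = 0,
    kHeapShrinking  = 1,
    kCameraStalled  = 2,
    kMicStalled     = 4,
    kSpeakerStalled = 8
};

// Compares two snapshots. heapToleranceBytes absorbs normal fluctuation so
// that only a larger drop in free heap is reported.
uint8_t checkHealth(const HealthSample& prev, const HealthSample& now,
                    uint32_t heapToleranceBytes) {
    uint8_t flags = kHealthy;
    if (prev.freeHeap > now.freeHeap &&
        prev.freeHeap - now.freeHeap > heapToleranceBytes) {
        flags |= kHeapShrinking;
    }
    if (now.cameraFrames == prev.cameraFrames)         flags |= kCameraStalled;
    if (now.micCallbacks == prev.micCallbacks)         flags |= kMicStalled;
    if (now.speakerCallbacks == prev.speakerCallbacks) flags |= kSpeakerStalled;
    return flags;
}
```

A single shrinking interval is not proof of a leak; the heap flag is more meaningful when it stays set across several consecutive intervals.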
Advanced Usage Patterns
Pattern 1: Audio Echo (Microphone → Speaker)
Create a real-time audio passthrough.
Pattern 2: Video Recording with Audio
Synchronize audio and video recording.
Pattern 3: WiFi Streaming (Video + Audio)
Stream to a network client.
Pattern 4: Motion Detection with Audio Alert
Detect motion in the video and trigger an audio alert.
Performance Optimization
1. Reduce Frame Rate
Lower FPS reduces CPU load.
2. Lower Camera Resolution
3. Minimize Callback Processing
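Bad: a callback that blocks. Keep each callback fast (< 10ms) and avoid blocking operations.
4. Use PSRAM (ESP32-S3)
If your ESP32-S3 has PSRAM, use it for the larger buffers.
Troubleshooting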
Problem: Only camera or audio works, not both
Cause: Configuration order or missing callback registration. Solution: Configure UVC first, then UAC, register all callbacks, and call start() once.
Problem: System crashes or resets
Causes:
- Memory allocation failed
- Buffer overflow
- Stack overflow in callback
Problem: Audio/video out of sync
Cause: Different callback rates and buffering delays. Solution: Add timestamps to synchronize the streams.
Problem: Dropped frames
Symptoms: Sequence numbers skip in the camera callback. Solutions:
- Reduce frame rate
- Increase buffer sizes
- Minimize callback processing
- Use lower resolution
- Check USB hub power
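To confirm whether any of these measures helps, skipped sequence numbers can be counted directly. The sketch below assumes the camera callback exposes a sequence number that increases by one per frame and fits in a `uint32_t`; the class name `DropCounter` is hypothetical.

```cpp
#include <cstdint>

// Counts frames that were skipped, based on gaps in the sequence numbers.
class DropCounter {
public:
    // Call once per received frame. Returns how many frames were missed
    // between the previous frame and this one.
    uint32_t onFrame(uint32_t sequence) {
        uint32_t missed = 0;
        if (haveLast_) {
            // Unsigned arithmetic keeps this correct across wraparound.
            missed = sequence - last_ - 1;
        }
        last_ = sequence;
        haveLast_ = true;
        totalDropped_ += missed;
        return missed;
    }

    // Total number of frames missed since this counter was created.
    uint32_t totalDropped() const { return totalDropped_; }

private:
    uint32_t last_ = 0;
    uint32_t totalDropped_ = 0;
    bool haveLast_ = false;
};
```

Comparing `totalDropped()` before and after a change (for example, a lower frame rate) shows whether the change reduced the drops.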
Real-World Applications
1. Video Conferencing Device
- Camera: Capture video
- Microphone: Capture voice
- Speaker: Play remote audio
- WiFi: Stream bidirectional A/V
2. Security Camera with Audio
- Camera: Motion detection
- Microphone: Audio events (glass breaking, etc.)
- Speaker: Two-way communication
- SD Card: Local recording
3. Baby Monitor
- Camera: Night vision camera
- Microphone: Cry detection
- Speaker: Soothing sounds
- WiFi: Stream to phone app
4. Podcast Recording Studio
- Camera: Video recording
- Microphone: Voice capture
- Speaker: Monitoring playback
- SD Card: High-quality recording
Related Examples
UVC Camera
Camera-only streaming basics
UAC Microphone
Microphone capture fundamentals
UAC Speaker
Speaker output basics