Overview
TCP Streamer implements a sophisticated audio streaming pipeline that captures audio from input devices and transmits raw PCM data over TCP with minimal latency. The system uses a lock-free ring buffer architecture and precision timing to ensure smooth, consistent audio delivery.
Audio Pipeline Architecture
The streaming system consists of two independent threads:
Producer Thread (Audio Capture)
- Captures audio via the cpal library (cross-platform audio)
- Converts all input formats (F32, I16, U16) to internal F32 format
- Pushes audio frames to a lock-free ring buffer
- Runs at the device’s native sample rate and buffer size
Consumer Thread (Network Transmission)
- Reads audio chunks from the ring buffer
- Converts F32 samples back to I16 PCM for transmission
- Sends data over TCP with precision pacing
- Monitors network quality and adjusts behavior dynamically
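The hand-off between the two threads can be illustrated with a minimal single-producer/single-consumer ring buffer. This is a std-only sketch, not TCP Streamer's actual buffer: the `SpscRing` type, its methods, and the power-of-two capacity rule are assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicU32, AtomicUsize, Ordering};

/// Hypothetical single-producer/single-consumer ring of F32 samples.
/// Samples are stored as raw bits in atomics, so no `unsafe` is needed.
pub struct SpscRing {
    slots: Vec<AtomicU32>,
    head: AtomicUsize, // total samples written (advanced only by the producer)
    tail: AtomicUsize, // total samples read (advanced only by the consumer)
}

impl SpscRing {
    /// Capacity must be a power of two so index wrap-around stays consistent.
    pub fn new(capacity: usize) -> Self {
        assert!(capacity.is_power_of_two());
        SpscRing {
            slots: (0..capacity).map(|_| AtomicU32::new(0)).collect(),
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
        }
    }

    /// Producer side: returns false (sample not stored) when the ring is full.
    pub fn push(&self, sample: f32) -> bool {
        let head = self.head.load(Ordering::Relaxed);
        let tail = self.tail.load(Ordering::Acquire);
        if head.wrapping_sub(tail) == self.slots.len() {
            return false;
        }
        self.slots[head % self.slots.len()].store(sample.to_bits(), Ordering::Relaxed);
        self.head.store(head.wrapping_add(1), Ordering::Release); // publish the slot
        true
    }

    /// Consumer side: returns None when the ring is empty.
    pub fn pop(&self) -> Option<f32> {
        let tail = self.tail.load(Ordering::Relaxed);
        let head = self.head.load(Ordering::Acquire);
        if head == tail {
            return None;
        }
        let bits = self.slots[tail % self.slots.len()].load(Ordering::Relaxed);
        self.tail.store(tail.wrapping_add(1), Ordering::Release); // free the slot
        Some(f32::from_bits(bits))
    }
}
```

Neither side ever takes a lock: the producer only writes `head`, the consumer only writes `tail`, and a full or empty ring is reported to the caller instead of blocking.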
PCM Audio Format
TCP Streamer transmits audio in raw PCM format: the consumer thread converts the internal F32 samples to 16-bit I16 PCM before sending.
Why F32 Internally?
As of version 1.8.0, TCP Streamer uses native F32 processing throughout the pipeline.
Benefits of F32 Architecture
- Avoids clipping during processing: F32 provides headroom beyond ±1.0 for intermediate calculations
- Better precision: No quantization errors during processing
- Linux compatibility: PipeWire and PulseAudio prefer F32 natively
- Future-proof: Enables potential DSP features without format conversions
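The conversions between the capture formats, the internal F32 representation, and the I16 wire format can be sketched as follows. The scaling constants and the clamp-then-round strategy are common conventions assumed here for illustration; the exact arithmetic TCP Streamer uses is not stated in this document, and the function names are hypothetical.

```rust
/// Signed 16-bit PCM to F32 in [-1.0, 1.0).
pub fn i16_to_f32(s: i16) -> f32 {
    s as f32 / 32768.0
}

/// Unsigned 16-bit PCM (midpoint 32768) to F32 in [-1.0, 1.0).
pub fn u16_to_f32(s: u16) -> f32 {
    (s as f32 - 32768.0) / 32768.0
}

/// F32 back to I16 for transmission. Values outside [-1.0, 1.0] are
/// clamped at this final step only, so intermediate headroom is preserved.
pub fn f32_to_i16(s: f32) -> i16 {
    (s.clamp(-1.0, 1.0) * 32767.0).round() as i16
}
```

With this scheme an intermediate value such as 1.5 survives processing untouched and is only limited to full scale when it is converted for the wire.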
Platform-Specific Format Detection
The system automatically detects the best input format (audio.rs:1023-1042).
Priority Order:
- F32 (preferred - native on PipeWire/Linux)
- I16 (standard - WASAPI/CoreAudio)
- U16 (fallback)
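The priority order above amounts to picking the first preferred format that the device supports. The sketch below uses a hypothetical `InputFormat` enum standing in for whatever format type the audio backend reports; it is not the code at audio.rs:1023-1042.

```rust
/// Hypothetical stand-in for the sample formats a device can report.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InputFormat {
    F32,
    I16,
    U16,
}

/// Returns the most preferred format the device supports, or None if the
/// device offers none of the three.
pub fn pick_input_format(supported: &[InputFormat]) -> Option<InputFormat> {
    // Preference order taken from the list above: F32, then I16, then U16.
    const PRIORITY: [InputFormat; 3] = [InputFormat::F32, InputFormat::I16, InputFormat::U16];
    PRIORITY.iter().copied().find(|fmt| supported.contains(fmt))
}
```

For a device that reports only I16 and U16, this selects I16; the order in which the device lists its formats does not matter.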
Token Bucket Pacing Algorithm
To prevent network micro-bursts and keep packet timing consistent, TCP Streamer uses a Strict Clock Strategy for transmission pacing.
How It Works
Calculate Tick Duration
Each audio chunk has a precise duration based on sample rate: tick duration (µs) = (chunk size in samples × 1,000,000) / sample rate (Hz).
Example: At 48 kHz with 1024-sample chunks:
- Tick duration = (1024 × 1,000,000) / 48,000 = 21,333 microseconds (21.3ms)
Why Precision Matters: Without strict pacing, the network thread would send bursts of packets, causing jitter spikes and potential buffer underruns on the receiver side. The token bucket algorithm ensures consistent packet timing with sub-millisecond accuracy.
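One way to read the strict-clock idea is a pacer whose deadlines advance by exactly one tick from the previous deadline rather than from the current time. The sketch below makes that assumption; `tick_duration` and `Pacer` are hypothetical names, and the real implementation may differ (for example in how it sleeps or handles a missed deadline).

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Duration of one chunk, truncated to whole microseconds.
pub fn tick_duration(chunk_samples: u64, sample_rate_hz: u64) -> Duration {
    Duration::from_micros(chunk_samples * 1_000_000 / sample_rate_hz)
}

/// Hypothetical pacer: each deadline is the previous deadline plus one tick,
/// so a late wake-up does not accumulate as long-term drift.
pub struct Pacer {
    tick: Duration,
    next_deadline: Instant,
}

impl Pacer {
    pub fn new(tick: Duration) -> Self {
        Pacer { tick, next_deadline: Instant::now() + tick }
    }

    /// Blocks until the current deadline, then schedules the next one.
    /// If the deadline has already passed, it returns immediately.
    pub fn wait(&mut self) {
        let now = Instant::now();
        if self.next_deadline > now {
            thread::sleep(self.next_deadline - now);
        }
        self.next_deadline += self.tick;
    }
}
```

A sender would call `wait()` once before each chunk write. Note that `thread::sleep` only guarantees a minimum sleep time, so the achievable accuracy depends on the operating system's timer resolution.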
Prefill Gate (Startup Buffering)
To eliminate “cold start” stuttering, TCP Streamer implements a prefill gate that waits for the buffer to fill before transmission begins (v1.8.1).
Configuration
| Parameter | Value | Purpose |
|---|---|---|
| Prefill Duration | 1000ms | Ensures stable startup across all platforms |
| Check Interval | 10ms | Polling frequency for buffer level |
| Platforms | Windows, Linux, macOS | Works equally on all operating systems |
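The gate in the table above can be sketched as a polling loop that checks the buffer level at a fixed interval. The function names, the closure used to read the buffer level, and the timeout are assumptions added for illustration; the document does not say whether the real gate has a timeout.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Number of samples that correspond to `prefill_ms` of audio.
pub fn prefill_target_samples(sample_rate_hz: usize, channels: usize, prefill_ms: usize) -> usize {
    sample_rate_hz * channels * prefill_ms / 1000
}

/// Polls `buffered()` every `check_interval` until it reports at least
/// `target` samples. Returns false if `timeout` elapses first
/// (the timeout is an assumption of this sketch).
pub fn wait_for_prefill<F: FnMut() -> usize>(
    mut buffered: F,
    target: usize,
    check_interval: Duration,
    timeout: Duration,
) -> bool {
    let start = Instant::now();
    loop {
        if buffered() >= target {
            return true;
        }
        if start.elapsed() >= timeout {
            return false;
        }
        thread::sleep(check_interval);
    }
}
```

With the documented values, a 48 kHz stereo stream would wait for `prefill_target_samples(48_000, 2, 1000)` samples, polling every 10 ms.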
Connection Management
TCP Streamer uses advanced socket configuration for reliable streaming.
Socket Options
Graceful Shutdown
To prevent zombie connections, TCP Streamer explicitly sends TCP FIN packets.
Auto-Reconnect Logic
When disconnected, TCP Streamer uses exponential backoff with jitter.
Configuration Options
Sample Rates
44.1 kHz
- Standard CD quality
- 1,411.2 kbps bitrate (stereo, 16-bit)
- Ideal for music playback
48 kHz
- Professional audio standard
- 1,536 kbps bitrate (stereo, 16-bit)
- Recommended for modern systems
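The bitrates quoted above follow directly from sample rate × channels × bit depth. A small helper (hypothetical names) makes the arithmetic explicit:

```rust
/// Raw PCM bitrate in bits per second.
pub fn pcm_bitrate_bps(sample_rate_hz: u64, channels: u64, bits_per_sample: u64) -> u64 {
    sample_rate_hz * channels * bits_per_sample
}

/// The same rate expressed in bytes per second, which is what the TCP
/// connection has to sustain (payload only, excluding protocol overhead).
pub fn pcm_bytes_per_second(sample_rate_hz: u64, channels: u64, bits_per_sample: u64) -> u64 {
    pcm_bitrate_bps(sample_rate_hz, channels, bits_per_sample) / 8
}
```

For stereo 16-bit audio this gives 1,411,200 bps at 44.1 kHz and 1,536,000 bps at 48 kHz, matching the figures listed.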
Buffer Sizes (Hardware Latency)
| Buffer Size | Latency (48kHz) | Use Case |
|---|---|---|
| 256 samples | 5.3ms | Ultra-low latency (may cause dropouts) |
| 512 samples | 10.7ms | Low latency (balanced) |
| 1024 samples | 21.3ms | Standard (recommended) |
| 2048 samples | 42.7ms | High stability (WiFi/loaded systems) |
WASAPI Loopback: In Windows loopback mode, TCP Streamer uses BufferSize::Default (audio.rs:1078-1082) because fixed buffer sizes often fail with loopback devices. The system relies on the larger ring buffer for stability instead.
Ring Buffer Duration
The ring buffer absorbs network jitter and provides latency tolerance:
- Ethernet (wired): 2000ms
- WiFi (standard): 4000-5000ms
- WiFi (poor signal): 8000ms+
- WASAPI Loopback: 8000ms (accounts for Windows timing variability)
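To get a feel for what these durations cost in memory, the estimate below assumes the ring buffer stores one 4-byte F32 value per sample per channel and allocates exactly the requested duration. Both points are assumptions of this sketch; a real buffer may round its capacity up.

```rust
/// Approximate ring buffer size in bytes for F32 samples.
pub fn ring_buffer_bytes(duration_ms: u64, sample_rate_hz: u64, channels: u64) -> u64 {
    let samples = duration_ms * sample_rate_hz * channels / 1000;
    samples * std::mem::size_of::<f32>() as u64
}
```

Under these assumptions, 2000 ms of 48 kHz stereo audio needs about 768,000 bytes, and the 8000 ms loopback setting needs about 3,072,000 bytes.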
Performance Characteristics
CPU Usage
Typical CPU Load:
- Producer thread: <1% CPU (audio capture is hardware-accelerated)
- Consumer thread: 1-3% CPU (depends on chunk size and sample rate)
- Total: ~2-4% CPU on modern systems
Memory Usage
Ring buffer memory consumption grows with the configured ring buffer duration, sample rate, and channel count.
Latency Breakdown
Troubleshooting
Audio dropouts or stuttering
Causes:
- Ring buffer too small for network conditions
- CPU throttling (especially on laptops)
- Network congestion
Solutions:
- Increase ring buffer duration to 8000ms or higher
- Enable adaptive buffering (see Adaptive Buffering)
- Use Ethernet instead of WiFi if possible
- Enable high-priority thread option in Advanced settings
Connection keeps dropping
Causes:
- Server not responding
- Firewall blocking connection
- Write timeout triggered (5s)
Solutions:
- Enable auto-reconnect in Automation settings
- Check server logs for errors
- Verify firewall rules allow TCP on the specified port
- Test with nc -l <port> to verify TCP connectivity
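When auto-reconnect is enabled, retries are spaced using exponential backoff with jitter. The sketch below shows one common variant ("equal jitter": half of the delay is fixed, half is randomized). The function name, the doubling factor, and the example base and cap values are assumptions; the document does not state the actual parameters. The random value is passed in by the caller so the function stays deterministic.

```rust
use std::time::Duration;

/// Delay before reconnect attempt number `attempt` (0-based).
/// `jitter` is a caller-supplied random value in [0.0, 1.0].
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration, jitter: f64) -> Duration {
    // Double the base delay per attempt, saturating instead of overflowing.
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    let capped = base.saturating_mul(factor).min(max);
    // Treat non-finite input as "no jitter" so mul_f64 cannot panic.
    let j = if jitter.is_finite() { jitter.clamp(0.0, 1.0) } else { 0.0 };
    let half = capped / 2;
    half + half.mul_f64(j)
}
```

With a base of 1 s and a cap of 30 s (example values only), the fourth attempt waits between 4 s and 8 s, and later attempts settle between 15 s and 30 s. The jitter keeps many clients from reconnecting to a recovering server at the same instant.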
High CPU usage
Causes:
- Small chunk size (more frequent processing)
- Low hardware buffer size (more audio callbacks)
Solutions:
- Increase chunk size to 2048 or 4096 in Advanced tab
- Increase hardware buffer size to 1024 or 2048
- Disable high-priority thread if not needed
Related Features
Silence Detection
Learn how RMS-based silence detection saves bandwidth
Adaptive Buffering
Automatic buffer sizing based on network jitter
Profiles
Save configurations for different streaming scenarios