Overview
TCP Streamer includes an intelligent silence detection system that monitors audio levels in real-time and can automatically stop transmission during quiet periods. This feature significantly reduces network bandwidth usage and prevents streaming of background noise or silent audio.
Current Status (v1.8+): While the silence detection infrastructure exists in the codebase, the feature was simplified in v1.8.0 during the F32 architecture refactor. The RMS calculation and visual volume meter remain active for monitoring, but automatic transmission gating based on silence is not currently active in the latest release.
This documentation describes the design and implementation of the silence detection system for future reference and potential re-enablement.
How It Works
RMS (Root Mean Square) Calculation
The system calculates the RMS (Root Mean Square) value of incoming audio to determine volume level. RMS is a standard measure of audio power that accounts for both positive and negative waveform values.
RMS Scale and Typical Values
RMS values range from 0.0 (complete silence) to 1.0 (maximum volume):
| RMS Value | Volume Level | Description |
|---|---|---|
| 0.0 - 0.01 | Silent | Background noise floor |
| 0.01 - 0.05 | Very Quiet | Ambient room noise |
| 0.05 - 0.15 | Quiet | Soft speech or music |
| 0.15 - 0.40 | Normal | Typical audio playback |
| 0.40 - 0.70 | Loud | High volume music |
| 0.70 - 1.0 | Very Loud | Near maximum level |
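As a minimal sketch of the calculation behind the table above, assuming samples are already normalized `f32` values between -1.0 and 1.0: the names `rms` and `volume_level` are illustrative and are not taken from TCP Streamer's source. The table's ranges share their boundary values, so this sketch treats each lower bound as inclusive.

```rust
/// Root-mean-square level of a buffer of normalized samples (-1.0..=1.0).
/// Returns 0.0 for an empty buffer. Hypothetical helper, not the project's API.
fn rms(samples: &[f32]) -> f32 {
    if samples.is_empty() {
        return 0.0;
    }
    // Squaring makes positive and negative samples contribute equally.
    let sum_of_squares: f32 = samples.iter().map(|s| s * s).sum();
    // Mean of the squares, then the square root.
    (sum_of_squares / samples.len() as f32).sqrt()
}

/// Map an RMS value onto the labels used in the table above.
/// Lower bounds are inclusive (an assumption; the table's ranges overlap).
fn volume_level(level: f32) -> &'static str {
    match level {
        l if l < 0.01 => "Silent",
        l if l < 0.05 => "Very Quiet",
        l if l < 0.15 => "Quiet",
        l if l < 0.40 => "Normal",
        l if l < 0.70 => "Loud",
        _ => "Very Loud",
    }
}
```

For example, a buffer alternating between 0.5 and -0.5 has an RMS of 0.5, which this mapping labels "Loud".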
Default Threshold: The README mentions an RMS threshold of 50.0, but this appears to be scaled differently than the normalized 0.0-1.0 range. In practice, a threshold of 0.02-0.05 works well for detecting true silence while avoiding false positives from background noise.
Visual Volume Indicator
TCP Streamer provides a real-time volume meter in the UI to help users configure the silence threshold correctly.
How to Use the Volume Meter
Start Audio Playback
Play audio through the selected input device (or system audio if using loopback mode).
Note the Noise Floor
With no audio playing, observe the baseline noise level (white line on the meter).
Bandwidth Savings
Bitrate Calculation
TCP Streamer’s raw PCM audio uses significant bandwidth.
Savings Scenarios
Music Playback
- Typical Silence: 5-10% (gaps between tracks)
- Bandwidth Saved: 70-150 kbps
- Use Case: Whole-home audio systems with occasional pauses
Voice/Podcast
- Typical Silence: 30-50% (pauses between speech)
- Bandwidth Saved: 460-750 kbps
- Use Case: Streaming radio, podcasts, or voice content
Gaming/Desktop
- Typical Silence: 60-80% (no audio most of the time)
- Bandwidth Saved: 920-1,230 kbps
- Use Case: Desktop audio capture when working/browsing
Overnight Streaming
- Typical Silence: 95-100% (no user activity)
- Bandwidth Saved: 1,400-1,536 kbps
- Use Case: Prevents wasting bandwidth when system is idle
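The savings figures above are consistent with a raw stream of about 1,536 kbps, which is what 48 kHz, 16-bit, stereo PCM works out to. Those format parameters are an assumption inferred from the 1,536 kbps upper bound, not values confirmed here. A rough sketch of the arithmetic, with hypothetical function names:

```rust
/// Raw PCM bitrate in kbps (1 kbps = 1,000 bits per second).
/// Hypothetical helper for illustration only.
fn pcm_bitrate_kbps(sample_rate_hz: u32, bits_per_sample: u32, channels: u32) -> f64 {
    (sample_rate_hz as f64) * (bits_per_sample as f64) * (channels as f64) / 1000.0
}

/// Bandwidth saved if `silence_fraction` (0.0..=1.0) of the stream is not sent.
/// Values outside that range are clamped.
fn saved_kbps(bitrate_kbps: f64, silence_fraction: f64) -> f64 {
    bitrate_kbps * silence_fraction.clamp(0.0, 1.0)
}
```

Under that assumed format, `pcm_bitrate_kbps(48_000, 16, 2)` gives 1,536 kbps, and 50% silence saves 768 kbps. The ranges quoted in the scenarios are rounded versions of these products.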
Smart Deep Sleep Mode
As of v1.6.0, TCP Streamer includes Smart Deep Sleep functionality that auto-disconnects after prolonged silence to prevent “zombie” connections.
Configuration
Behavior
Why This Matters: Without deep sleep, a streaming client could maintain an idle TCP connection for hours or days, consuming server resources and potentially causing issues with connection-limited servers (e.g., Snapcast source slot limits).
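One way to picture the deep sleep decision is a timer that starts when silence begins and resets as soon as audio returns. The sketch below is an assumption about how such logic could look, not the actual implementation. `DeepSleepTimer` and its methods are hypothetical names, and the timeout would correspond to a setting such as `silence_timeout_seconds`.

```rust
use std::time::{Duration, Instant};

/// Hypothetical tracker that reports when silence has lasted at least `timeout`.
struct DeepSleepTimer {
    timeout: Duration,
    /// When the current run of silence began, if we are in one.
    silence_started: Option<Instant>,
}

impl DeepSleepTimer {
    fn new(timeout: Duration) -> Self {
        Self { timeout, silence_started: None }
    }

    /// Feed one observation. Returns true when the connection should be dropped.
    /// `now` is passed in so the logic can be tested without sleeping.
    fn update(&mut self, is_silent: bool, now: Instant) -> bool {
        if !is_silent {
            // Any audio resets the timer.
            self.silence_started = None;
            return false;
        }
        // Remember the first silent observation, then compare elapsed time.
        let started = *self.silence_started.get_or_insert(now);
        now.saturating_duration_since(started) >= self.timeout
    }
}
```

With a 300-second timeout, `update(true, ...)` keeps returning false until 300 seconds of uninterrupted silence have passed, and a single non-silent buffer restarts the count.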
Implementation Details
Audio Processing Flow
Here’s how silence detection integrates with the audio pipeline:
RMS Smoothing (EWMA)
To prevent rapid on/off toggling due to momentary spikes or dips, RMS values are often smoothed using an Exponentially Weighted Moving Average (EWMA):
- Lower alpha (e.g., 0.1): More smoothing, slower response to changes
- Higher alpha (e.g., 0.5): Less smoothing, faster response to changes
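The alpha trade-off above follows from the standard EWMA update, `smoothed = alpha * current + (1 - alpha) * previous`. A minimal sketch, assuming the first reading seeds the average; `RmsSmoother` is a hypothetical name, not a type from the codebase:

```rust
/// Exponentially weighted moving average over successive RMS readings.
/// Hypothetical type for illustration.
struct RmsSmoother {
    alpha: f32,
    value: Option<f32>,
}

impl RmsSmoother {
    /// `alpha` is clamped to 0.0..=1.0. Lower values smooth more.
    fn new(alpha: f32) -> Self {
        Self { alpha: alpha.clamp(0.0, 1.0), value: None }
    }

    /// Fold one RMS reading into the average and return the smoothed value.
    fn update(&mut self, rms: f32) -> f32 {
        let next = match self.value {
            Some(prev) => self.alpha * rms + (1.0 - self.alpha) * prev,
            None => rms, // the first reading seeds the average
        };
        self.value = Some(next);
        next
    }
}
```

Starting from silence, a sudden reading of 1.0 moves the smoothed value to 0.1 with alpha 0.1 but to 0.5 with alpha 0.5, which is the slower versus faster response described in the list.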
Hysteresis (Debouncing)
To avoid rapid switching between silent and non-silent states, the system implements hysteresis.
Configuration Options
UI Settings (main.js)
These settings are available in the frontend configuration:
| Setting | Type | Default | Description |
|---|---|---|---|
| silence_threshold | Float | 0.02-0.05 | RMS level below which audio is silent |
| silence_timeout_seconds | Integer | 300 | Seconds of silence before auto-disconnect |
Backend Constants (audio.rs)
These values are hardcoded in the Rust backend:
Version Note: The exact silence detection implementation may vary between versions. Always refer to the specific version’s source code for authoritative configuration values.
Performance Impact
CPU Overhead
RMS calculation adds minimal CPU overhead:
- Cost per sample: 1 multiplication + 1 addition
- Cost per buffer: 1 division + 1 square root
- Typical overhead: <0.1% CPU
Memory Usage
Silence detection state requires minimal memory:
Use Cases
Multi-Room Audio (Snapcast)
Scenario: Multiple rooms streaming music from a central server.
Benefit: When music stops, clients automatically disconnect, freeing up Snapcast source slots. When music resumes, clients auto-reconnect.
Configuration:
- Threshold: 0.03 (just above room noise)
- Timeout: 120 seconds (2 minutes)
- Auto-reconnect: Enabled
Desktop Audio Capture
Scenario: Streaming computer audio to a remote speaker, but computer is often idle.
Benefit: No bandwidth wasted during work/browsing sessions without audio. Connection reestablishes automatically when audio plays.
Configuration:
- Threshold: 0.05 (to avoid triggering on notification sounds)
- Timeout: 300 seconds (5 minutes)
- Auto-reconnect: Enabled
Podcast/Voice Streaming
Scenario: Streaming podcast or radio content with frequent pauses.
Benefit: 30-50% bandwidth reduction by skipping silent gaps between speech.
Configuration:
- Threshold: 0.02 (to capture quiet speech)
- Timeout: Disabled (don’t disconnect, just skip silence)
- Auto-reconnect: N/A
Always-On Monitoring
Scenario: Security camera audio or baby monitor.
Benefit: Conserve bandwidth and storage by only streaming when sound is detected.
Configuration:
- Threshold: 0.10 (only trigger on significant sounds)
- Timeout: 30 seconds (quick reconnect)
- Auto-reconnect: Enabled
Troubleshooting
Silence detection too sensitive (cutting off quiet audio)
Symptoms:
- Quiet music passages are skipped
- Soft speech is not transmitted
- Frequent disconnects during normal listening
Solutions:
- Lower the silence threshold (e.g., from 0.05 to 0.02)
- Increase the timeout before disconnect (e.g., from 120s to 300s)
- Check input device volume/gain settings
Silence detection not working (always streaming)
Symptoms:
- Bitrate never drops to zero
- Connection never auto-disconnects
- No “Silence detected” log messages
Solutions:
- Verify silence detection is enabled in settings
- Check that the threshold is set above 0.0 (a value of 0.0 disables detection)
- Ensure input device doesn’t have constant background noise
- Try increasing threshold (e.g., from 0.02 to 0.05)
Rapid connect/disconnect cycles
Symptoms:
- Connection toggles on and off every few seconds
- Log shows alternating “Silence detected” and “Audio resumed”
Cause: Hysteresis is not implemented or is insufficient.
Solutions:
- Increase silence timeout to add delay before disconnect
- Apply EWMA smoothing to RMS values (alpha=0.2-0.3)
- Use separate enter/exit thresholds with a gap
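The last suggestion, separate enter and exit thresholds, can be sketched as a small two-state gate. This is an illustrative assumption rather than the project's code: `SilenceGate` is a hypothetical type, and the example thresholds of 0.02 and 0.05 are borrowed from the range suggested earlier for the silence threshold.

```rust
/// Two-threshold gate: becomes silent when the level drops below `enter_silence`
/// and becomes active again only when it rises above `exit_silence`.
/// Hypothetical type for illustration.
struct SilenceGate {
    enter_silence: f32,
    exit_silence: f32,
    silent: bool,
}

impl SilenceGate {
    fn new(enter_silence: f32, exit_silence: f32) -> Self {
        assert!(
            enter_silence <= exit_silence,
            "enter threshold must not exceed exit threshold"
        );
        Self { enter_silence, exit_silence, silent: false }
    }

    /// Feed one (ideally smoothed) RMS reading. Returns true while silent.
    fn update(&mut self, rms: f32) -> bool {
        if self.silent {
            // Leave silence only once the level clears the higher threshold.
            if rms > self.exit_silence {
                self.silent = false;
            }
        } else if rms < self.enter_silence {
            // Enter silence only once the level falls below the lower threshold.
            self.silent = true;
        }
        self.silent
    }
}
```

Readings that fall in the gap between the two thresholds never change the state, so a level hovering around a single threshold no longer causes the connection to toggle.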
Future Enhancements
Adaptive Thresholds
Automatically adjust silence threshold based on detected noise floor over time.
Frequency-Based Detection
Use FFT to detect specific frequency ranges (e.g., ignore HVAC hum but detect speech).
VAD (Voice Activity Detection)
More sophisticated algorithm to distinguish speech from noise and music.
Configurable Smoothing
Allow users to adjust EWMA alpha and hysteresis gap in UI.
Related Features
Audio Streaming
Learn about the core PCM streaming pipeline
Adaptive Buffering
Dynamic buffer sizing for network stability
Profiles
Save silence detection settings per profile