Audio Issues

Microphone Issues

Klaus not detecting my microphone

List available microphones:

python3 -c "import sounddevice as sd; print(sd.query_devices())"

Example output:

> 0 Built-in Microphone, Core Audio (2 in, 0 out)
< 1 Built-in Output, Core Audio (0 in, 2 out)
  2 USB Audio Device, Core Audio (1 in, 0 out)

The > marks the system default input device and < marks the default output device (a * marks a device that is both).

To select a specific microphone, edit ~/.klaus/config.toml:

mic_index = 0   # Built-in Microphone
# mic_index = 2  # USB Audio Device
# mic_index = -1 # System default (default)

Reference: config.py:38, device_catalog.py
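The full device list mixes inputs and outputs, which makes it easy to pick an index that cannot record. The sketch below keeps only input-capable devices and prints a ready-to-paste mic_index line for each. It assumes device entries shaped like the dicts returned by sd.query_devices() (keys name and max_input_channels); input_devices is a hypothetical helper, not part of Klaus.

```python
def input_devices(devices):
    """Return (index, name, input_channels) for every device that can record."""
    return [
        (index, dev["name"], dev["max_input_channels"])
        for index, dev in enumerate(devices)
        if dev["max_input_channels"] > 0
    ]


if __name__ == "__main__":
    try:
        import sounddevice as sd
    except (ImportError, OSError):
        # sounddevice raises OSError when the PortAudio library is missing.
        print("sounddevice (or PortAudio) is not available")
    else:
        for index, name, channels in input_devices(sd.query_devices()):
            print(f"mic_index = {index}  # {name} ({channels} in)")
```

The position in the list is the same number shown in the first column of the query_devices() output, so it can be used directly as mic_index.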

Microphone level too low

Klaus uses the system default microphone (unless mic_index is set) and applies no automatic gain control, so a quiet input stays quiet.

Solutions:

Increase system mic level:
- macOS: System Settings > Sound > Input > Input volume
- Windows: Settings > System > Sound > Input device > Device properties > Volume

Adjust VAD RMS threshold if your mic is inherently quiet:

# Lower threshold to accept quieter speech (default: -45.0)
vad_min_rms_dbfs = -50.0

Use push-to-talk mode instead of voice activation to bypass VAD filtering

Reference: audio.py:404-412
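To judge whether a recording would clear vad_min_rms_dbfs, it helps to see how an RMS level in dBFS is computed. This is a minimal sketch assuming 16-bit PCM samples and the usual full-scale reference of 32768; Klaus's exact calculation in audio.py may differ, and rms_dbfs and passes_rms_gate are hypothetical names.

```python
import math

INT16_FULL_SCALE = 32768.0  # assumed reference level for 16-bit PCM


def rms_dbfs(samples, floor_db=-120.0):
    """RMS level of 16-bit PCM samples in dB relative to full scale."""
    if not samples:
        return floor_db
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return floor_db  # pure silence: avoid log10(0)
    rms = math.sqrt(mean_square)
    return max(floor_db, 20.0 * math.log10(rms / INT16_FULL_SCALE))


def passes_rms_gate(samples, vad_min_rms_dbfs=-45.0):
    """True when the utterance is at least as loud as the threshold."""
    return rms_dbfs(samples) >= vad_min_rms_dbfs
```

Under these assumptions the default of -45.0 dBFS corresponds to an RMS amplitude of roughly 184 out of 32768, so a mic that peaks only slightly above its noise floor will be rejected until the input volume is raised or the threshold is lowered.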

How do I switch microphones mid-session?

Klaus supports live microphone switching via the Settings dialog:

Click the settings button (gear icon)
Go to the “Microphone” tab
Select a different mic from the dropdown
Changes apply immediately (VAD stream restarts with new device)

If the new mic fails to open, Klaus automatically rolls back to the previous mic.

Persisted to config: The new mic index is saved to ~/.klaus/config.toml.

Reference: main.py:856-867, settings_dialog.py, device_switch.py

Audio input status warnings in logs

If you see:

Audio input status: <some flag>

This indicates a sounddevice callback status flag:

Input overflow: Audio buffer overrun (frames dropped)
Input underflow: Not enough data available

Usually harmless, but if frequent:

Close other audio-intensive applications
Check CPU usage
Try a different USB port for USB microphones

Reference: audio.py:72-73, audio.py:284-286

Voice Activation (VAD) Issues

Voice Activation uses WebRTC VAD with multi-stage filtering

Klaus applies several quality gates before sending audio to STT:

WebRTC VAD (voiced vs unvoiced frames)
Minimum duration check
Voiced frame ratio check
RMS loudness check (dBFS)
Contiguous voiced run check

If any gate fails, the audio is discarded and Klaus logs the reason.
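The gate chain can be sketched as a function that returns the first failing reason. This is an illustration only: the defaults are the documented config defaults and the reason strings are the ones listed under "Understanding VAD discard reasons", but the order of the checks, the 30 ms frame length, and the names VadThresholds and discard_reason are assumptions rather than Klaus's actual code.

```python
from dataclasses import dataclass


@dataclass
class VadThresholds:
    min_duration: float = 0.5        # vad_min_duration (seconds)
    min_voiced_ratio: float = 0.28   # vad_min_voiced_ratio
    min_voiced_frames: int = 8       # vad_min_voiced_frames
    min_rms_dbfs: float = -45.0      # vad_min_rms_dbfs
    min_voiced_run_frames: int = 6   # vad_min_voiced_run_frames


def longest_voiced_run(voiced):
    """Length of the longest run of consecutive voiced frames."""
    best = run = 0
    for flag in voiced:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best


def discard_reason(voiced, rms_dbfs, frame_ms=30, thresholds=None):
    """Return the first failing gate's reason, or None if the audio passes.

    voiced is a list of per-frame booleans from the VAD.
    """
    t = thresholds or VadThresholds()
    duration = len(voiced) * frame_ms / 1000.0
    voiced_count = sum(1 for flag in voiced if flag)
    ratio = voiced_count / len(voiced) if voiced else 0.0

    if duration < t.min_duration:
        return "vad_short_duration"
    if voiced_count < t.min_voiced_frames:
        return "vad_low_voiced_frames"
    if ratio < t.min_voiced_ratio:
        return "vad_low_voiced_ratio"
    if longest_voiced_run(voiced) < t.min_voiced_run_frames:
        return "quality_short_voiced_run"
    if rms_dbfs < t.min_rms_dbfs:
        return "quality_low_rms"
    return None
```

Because each gate is independent, loosening one threshold only helps if that gate is the one rejecting the audio, which is why the logged reason is the place to start.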

VAD not detecting my speech

Symptoms:

Klaus stays in “Idle” state when you speak
Log shows: VAD: discarding low-voice utterance

Solutions:

Lower VAD sensitivity (0-3, lower = less aggressive filtering):
```
vad_sensitivity = 2  # Default is 3
```

Reduce minimum thresholds:

vad_min_duration = 0.3         # Default: 0.5s
vad_min_voiced_ratio = 0.20    # Default: 0.28
vad_min_voiced_frames = 5      # Default: 8
vad_min_rms_dbfs = -50.0       # Default: -45.0
vad_min_voiced_run_frames = 4  # Default: 6

Check microphone level (see “Microphone level too low” above)
Use push-to-talk mode as a workaround

Reference: audio.py:105-131, config.py:62-80

VAD triggering on background noise

Symptoms:

Klaus enters “Listening” state when no one is speaking
Fan noise, keyboard typing, or room noise triggers VAD

Solutions:

Set VAD sensitivity to the maximum (3 is also the default, so this only helps if you lowered it earlier):

vad_sensitivity = 3  # Maximum filtering

Increase minimum thresholds:

vad_min_duration = 0.7         # Require longer speech
vad_min_voiced_ratio = 0.35    # Require more voiced content
vad_min_voiced_frames = 10     # More voiced frames
vad_min_rms_dbfs = -40.0       # Higher loudness threshold
vad_min_voiced_run_frames = 8  # Longer contiguous voiced run

Reduce room noise:
- Use a directional microphone
- Enable noise cancellation in system settings (if available)
- Move away from fans, HVAC vents, mechanical keyboards

Reference: audio.py:363-415

Understanding VAD discard reasons

Klaus logs structured discard events. Check logs for:

klaus 2>&1 | grep "VAD: discarding"

Discard reasons:

| Reason | Meaning | Fix |
| --- | --- | --- |
| `vad_short_duration` | Utterance < `vad_min_duration` | Lower `vad_min_duration` |
| `vad_low_voiced_frames` | Fewer than `vad_min_voiced_frames` | Lower `vad_min_voiced_frames` |
| `vad_low_voiced_ratio` | Voiced ratio < `vad_min_voiced_ratio` | Lower `vad_min_voiced_ratio` |
| `quality_short_voiced_run` | Max contiguous run < `vad_min_voiced_run_frames` | Lower `vad_min_voiced_run_frames` |
| `quality_low_rms` | Loudness < `vad_min_rms_dbfs` | Lower `vad_min_rms_dbfs` or increase mic level |

Log format:

STT guard event=vad_discard reason=quality_low_rms session=<id> vad_discarded=5 quality_gate_discarded=3

Reference: audio.py:363-415, main.py:418-432
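When many discards scroll past, a tally by reason shows which threshold to adjust first. The sketch below parses the key=value fields of the log format shown above. It assumes every guard line contains the literal text "STT guard" followed by space-separated key=value pairs; parse_guard_event and count_discard_reasons are hypothetical helpers.

```python
import re
from collections import Counter

_PAIR = re.compile(r"(\w+)=(\S+)")


def parse_guard_event(line):
    """Return the key=value fields of an 'STT guard' log line, or None."""
    if "STT guard" not in line:
        return None
    return dict(_PAIR.findall(line))


def count_discard_reasons(lines):
    """Tally vad_discard events by their reason field."""
    counts = Counter()
    for line in lines:
        event = parse_guard_event(line)
        if event and event.get("event") == "vad_discard":
            counts[event.get("reason", "unknown")] += 1
    return counts
```

Passing sys.stdin to count_discard_reasons while piping Klaus's output into the script gives a running picture of which gate rejects most utterances. All parsed values are strings, so convert counters such as vad_discarded with int() before comparing them.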

VAD silence timeout too short/long

Klaus finalizes speech after a period of silence.

Default: 1.5 seconds

To adjust:

# Shorter timeout (faster finalization, may cut off pauses)
vad_silence_timeout = 1.0

# Longer timeout (wait for longer pauses, slower response)
vad_silence_timeout = 2.0

Reference: config.py:64, audio.py:126
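The effect of the timeout is easier to see in frames. The sketch below converts vad_silence_timeout into a count of consecutive silent frames and finds where an utterance would be finalized. It assumes 30 ms frames (WebRTC VAD accepts 10, 20, or 30 ms) and is not Klaus's actual implementation; both function names are hypothetical.

```python
import math


def silence_frames_needed(vad_silence_timeout=1.5, frame_ms=30):
    """Consecutive silent frames required before speech is finalized."""
    return max(1, math.ceil(vad_silence_timeout * 1000 / frame_ms))


def utterance_end_index(voiced, vad_silence_timeout=1.5, frame_ms=30):
    """Index of the frame where the utterance is finalized, or None.

    voiced is a list of per-frame booleans; silence before the first
    voiced frame does not count toward the timeout.
    """
    needed = silence_frames_needed(vad_silence_timeout, frame_ms)
    started = False
    silent_run = 0
    for index, flag in enumerate(voiced):
        if flag:
            started = True
            silent_run = 0
        elif started:
            silent_run += 1
            if silent_run >= needed:
                return index
    return None
```

With these assumptions the default 1.5 s means 50 silent frames. A 1.2 s thinking pause (40 frames) survives the default but ends the utterance at a 1.0 s timeout (34 frames), which is the trade-off described in the comments above.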

Push-to-Talk Issues

Push-to-talk not recording

Symptoms:

Press and hold F2 (or configured hotkey), but Klaus stays in “Idle”
Release key, no transcription appears

Causes:

Global hotkey not working (see macOS Permissions)
Wrong input mode - Klaus is in voice-activation mode
Microphone not accessible

Solutions:

Verify input mode:
- Check status widget shows “Push-to-Talk”
- Toggle mode with toggle_key or mode button
Use in-app hotkey if global hotkey fails:
- Focus Klaus window
- Press and hold configured key

Check logs:

klaus 2>&1 | grep "Recording started"

Expected output:

Recording started (mic: Built-in Microphone, 16000 Hz)

Reference: main.py:734-759, audio.py:36-99

Recording stopped (no audio captured)

If you see:

Recording stopped (no audio captured)

You pressed and released the PTT key too quickly (no audio frames buffered).

Solution: Hold the key longer (at least 0.3-0.5 seconds).

Reference: audio.py:77-95

Text-to-Speech (TTS) Issues

No audio output when Klaus speaks

Symptoms:

Klaus transitions to “Speaking” state
No audio plays
Logs show TTS generation but no playback errors

Causes:

System output muted - Check system volume
Wrong output device selected - Klaus uses system default
Audio device in use - Close other apps using audio output

Solutions:

Check system audio:
- Ensure volume not muted
- Play a test sound from system settings
Verify default output device:
```
python3 -c "import sounddevice as sd; print(sd.query_devices())"
```
Look for the device marked with < (default output), or * if the same device is also the default input.
Test TTS independently:
- Click replay button on a previous response in chat
- If replay works, the issue is with live synthesis

Reference: tts.py, audio.py:452-486

TTS playback crackling or stuttering (macOS)

Klaus uses a persistent sd.OutputStream to avoid CoreAudio latency.

The VAD mic stream is suspended during TTS playback to free the audio device.

If crackling persists:

Check sample rate mismatch:
- TTS output is 24000 Hz
- Ensure output device supports 24 kHz or has good resampling
Close competing audio apps:
- DAWs, music production software, voice chat apps
Increase TTS latency (code change required):
- Klaus uses latency='high' on macOS by default
- Further tuning requires editing tts.py

Reference: tts.py, audio.py:245-276

TTS voice or speed settings not applying

Voice and speed are set in ~/.klaus/config.toml:

# TTS voice (default: cedar)
# Options: coral, nova, alloy, ash, ballad, echo, fable, onyx, sage, shimmer, verse, cedar, marin
voice = "nova"

# TTS playback speed 0.25-4.0 (default: 1.0)
tts_speed = 1.2

After editing the file, restart Klaus. Alternatively, change the values in the Settings dialog, which reloads them without a restart.

Reference: config.py:50-55, config.py:192-196
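A typo in the voice name or an out-of-range speed is a common reason settings appear not to apply. The sketch below checks the two values against the options and range documented above. It assumes both keys sit at the top level of config.toml; how Klaus itself handles invalid values is not shown here, and tts_config_problems and load_config are hypothetical helpers.

```python
VOICES = {
    "coral", "nova", "alloy", "ash", "ballad", "echo", "fable",
    "onyx", "sage", "shimmer", "verse", "cedar", "marin",
}


def tts_config_problems(config):
    """Return a list of human-readable problems with the TTS settings."""
    problems = []
    voice = config.get("voice", "cedar")
    if voice not in VOICES:
        problems.append(f"unknown voice {voice!r} (names are lowercase)")
    speed = config.get("tts_speed", 1.0)
    if isinstance(speed, bool) or not isinstance(speed, (int, float)):
        problems.append(f"tts_speed must be a number, got {speed!r}")
    elif not 0.25 <= speed <= 4.0:
        problems.append(f"tts_speed {speed} is outside 0.25-4.0")
    return problems


def load_config(path):
    """Parse a TOML file into a dict (tomllib needs Python 3.11+)."""
    import tomllib

    with open(path, "rb") as handle:
        return tomllib.load(handle)
```

Calling tts_config_problems(load_config(path)) on the expanded path of ~/.klaus/config.toml returns an empty list when both values are acceptable.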

TTS cuts off mid-sentence

Causes:

User interrupted - Pressed stop button or started speaking (in voice-activation mode)
Network error - OpenAI TTS API timeout or connection lost
Audio device disconnected

Check logs:

klaus 2>&1 | grep -i "tts\|speaking"

Look for:

“Stop requested via UI” - User clicked stop
API error messages - Network or OpenAI issues

Reference: main.py:833-837, tts.py

Speech-to-Text (STT) Issues

Transcription is inaccurate

Klaus uses Moonshine Voice (local STT) with a configurable model size.

Default model: medium (245M params, ~300ms latency)

medium is the largest and most accurate option, so to improve accuracy make sure a smaller model is not configured. Edit ~/.klaus/config.toml:

stt_moonshine_model = "medium"  # Options: tiny, small, medium

Trade-offs:

tiny: Fastest, least accurate
small: Balanced
medium: Slower, most accurate

The model is downloaded on first use to ~/.cache/moonshine/.

Reference: config.py:83-86, stt.py, README.md:9

STT language not recognized

Moonshine supports multiple languages.

Default: en (English)

To change language:

stt_moonshine_language = "es"  # Spanish
# See Moonshine documentation for supported language codes

Reference: config.py:85

STT processing is slow

Moonshine runs locally on CPU.

Expected latency: ~300ms for the medium model

If slower:

Use smaller model:

stt_moonshine_model = "small"  # or "tiny"

Close CPU-intensive apps to free resources
Check CPU usage during transcription:
- Activity Monitor (macOS)
- Task Manager (Windows)

Reference: README.md:9, stt.py

Audio Device Switching

Live microphone switch failed

If Klaus shows an error when switching mics:

Failed to switch microphone: <error>

Klaus automatically rolls back to the previous mic.

Causes:

New mic in use by another app
New mic requires driver not installed
Invalid device index

Solution:

Close apps using the mic
Reconnect the mic
Select a different mic from the list

Reference: device_switch.py, main.py:856-867
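The rollback behaviour described above follows a simple try-then-restore pattern. This is a sketch of that pattern, not the code in device_switch.py: open_stream stands in for whatever opens an input stream on a device index, and switch_microphone is a hypothetical name.

```python
def switch_microphone(open_stream, current_index, new_index):
    """Try to open new_index; on failure reopen current_index.

    Returns (stream, active_index, error_message). error_message is None
    when the switch succeeded.
    """
    try:
        return open_stream(new_index), new_index, None
    except Exception as exc:  # any failure to open triggers the rollback
        stream = open_stream(current_index)
        return stream, current_index, f"Failed to switch microphone: {exc}"
```

With sounddevice, open_stream could be something like lambda i: sd.InputStream(device=i, samplerate=16000, channels=1). Note that if the previous mic has also become unavailable, the rollback itself raises, which matches the advice to reconnect the mic or pick a different one.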

VAD stream suspend/resume errors

Klaus suspends the VAD mic stream before TTS playback and resumes it afterwards.

If you see errors:

Failed to reopen mic stream: <error>

The mic became unavailable during TTS playback.

Causes:

USB mic disconnected
System switched default device
Mic claimed by another app

Recovery:

Klaus logs the error but continues
Reconnect the mic
Toggle input mode (switches to PTT and back) to reinitialize VAD

Reference: audio.py:245-276

Advanced Audio Debugging

sounddevice backend issues

Klaus uses sounddevice, which wraps PortAudio.

On macOS: Core Audio backend
On Windows: WASAPI backend
On Linux: ALSA backend

To debug backend issues:

python3 -c "import sounddevice as sd; print(sd.query_hostapis())"

Common issues:

Sample rate mismatch: Ensure device supports 16 kHz (mic) and 24 kHz (speaker)
Exclusive mode: Close apps with exclusive audio access
ASIO drivers (Windows): Not supported by Klaus; use WASAPI-compatible devices

Reference: audio.py
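The sample rate check can be done without opening a stream, using sounddevice's check_input_settings and check_output_settings, which raise an exception when the requested settings are not supported. The wrapper below is a sketch; supports_rate is a hypothetical helper, and the 16 kHz and 24 kHz figures are the rates quoted above.

```python
def supports_rate(check, samplerate, device=None):
    """Return True if check(device=..., samplerate=...) does not raise.

    check is sd.check_input_settings or sd.check_output_settings;
    device=None means the system default device.
    """
    try:
        check(device=device, samplerate=samplerate)
    except Exception:  # sounddevice raises when the settings are rejected
        return False
    return True


if __name__ == "__main__":
    try:
        import sounddevice as sd
    except (ImportError, OSError):
        # sounddevice raises OSError when the PortAudio library is missing.
        print("sounddevice (or PortAudio) is not available")
    else:
        print("mic 16 kHz:", supports_rate(sd.check_input_settings, 16000))
        print("speaker 24 kHz:", supports_rate(sd.check_output_settings, 24000))
```

A False result for the default device suggests selecting another device or changing the device's configured format in the operating system's sound settings.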

Enable debug-level audio logs

Edit ~/.klaus/config.toml:

log_level = "DEBUG"

Restart Klaus. Audio logs will include:

VAD frame-by-frame voiced/unvoiced decisions
RMS levels
Discard reasons
Stream lifecycle events

Warning: DEBUG mode is very verbose.

Reference: config.py:115-117
