PushToTalkRecorder
Records audio while a key is held, producing a WAV buffer on release.
Constructor
Audio sample rate in Hz
Methods
start_recording()
Begin capturing audio from the microphone.
stop_recording()
Stop recording and return WAV bytes, or None if nothing was captured.
WAV-encoded audio data, or None if no audio was captured
Properties
Whether the recorder is currently capturing audio
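The start/stop contract above can be sketched with a stand-in that runs without a microphone. SketchRecorder, its feed() hook, the recording attribute name, the mono layout, and the 16000 Hz default are all assumptions made for illustration; the real class captures from the audio device, and its actual parameter and property names are not shown on this page.

```python
import io
import wave


class SketchRecorder:
    """Hypothetical stand-in mirroring the documented start/stop contract."""

    def __init__(self, sample_rate=16000):  # default rate is an assumption
        self.sample_rate = sample_rate
        self.recording = False  # attribute name is an assumption
        self._chunks = []

    def start_recording(self):
        # Drop any previous take and begin accepting audio.
        self._chunks = []
        self.recording = True

    def feed(self, pcm_bytes):
        # Hypothetical hook standing in for the microphone callback.
        if self.recording:
            self._chunks.append(pcm_bytes)

    def stop_recording(self):
        self.recording = False
        pcm = b"".join(self._chunks)
        if not pcm:
            return None  # nothing was captured
        buf = io.BytesIO()
        with wave.open(buf, "wb") as wav:
            wav.setnchannels(1)   # mono is an assumption
            wav.setsampwidth(2)   # 2 bytes per int16 sample
            wav.setframerate(self.sample_rate)
            wav.writeframes(pcm)
        return buf.getvalue()


recorder = SketchRecorder()
recorder.start_recording()             # key pressed
recorder.feed(b"\x00\x00" * 1600)      # 100 ms of silence at 16 kHz
wav_bytes = recorder.stop_recording()  # key released
```

Callers should check for None before passing the result on, since a key tap that captures no audio yields no buffer.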
VoiceActivatedRecorder
Continuously listens and uses webrtcvad to detect speech boundaries.
Constructor
Callback invoked when speech begins
Callback invoked with WAV bytes when speech ends
Callback invoked with reason when utterance is discarded
Audio sample rate in Hz
WebRTC VAD sensitivity (0-3, higher = more aggressive filtering)
Seconds of silence before finalizing speech
Minimum ratio of voiced frames to total frames
Minimum number of voiced frames required
Minimum utterance duration in seconds
Minimum RMS loudness in dBFS
Minimum length of a contiguous voiced run, in 30 ms frames
Input device index, or None for system default
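To make the discard thresholds above concrete, here is a minimal sketch of how such checks could be combined once an utterance has ended. It is not the library's implementation: the function names, keyword names, default values, reason strings, and the order of the checks are all assumptions chosen for illustration. Input is a list of per-frame voiced flags (one per 30 ms frame) plus the int16 samples.

```python
import math


def rms_dbfs(samples):
    """RMS loudness of int16 samples in dBFS (0 dBFS = full scale)."""
    if not samples:
        return float("-inf")
    mean_sq = sum(s * s for s in samples) / len(samples)
    if mean_sq == 0:
        return float("-inf")
    return 20 * math.log10(math.sqrt(mean_sq) / 32768)


def longest_voiced_run(flags):
    """Length of the longest run of consecutive voiced frames."""
    best = run = 0
    for voiced in flags:
        run = run + 1 if voiced else 0
        best = max(best, run)
    return best


def discard_reason(flags, samples, *, frame_ms=30,
                   min_voiced_ratio=0.3, min_voiced_frames=5,
                   min_duration_s=0.3, min_rms_dbfs=-45.0,
                   min_voiced_run=3):
    """Return a reason string if the utterance fails a check, else None.

    All names, defaults, and reason strings here are illustrative.
    """
    total = len(flags)
    voiced = sum(1 for f in flags if f)
    if total == 0 or total * frame_ms / 1000 < min_duration_s:
        return "too_short"
    if voiced < min_voiced_frames:
        return "too_few_voiced_frames"
    if voiced / total < min_voiced_ratio:
        return "low_voiced_ratio"
    if longest_voiced_run(flags) < min_voiced_run:
        return "no_contiguous_speech"
    if rms_dbfs(samples) < min_rms_dbfs:
        return "too_quiet"
    return None
```

A non-None result corresponds to the case where the discard callback would be invoked with a reason; None corresponds to the utterance being delivered as WAV bytes.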
Methods
start()
Open the mic stream and begin VAD detection.
stop()
Stop the stream and discard any in-progress speech.
pause()
Pause detection (e.g. while Klaus is speaking).
resume()
Resume detection after pause.
suspend_stream()
Stop the physical mic stream. Safe to call from non-callback threads. Use this (instead of pause) when you need to free the CoreAudio device, e.g. before TTS playback. Call resume_stream() to reopen.
resume_stream()
Reopen the physical mic stream after suspend_stream().
Properties
Whether the VAD is actively listening
Whether detection is paused
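One way to pair suspend_stream() with resume_stream() around TTS playback is a small context manager, so the mic is reopened even if playback raises. This is a sketch, not part of the library: mic_released and FakeRecorder are hypothetical names, and the fake only logs calls so the example runs without audio hardware.

```python
from contextlib import contextmanager


@contextmanager
def mic_released(recorder):
    """Free the input device for the duration of the with-block."""
    recorder.suspend_stream()
    try:
        yield
    finally:
        recorder.resume_stream()  # reopen even if playback raises


class FakeRecorder:
    """Stand-in that only logs calls; not part of the library."""

    def __init__(self):
        self.calls = []

    def suspend_stream(self):
        self.calls.append("suspend_stream")

    def resume_stream(self):
        self.calls.append("resume_stream")


rec = FakeRecorder()
with mic_released(rec):
    # In real use this is where playback would happen,
    # e.g. a call to play_wav_bytes() on an AudioPlayer.
    rec.calls.append("play")
```

With a real recorder, the same wrapper would be used in place of pause() and resume() whenever the output path needs the device to itself.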
AudioPlayer
Plays raw PCM or WAV audio through the default output device.
Constructor
Audio sample rate in Hz
Methods
play_wav_bytes()
Play a complete WAV buffer. Blocks until playback finishes or stop() is called.
WAV-encoded audio data
stop()
Stop playback immediately.
Utility Functions
to_wav_bytes()
Convert int16 NumPy audio to WAV bytes.
Audio samples as int16 NumPy array
Sample rate in Hz
WAV-encoded audio data
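For reference, an equivalent conversion can be sketched with only the standard library. This is not the library's to_wav_bytes(): pcm16_to_wav_bytes is a hypothetical name, it takes raw little-endian int16 bytes rather than a NumPy array so the example has no third-party dependency, and the mono layout and 16000 Hz default are assumptions. With a NumPy array, the bytes could be obtained via samples.astype("<i2").tobytes().

```python
import io
import struct
import wave


def pcm16_to_wav_bytes(pcm, sample_rate=16000):
    """Wrap raw little-endian int16 PCM bytes in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono is an assumption
        wav.setsampwidth(2)            # 2 bytes per int16 sample
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


# Four int16 samples packed as little-endian bytes.
pcm = struct.pack("<4h", 0, 1000, -1000, 0)
wav_data = pcm16_to_wav_bytes(pcm)

# Read the buffer back to confirm the header and payload.
with wave.open(io.BytesIO(wav_data), "rb") as reader:
    rate = reader.getframerate()
    frames = reader.readframes(reader.getnframes())
```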