Overview
Klaus uses OpenAI’s gpt-4o-mini-tts model for high-quality neural text-to-speech. Responses are batched into sentences and streamed for low-latency playback.
TextToSpeech Class
The TextToSpeech class converts text to speech with sentence-level streaming and persistent audio output.
Constructor
Optional runtime settings. If None, reads from config.get_runtime_settings().
Methods
speak(text: str, on_sentence_start: callable = None) -> None
Synthesize and play text. Batches into sentences for low-latency playback.
text: The full text to speak.
on_sentence_start: Optional callback (sentence_index: int, sentence_text: str) fired before each sentence begins playback.
speak_streaming(sentence_queue: queue.Queue[str | None]) -> None
Play sentences as they arrive from a queue. Reads sentences from sentence_queue, synthesizes each via the API, and plays them sequentially. None in the queue signals completion.
sentence_queue: Queue of sentences to synthesize and play. Push None to signal end of stream.
synthesize_to_wav(text: str) -> bytes
Synthesize text to a single WAV buffer without playing it.
text: Text to synthesize.
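Because synthesize_to_wav returns raw bytes, the result can be inspected or saved without touching the audio device. The following is a minimal sketch, assuming the buffer carries a standard RIFF/WAVE header with an accurate frame count. The helpers wav_info and make_silence_wav are hypothetical names introduced for illustration, and the 24 kHz mono 16-bit format of the test data is an assumption, not a documented property of Klaus.

```python
import io
import wave


def wav_info(wav_bytes: bytes) -> dict:
    """Read basic format details from an in-memory WAV buffer."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


def make_silence_wav(seconds: float, rate: int = 24000) -> bytes:
    """Build a silent mono 16-bit WAV; a stand-in for synthesize_to_wav output."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


# With Klaus, the bytes would come from synthesize_to_wav(text) instead.
info = wav_info(make_silence_wav(0.5))
```

The same bytes could also be written to disk as a .wav file with an ordinary binary file write.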
stop() -> None
Immediately stop playback and close the audio stream.
Example:
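A minimal sketch of cutting playback short with stop(), assuming that speak() blocks until playback ends and that stop() causes it to return; neither behavior is confirmed by this page. The helper speak_with_timeout and the FakeTTS class are hypothetical and exist only so the sketch runs without an API key or audio hardware.

```python
import threading


def speak_with_timeout(tts, text: str, timeout_s: float) -> bool:
    """Speak text on a worker thread; call tts.stop() if it runs too long.

    Returns True if playback finished on its own, False if it was cut off.
    Assumes speak() blocks until playback ends and stop() makes it return.
    """
    worker = threading.Thread(target=tts.speak, args=(text,), daemon=True)
    worker.start()
    worker.join(timeout_s)
    if worker.is_alive():
        tts.stop()
        worker.join(timeout=1.0)  # give the worker a moment to unwind
        return False
    return True


class FakeTTS:
    """Hypothetical stand-in exposing only speak() and stop()."""

    def __init__(self) -> None:
        self._stopped = threading.Event()
        self.stop_calls = 0

    def speak(self, text: str) -> None:
        # Pretend playback lasts up to 5 s unless stop() is called first.
        self._stopped.wait(timeout=5.0)

    def stop(self) -> None:
        self.stop_calls += 1
        self._stopped.set()


fake = FakeTTS()
finished = speak_with_timeout(fake, "A long response...", timeout_s=0.05)
```

With the real class, a TextToSpeech instance would take the place of FakeTTS; the helper relies only on the speak and stop methods documented above.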
reload_client(settings: config.RuntimeSettings | None = None) -> None
Recreate the OpenAI client to pick up API key changes from config.reload().
settings: New runtime settings. If None, reads from config.get_runtime_settings().
Configuration
TTS settings are configured in ~/.klaus/config.toml.
Available Voices
- alloy
- ash
- ballad
- coral
- cedar (default)
- sage
- shimmer
- verse
Implementation Details
- Sentence batching: Responses are split on sentence boundaries (., !, ?) and synthesized in chunks of up to 4000 characters.
- Streaming playback: Audio begins playing as soon as the first sentence is synthesized.
- Persistent output stream: A single sounddevice.OutputStream is reused across chunks to avoid macOS CoreAudio crackling.
- High latency mode on macOS: Uses latency='high' on macOS for stable playback.
- Thread-safe: Synthesis and playback run in background threads.
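The batching and queue-driven streaming described above can be sketched as follows. This is an illustration under stated assumptions, not the code in klaus/tts.py: the regex split rule, the greedy packing strategy, and the names split_sentences, batch_sentences, and consume are all hypothetical. Only the ., !, ? boundaries, the 4000-character limit, and the None end-of-stream marker come from this page.

```python
import queue
import re
import threading

MAX_CHUNK_CHARS = 4000  # chunk limit stated in the list above


def split_sentences(text: str) -> list[str]:
    """Naive split: break after '.', '!' or '?' when whitespace follows."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def batch_sentences(sentences: list[str], max_chars: int = MAX_CHUNK_CHARS) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept as one oversized chunk.
    """
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def consume(sentence_queue: queue.Queue, played: list[str]) -> None:
    """Drain the queue until None arrives; appending stands in for playback."""
    while True:
        item = sentence_queue.get()
        if item is None:
            break
        played.append(item)


sentence_queue: queue.Queue = queue.Queue()
played: list[str] = []
worker = threading.Thread(target=consume, args=(sentence_queue, played))
worker.start()

# The consumer can handle the first sentence while later ones are still queued.
for sentence in split_sentences("Hello there. How are you? Fine!"):
    sentence_queue.put(sentence)
sentence_queue.put(None)  # end-of-stream marker
worker.join()

small_chunks = batch_sentences(["aa.", "bb.", "cc."], max_chars=7)
```

Keeping the consumer on its own thread is what lets the first sentence start while later ones are still being produced; a production splitter would also need to handle abbreviations and decimal numbers, which this regex does not.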
Constants
Source Reference
See klaus/tts.py for the full implementation.