Overview
Streaming TTS generates speech incrementally, delivering audio chunks as they are produced. This enables lower time-to-first-byte and immediate playback while synthesis continues.
Quick Start
Built-in PCM Player
Use the native PCM player for minimal latency:
Engine Creation
Create a streaming TTS engine (same as batch TTS):
Generate Speech Stream
Generation Options
Same as batch TTS:
Stream Handlers
Chunk Event
End Event
Error Event
Stream Controller
The controller manages the streaming generation:
Cancel Generation
Unsubscribe Listeners
PCM Player API
Start Player
Write Chunks
Stop Player
Complete Example
Recording Streamed Audio
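One way to record a stream is to collect each chunk as it arrives and join the pieces once generation ends. The following is a minimal sketch, not the library's own API: it assumes each chunk exposes its samples as a mono `Float32Array` in the range [-1, 1], and the helper names `concatChunks` and `floatTo16BitPcm` are hypothetical.

```typescript
// Assumption: every streamed chunk carries mono float PCM samples in [-1, 1].
// Both helpers are illustrative and not part of any library API.

// Join the collected chunks into one contiguous sample buffer.
function concatChunks(chunks: Float32Array[]): Float32Array {
  const total = chunks.reduce((n, c) => n + c.length, 0);
  const out = new Float32Array(total);
  let offset = 0;
  for (const c of chunks) {
    out.set(c, offset);
    offset += c.length;
  }
  return out;
}

// Convert float samples to 16-bit signed PCM, clamping out-of-range values.
function floatTo16BitPcm(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    out[i] = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff);
  }
  return out;
}
```

In a chunk handler you would push each chunk's samples into an array, then call `concatChunks` from the end handler before writing the result to a file.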
Accumulate chunks to save after generation:
Voice Cloning (Pocket TTS)
Stream with voice cloning for Kotlin-engine models:
For ZipVoice voice cloning, use batch generateSpeech instead.
Multiple Concurrent Requests
Only one stream per engine is allowed at a time. For concurrent requests:
Option A: Sequential
Wait for onEnd before starting the next stream:
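The sequential approach can be sketched as a small queue that starts the next request only after the current one reports completion or failure. This is an illustration under stated assumptions: `StreamHandlers`, `StartStream`, and `SequentialStreamQueue` are hypothetical names, and the real handler signatures may differ.

```typescript
// Hypothetical handler shape; the library's actual handlers may take arguments.
interface StreamHandlers {
  onEnd: () => void;
  onError: (message: string) => void;
}

// Hypothetical function that starts one stream on the shared engine.
type StartStream = (text: string, handlers: StreamHandlers) => void;

class SequentialStreamQueue {
  private pending: string[] = [];
  private busy = false;
  private readonly start: StartStream;

  constructor(start: StartStream) {
    this.start = start;
  }

  // Queue a text; it starts immediately if no stream is active.
  enqueue(text: string): void {
    this.pending.push(text);
    this.pump();
  }

  private pump(): void {
    if (this.busy) return;
    const text = this.pending.shift();
    if (text === undefined) return;
    this.busy = true;
    let finished = false;
    // Release the engine exactly once, whether the stream ended or failed.
    const done = () => {
      if (finished) return;
      finished = true;
      this.busy = false;
      this.pump();
    };
    this.start(text, { onEnd: done, onError: done });
  }
}
```

Treating an error like an end keeps the queue moving; a real application would likely also surface the error to the caller.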
Option B: Multiple Engines
Create separate engines:
Performance Tips
Threading
Chunk Size
Control via maxNumSentences:
Memory
- Avoid accumulating all chunks in JS for very long texts
- Use the native player to minimize JS memory usage
- Save incrementally to files if needed
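Incremental saving can be sketched as a bounded buffer that hands samples to a writer whenever a threshold is reached, so memory use stays roughly constant regardless of text length. This is a sketch under assumptions: chunks are taken to be `Float32Array` samples, and `ChunkSpooler` and its `sink` callback are hypothetical, standing in for whatever file-append routine the app uses.

```typescript
// Illustrative only: buffers samples up to a limit, then passes them to a sink.
class ChunkSpooler {
  private buffered: Float32Array[] = [];
  private bufferedSamples = 0;
  private readonly maxSamples: number;
  private readonly sink: (block: Float32Array) => void;

  // sink is a hypothetical callback, e.g. one that appends a block to a file.
  constructor(maxSamples: number, sink: (block: Float32Array) => void) {
    this.maxSamples = maxSamples;
    this.sink = sink;
  }

  // Call from the chunk handler; flushes once the threshold is reached.
  push(chunk: Float32Array): void {
    this.buffered.push(chunk);
    this.bufferedSamples += chunk.length;
    if (this.bufferedSamples >= this.maxSamples) this.flush();
  }

  // Call from the end handler to write any remaining samples.
  flush(): void {
    if (this.bufferedSamples === 0) return;
    const block = new Float32Array(this.bufferedSamples);
    let offset = 0;
    for (const c of this.buffered) {
      block.set(c, offset);
      offset += c.length;
    }
    this.buffered = [];
    this.bufferedSamples = 0;
    this.sink(block);
  }
}
```

The threshold trades write frequency against peak memory; a value around one second of audio at the model's sample rate is a reasonable starting point to tune from.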
Error Handling
Cleanup
Always clean up resources:
Listeners are removed automatically after onEnd or onError. Call controller.unsubscribe() manually only if discarding the controller before completion.
Supported Models
All TTS model types support streaming:
- VITS (Piper)
- Matcha
- Kokoro
- Kitten
- ZipVoice (batch generateSpeech only for voice cloning)
Next Steps
- Batch TTS: Generate complete audio buffers
- Model Setup: Download and configure TTS models