Audio Types
GeneratedAudio
Generated audio data from TTS synthesis.
Audio samples as an array of float values in range [-1.0, 1.0].
This is raw PCM audio data that can be:
- Saved to a WAV file using saveAudioToFile()
- Played using an audio library
- Processed further for effects or analysis
Sample rate of the generated audio in Hz.
Common values:
- 16000 - 16 kHz (voice/telephony)
- 22050 - 22.05 kHz (common for TTS)
- 44100 - 44.1 kHz (CD quality)
- 48000 - 48 kHz (professional audio)
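To illustrate what "raw PCM" means in practice, the following sketch packs float samples and a sample rate into a 16-bit WAV byte array. It is not the library's saveAudioToFile() implementation. The `PcmAudio` interface, its field names `samples` and `sampleRate`, and the helper `encodeWavPcm16` are hypothetical, and mono output is assumed.

```typescript
// Hypothetical shape: the field names are assumptions, not taken from this page.
interface PcmAudio {
  samples: number[];  // float PCM in [-1.0, 1.0]
  sampleRate: number; // Hz
}

// Encode float PCM as a mono 16-bit little-endian WAV file (44-byte header + data).
function encodeWavPcm16(audio: PcmAudio): Uint8Array {
  const numSamples = audio.samples.length;
  const dataBytes = numSamples * 2; // 2 bytes per 16-bit sample
  const buffer = new ArrayBuffer(44 + dataBytes);
  const view = new DataView(buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataBytes, true);        // RIFF chunk size
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);                   // fmt chunk size
  view.setUint16(20, 1, true);                    // audio format: 1 = PCM
  view.setUint16(22, 1, true);                    // channels: mono (assumed)
  view.setUint32(24, audio.sampleRate, true);     // sample rate
  view.setUint32(28, audio.sampleRate * 2, true); // byte rate = rate * block align
  view.setUint16(32, 2, true);                    // block align: 1 channel * 2 bytes
  view.setUint16(34, 16, true);                   // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataBytes, true);            // data chunk size

  for (let i = 0; i < numSamples; i++) {
    // Clamp to [-1, 1], then scale to the signed 16-bit range.
    const clamped = Math.max(-1, Math.min(1, audio.samples[i]));
    const scaled = clamped < 0 ? clamped * 32768 : clamped * 32767;
    view.setInt16(44 + i * 2, Math.round(scaled), true);
  }
  return new Uint8Array(buffer);
}
```

The returned bytes can be written to disk with whatever file API the host platform provides. In an application, the library's own saveAudioToFile() is the simpler choice.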
GeneratedAudioWithTimestamps
Generated audio with word-level timestamp metadata for subtitles.
Array of subtitle/timestamp entries for each word or token. See TtsSubtitleItem below.
Whether timestamps are estimated rather than model-provided.
- true - Timestamps are estimated/calculated
- false - Timestamps come directly from the TTS model
TtsSubtitleItem
Subtitle/timestamp entry for a word or token in synthesized speech.
Text token for this time range (word, phoneme, or character).
Start time in seconds.
End time in seconds.
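Because each entry carries a text token with start and end times in seconds, a list of entries can be turned into a subtitle file fairly directly. The sketch below renders entries as SubRip (SRT) text. The `SubtitleEntry` interface and its field names `text`, `start`, and `end` are assumptions for illustration, as are the helpers `formatSrtTime` and `toSrt`.

```typescript
// Hypothetical shape: the field names are assumptions, not taken from this page.
interface SubtitleEntry {
  text: string;  // word, phoneme, or character
  start: number; // seconds
  end: number;   // seconds
}

// Left-pad a non-negative integer with zeros to at least `width` digits.
function zeroPad(value: number, width: number): string {
  let out = String(value);
  while (out.length < width) {
    out = "0" + out;
  }
  return out;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm
function formatSrtTime(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${zeroPad(h, 2)}:${zeroPad(m, 2)}:${zeroPad(s, 2)},${zeroPad(ms, 3)}`;
}

// Render one numbered SRT cue per entry, separated by blank lines.
function toSrt(entries: SubtitleEntry[]): string {
  return entries
    .map((entry, index) => {
      const range = `${formatSrtTime(entry.start)} --> ${formatSrtTime(entry.end)}`;
      return `${index + 1}\n${range}\n${entry.text}\n`;
    })
    .join("\n");
}
```

One cue per token is rarely comfortable to read, so a real subtitle exporter would probably group several tokens into each cue first.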
Model Info Types
TTSModelInfo
Information about TTS model capabilities.
Sample rate that the model generates audio at (in Hz).
Number of speakers/voices available in the model.
- 0 or 1 = Single-speaker model
- > 1 = Multi-speaker model
Engine Interfaces
TtsEngine
Batch TTS engine interface returned by createTTS().
See the createTTS() documentation for detailed method descriptions.
StreamingTtsEngine
Streaming TTS engine interface returned by createStreamingTTS().
See the createStreamingTTS() documentation for detailed method descriptions.
Streaming Types
TtsStreamHandlers
Event handlers for streaming TTS generation.
Called for each audio chunk as it’s generated.
Called when generation completes or is cancelled.
Called when an error occurs during generation.
TtsStreamChunk
Audio chunk received during streaming generation.
Instance ID of the TTS engine that generated this chunk.
Request ID for this specific generation (distinguishes concurrent streams).
Float PCM audio samples in range [-1.0, 1.0].
Sample rate of the audio in Hz.
Generation progress as a float between 0.0 and 1.0.
true if this is the last chunk in the stream.
TtsStreamEnd
Event emitted when streaming generation ends.
Instance ID of the TTS engine.
Request ID for this generation.
true if generation was cancelled, false if it completed normally.
TtsStreamError
Event emitted when an error occurs during streaming.
Instance ID of the TTS engine.
Request ID for this generation.
Error message describing what went wrong.
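A common use of the chunk data described above is to collect the streamed samples into one contiguous buffer once the final chunk has arrived. The sketch below shows one way to do that. The `StreamChunkLike` interface and its field names `samples` and `sampleRate` are assumptions for illustration, not the library's actual type, and `concatChunks` is a hypothetical helper.

```typescript
// Hypothetical shape: the field names are assumptions, not taken from this page.
interface StreamChunkLike {
  samples: number[];  // float PCM in [-1.0, 1.0]
  sampleRate: number; // Hz
}

interface JoinedAudio {
  samples: Float32Array;
  sampleRate: number;
}

// Concatenate chunks in arrival order into a single Float32Array.
function concatChunks(chunks: StreamChunkLike[]): JoinedAudio {
  const total = chunks.reduce((count, chunk) => count + chunk.samples.length, 0);
  const samples = new Float32Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    samples.set(chunk.samples, offset); // copy this chunk at the running offset
    offset += chunk.samples.length;
  }
  // Assumes every chunk of one request shares the same sample rate.
  const sampleRate = chunks.length > 0 ? chunks[0].sampleRate : 0;
  return { samples, sampleRate };
}
```

In a streaming handler, each received chunk would be pushed onto an array keyed by its request ID, and `concatChunks` called once the end event reports that generation was not cancelled.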
TtsStreamController
Controller returned by generateSpeechStream() for managing the stream.
Cancels the ongoing TTS generation.
Removes event listeners. Called automatically on end/error, or can be called manually.
Configuration Types
See the Configuration page for detailed documentation of:
- TTSInitializeOptions
- TtsModelOptions
- TtsVitsModelOptions
- TtsMatchaModelOptions
- TtsKokoroModelOptions
- TtsKittenModelOptions
- TtsPocketModelOptions
- TtsUpdateOptions
- TtsGenerationOptions
- TTSModelType
Type Imports
Import all TTS types
Import runtime constants
See Also
- createTTS() - Batch TTS engine
- createStreamingTTS() - Streaming TTS engine
- Configuration - Detailed configuration options