Basic Usage
Convert text to speech with the convert() method:
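As a minimal sketch, assuming the Python SDK: convert() is called on the client's text_to_speech namespace and yields the audio as byte chunks. The helper name synthesize, the model ID, and the placeholder key and voice ID are illustrative assumptions, not values taken from this page.

```python
from typing import Iterable


def synthesize(client, text: str, voice_id: str) -> bytes:
    """Call convert() and join the returned audio chunks into one bytes object."""
    chunks: Iterable[bytes] = client.text_to_speech.convert(
        text=text,
        voice_id=voice_id,
        model_id="eleven_multilingual_v2",  # assumed model ID
        output_format="mp3_44100_128",      # codec_sample_rate_bitrate
    )
    return b"".join(chunks)


def main() -> None:
    # Imported here so synthesize() can be exercised without the SDK installed.
    from elevenlabs.client import ElevenLabs

    client = ElevenLabs(api_key="YOUR_API_KEY")  # placeholder key
    audio = synthesize(client, "Hello from the SDK.", voice_id="YOUR_VOICE_ID")
    print(f"received {len(audio)} bytes")
```

Call main() with a real API key and voice ID to run it against the service; synthesize() only relies on the client exposing text_to_speech.convert().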
Output Formats
The SDK supports several audio output formats, each named as codec_sample_rate_bitrate:
- MP3 with 192kbps requires Creator tier or above
- PCM and WAV with 44.1kHz require Pro tier or above
- μ-law format is commonly used for Twilio audio inputs
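The naming scheme and the tier notes above can be sketched as a small parser. This is an illustration only: the helper names are hypothetical, the example strings (mp3_44100_192, pcm_44100, ulaw_8000) are assumed format names, and the tier check encodes just the two rules listed above rather than the service's full access policy.

```python
from typing import NamedTuple, Optional


class OutputFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]  # None when the string has no bitrate part


def parse_output_format(value: str) -> OutputFormat:
    """Split a codec_sample_rate[_bitrate] string into its parts."""
    parts = value.split("_")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected output format: {value!r}")
    bitrate = int(parts[2]) if len(parts) == 3 else None
    return OutputFormat(parts[0], int(parts[1]), bitrate)


def minimum_tier(value: str) -> Optional[str]:
    """Return the tier named in the notes above, or None if no rule applies."""
    fmt = parse_output_format(value)
    if fmt.codec == "mp3" and fmt.bitrate_kbps == 192:
        return "Creator"
    if fmt.codec in ("pcm", "wav") and fmt.sample_rate_hz == 44100:
        return "Pro"
    return None
```

Checking the tier before sending a request lets an application fall back to a lower bitrate or sample rate instead of failing at call time.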
Model Selection
Choose from different models optimized for various use cases:
Voice Settings
Customize voice characteristics with the VoiceSettings parameter:
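A hedged sketch of passing voice settings, assuming the Python SDK's VoiceSettings model with the fields stability, similarity_boost, style, and use_speaker_boost. The 0.0 to 1.0 range, the default values, and both helper names are assumptions for illustration.

```python
def clamp01(value: float) -> float:
    """Clamp a value into the assumed 0.0 to 1.0 range."""
    return max(0.0, min(1.0, value))


def voice_settings_kwargs(
    stability: float = 0.5,
    similarity_boost: float = 0.75,
    style: float = 0.0,
    use_speaker_boost: bool = True,
) -> dict:
    """Build keyword arguments for VoiceSettings with numeric fields clamped."""
    return {
        "stability": clamp01(stability),
        "similarity_boost": clamp01(similarity_boost),
        "style": clamp01(style),
        "use_speaker_boost": use_speaker_boost,
    }


def convert_with_settings(client, text: str, voice_id: str, settings: dict):
    """Pass the settings to convert() through the voice_settings parameter."""
    # Imported here so the helpers above work without the SDK installed.
    from elevenlabs import VoiceSettings

    return client.text_to_speech.convert(
        text=text,
        voice_id=voice_id,
        voice_settings=VoiceSettings(**settings),
    )
```

Lower stability generally gives more expressive but less consistent output, so it is worth trying a few values on a short test sentence before generating long passages.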
Advanced Options
Latency Optimization
Optimize streaming latency at some cost of quality:
- 0: Default mode (no latency optimizations)
- 1: Normal optimizations (~50% improvement)
- 2: Strong optimizations (~75% improvement)
- 3: Max latency optimizations
- 4: Max with text normalizer off (best latency, may mispronounce numbers/dates)
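The five levels above can be modeled as an enum so that invalid values are rejected before a request is made. This sketch assumes the setting is passed to convert() as a parameter named optimize_streaming_latency; the enum and helper names are hypothetical.

```python
from enum import IntEnum


class LatencyLevel(IntEnum):
    DEFAULT = 0            # no latency optimizations
    NORMAL = 1             # ~50% improvement
    STRONG = 2             # ~75% improvement
    MAX = 3                # max latency optimizations
    MAX_NO_NORMALIZER = 4  # max, with the text normalizer off


def latency_kwargs(level: int) -> dict:
    """Validate the level and return it as a keyword argument for convert()."""
    checked = LatencyLevel(level)  # raises ValueError outside 0..4
    return {"optimize_streaming_latency": int(checked)}


def may_mispronounce_numbers(level: int) -> bool:
    """Only level 4 turns the text normalizer off."""
    return LatencyLevel(level) is LatencyLevel.MAX_NO_NORMALIZER
```

The returned dictionary can be unpacked into a call, for example client.text_to_speech.convert(text=..., voice_id=..., **latency_kwargs(2)).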
Language Control
Enforce a specific language using ISO 639-1 codes:
Deterministic Generation
Use a seed for reproducible results:
Context for Continuity
Improve speech continuity when generating multiple clips:
Saving Audio Files
Save generated audio to a file:
Timestamps
Get character-level timing information for audio-text synchronization:
Async Usage
Use the async client for non-blocking operations:
Next Steps
Learn how to stream audio in real time for lower latency.