Overview
The text_to_dialogue client converts a list of text and voice ID pairs into natural-sounding dialogue with multiple voices. It is suited to conversations, audiobooks with multiple characters, and podcasts.
convert()
Generate dialogue from text and voice pairs.
Method Signature
Parameters
A list of dialogue inputs, each containing text and a voice ID. Maximum of 10 unique voice IDs.
Each DialogueInput has:
- text (str): The text to convert to speech
- voice_id (str): The voice ID to use for this text
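The limit of 10 unique voice IDs can be checked before a request is sent. The following is a minimal sketch that represents each input as a plain dict with text and voice_id keys; validate_dialogue_inputs and the voice IDs shown are hypothetical and not part of the SDK.

```python
from typing import Dict, List

MAX_UNIQUE_VOICES = 10  # limit stated in the parameter description


def validate_dialogue_inputs(inputs: List[Dict[str, str]]) -> int:
    """Check each input and return the number of unique voice IDs.

    Hypothetical helper for illustration; not provided by the SDK.
    """
    for index, item in enumerate(inputs):
        for key in ("text", "voice_id"):
            value = item.get(key)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"inputs[{index}] needs a non-empty '{key}' string")

    unique_voices = {item["voice_id"] for item in inputs}
    if len(unique_voices) > MAX_UNIQUE_VOICES:
        raise ValueError(
            f"{len(unique_voices)} unique voice IDs given, maximum is {MAX_UNIQUE_VOICES}"
        )
    return len(unique_voices)


# Placeholder voice IDs; real IDs come from your own voice library.
dialogue = [
    {"text": "Did you finish the report?", "voice_id": "voice_a"},
    {"text": "Almost, I need one more hour.", "voice_id": "voice_b"},
    {"text": "Great, send it when it is ready.", "voice_id": "voice_a"},
]
voice_count = validate_dialogue_inputs(dialogue)
```

Repeated use of the same voice counts once, so a long conversation between two speakers stays well under the limit.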
Output format of the generated audio, formatted as codec_sample_rate_bitrate.
Examples:
- mp3_22050_32: MP3 with 22.05kHz sample rate at 32kbps
- mp3_44100_192: MP3 with 44.1kHz sample rate at 192kbps (requires Creator tier+)
- pcm_44100: PCM with 44.1kHz sample rate (requires Pro tier+)
- wav_44100: WAV with 44.1kHz sample rate (requires Pro tier+)
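The codec_sample_rate_bitrate naming can be split into its parts when choosing a file extension or configuring playback. This is a sketch only: parse_output_format and OutputFormat are hypothetical names, and the code assumes the bitrate part is simply absent for formats such as pcm_44100.

```python
from typing import NamedTuple, Optional


class OutputFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]  # None when the format string has no bitrate part


def parse_output_format(value: str) -> OutputFormat:
    """Split a format string such as 'mp3_22050_32' into its components.

    Hypothetical helper for illustration; not provided by the SDK.
    """
    parts = value.split("_")
    if len(parts) not in (2, 3) or not parts[0]:
        raise ValueError(f"unexpected output format: {value!r}")
    codec = parts[0]
    sample_rate_hz = int(parts[1])  # raises ValueError if not numeric
    bitrate_kbps = int(parts[2]) if len(parts) == 3 else None
    return OutputFormat(codec, sample_rate_hz, bitrate_kbps)


mp3 = parse_output_format("mp3_22050_32")
pcm = parse_output_format("pcm_44100")
```

Here mp3.sample_rate_hz is 22050 and pcm.bitrate_kbps is None, which matches the examples listed above.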
The ID of the TTS model to use for generation.
Language code for the text. Auto-detected if not specified.
Model-specific settings for voice generation.
Pronunciation dictionaries to use for custom pronunciations.
Random seed for deterministic generation.
Whether to apply text normalization. Options: “auto”, “on”, “off”.
Request-specific configuration.
Returns
Iterator[bytes] - Streaming audio data containing the complete dialogue.
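Because the return value is an iterator of byte chunks rather than a single bytes object, it has to be consumed before the audio can be saved or played. The sketch below shows one way to do that; collect_audio is a hypothetical helper, and fake_audio_stream stands in for the real response so the example runs without an API key.

```python
from typing import Iterable, Iterator


def collect_audio(chunks: Iterable[bytes]) -> bytes:
    """Join streamed chunks into one bytes object, skipping empty chunks."""
    buffer = bytearray()
    for chunk in chunks:
        if chunk:
            buffer.extend(chunk)
    return bytes(buffer)


def fake_audio_stream() -> Iterator[bytes]:
    """Stand-in for the iterator returned by convert(); yields dummy bytes."""
    yield b"ID3"
    yield b""
    yield b"\x00\x01"


audio = collect_audio(fake_audio_stream())
```

The resulting bytes can then be written to disk with a file opened in binary mode, for example open("dialogue.mp3", "wb"). An iterator can only be consumed once, so keep the collected bytes if the audio is needed again.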
Example
convert_with_timestamps()
Generate dialogue with timestamp information for each segment.
Returns an AudioWithTimestampsAndVoiceSegmentsResponseModel containing:
- Audio data
- Timestamps for each voice segment
- Voice segment information
stream()
Stream dialogue generation in real time.
Async Usage
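The general pattern for consuming a stream asynchronously can be sketched as follows. This assumes the async client exposes the audio as an async iterator of bytes chunks, which is not confirmed here; fake_dialogue_stream is a stand-in for the real call and consume_stream is a hypothetical helper.

```python
import asyncio
from typing import AsyncIterator


async def fake_dialogue_stream() -> AsyncIterator[bytes]:
    """Stand-in for an async audio stream; yields dummy chunks."""
    for chunk in (b"chunk-1", b"chunk-2", b"chunk-3"):
        await asyncio.sleep(0)  # hand control back to the event loop between chunks
        yield chunk


async def consume_stream(stream: AsyncIterator[bytes]) -> int:
    """Handle each chunk as it arrives and return the total byte count."""
    total = 0
    async for chunk in stream:
        # A real application would play or forward the chunk here.
        total += len(chunk)
    return total


total_bytes = asyncio.run(consume_stream(fake_dialogue_stream()))
```

Processing chunks inside the async for loop lets playback or forwarding start before the full dialogue has been generated.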
Use Cases
Audiobooks
Create multi-character audiobooks with distinct voices for each character
Podcasts
Generate podcast episodes with multiple hosts or guests
Training Materials
Create engaging training content with conversational formats
Interactive Stories
Build interactive narratives with character dialogue
Related Methods
- Text to Speech - Single voice generation
- Voices - Browse available voices