Dialogue

Overview

The text_to_dialogue client converts a list of text and voice ID pairs into natural-sounding dialogue with multiple voices. It suits conversations, audiobooks with multiple characters, and podcasts.

convert()

Generate dialogue from text and voice pairs.

Method Signature

client.text_to_dialogue.convert(
    inputs: Sequence[DialogueInput],
    output_format: Optional[TextToDialogueConvertRequestOutputFormat] = None,
    model_id: Optional[str] = None,
    language_code: Optional[str] = None,
    settings: Optional[ModelSettingsResponseModel] = None,
    pronunciation_dictionary_locators: Optional[Sequence[PronunciationDictionaryVersionLocator]] = None,
    seed: Optional[int] = None,
    apply_text_normalization: Optional[str] = None,
    request_options: Optional[RequestOptions] = None
) -> Iterator[bytes]

Parameters

inputs

Sequence[DialogueInput]

required

A list of dialogue inputs, each containing text and a voice ID. Maximum of 10 unique voice IDs. Each DialogueInput has:

text (str): The text to convert to speech
voice_id (str): The voice ID to use for this text
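The limit of 10 unique voice IDs can be checked locally before a request is sent. The following is a minimal sketch, not part of the SDK: check_voice_limit and MAX_UNIQUE_VOICES are hypothetical names, and the inputs are plain dicts with placeholder voice IDs.

```python
from typing import Mapping, Sequence

MAX_UNIQUE_VOICES = 10  # limit stated in the inputs description


def check_voice_limit(inputs: Sequence[Mapping[str, str]]) -> int:
    """Return the number of unique voice IDs, raising if the limit is exceeded."""
    unique_voices = {item["voice_id"] for item in inputs}
    if len(unique_voices) > MAX_UNIQUE_VOICES:
        raise ValueError(
            f"{len(unique_voices)} unique voice IDs given; "
            f"the limit is {MAX_UNIQUE_VOICES}"
        )
    return len(unique_voices)


# Placeholder voice IDs, for illustration only
lines = [
    {"text": "Hi there!", "voice_id": "voice_1"},
    {"text": "Hello!", "voice_id": "voice_2"},
    {"text": "Good to see you.", "voice_id": "voice_1"},
]
unique_count = check_voice_limit(lines)
```

Repeating a voice across many lines does not count against the limit; only distinct IDs do.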

output_format

TextToDialogueConvertRequestOutputFormat

Output format of the generated audio, formatted as codec_sample_rate_bitrate. Examples:

mp3_22050_32 - MP3 with 22.05kHz sample rate at 32kbps
mp3_44100_192 - MP3 with 44.1kHz at 192kbps (requires Creator tier+)
pcm_44100 - PCM with 44.1kHz (requires Pro tier+)
wav_44100 - WAV with 44.1kHz (requires Pro tier+)
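To make the naming scheme concrete, the sketch below splits a format name into its parts. It is an illustration only: parse_output_format and AudioFormat are hypothetical helpers, not SDK functions, and the parser assumes every name follows the codec_sample_rate or codec_sample_rate_bitrate shape shown in the examples above.

```python
from typing import NamedTuple, Optional


class AudioFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]  # None when the name has no bitrate part


def parse_output_format(name: str) -> AudioFormat:
    """Split a name such as 'mp3_44100_192' into codec, sample rate, bitrate."""
    parts = name.split("_")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected output format: {name!r}")
    codec, sample_rate = parts[0], int(parts[1])
    bitrate = int(parts[2]) if len(parts) == 3 else None
    return AudioFormat(codec, sample_rate, bitrate)


mp3 = parse_output_format("mp3_44100_192")
pcm = parse_output_format("pcm_44100")
```

Here mp3 carries a bitrate of 192, while pcm has none because the PCM name has no third part.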

model_id

str

The ID of the TTS model to use for generation.

language_code

str

Language code for the text. Auto-detected if not specified.

settings

ModelSettingsResponseModel

Model-specific settings for voice generation.

pronunciation_dictionary_locators

Sequence[PronunciationDictionaryVersionLocator]

Pronunciation dictionaries to use for custom pronunciations.

seed

int

Random seed for deterministic generation.

apply_text_normalization

str

Whether to apply text normalization. Options: "auto", "on", "off".

request_options

RequestOptions

Request-specific configuration.

Returns

Iterator[bytes] - Streaming audio data containing the complete dialogue.
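Because the return value is an iterator of byte chunks rather than a single bytes object, it has to be consumed chunk by chunk. The sketch below shows one way to do that while counting the bytes written. save_audio is a hypothetical helper, and the chunk list is a stand-in for the iterator returned by convert(), not real audio.

```python
import os
import tempfile
from typing import Iterable


def save_audio(chunks: Iterable[bytes], path: str) -> int:
    """Write every chunk to path and return the total number of bytes written."""
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            total += len(chunk)
    return total


# Stand-in for the iterator returned by convert(); not real audio
fake_audio = iter([b"abc", b"defg", b"hi"])
out_path = os.path.join(tempfile.mkdtemp(), "dialogue.mp3")
bytes_written = save_audio(fake_audio, out_path)
```

An iterator can be consumed only once, so keep the returned byte count or the file path if the audio is needed again.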

Example

from elevenlabs import ElevenLabs
from elevenlabs.types import DialogueInput

client = ElevenLabs(api_key="YOUR_API_KEY")

# Create a dialogue between two characters
dialogue = [
    DialogueInput(
        text="Hello! How are you doing today?",
        voice_id="21m00Tcm4TlvDq8ikWAM"  # Rachel
    ),
    DialogueInput(
        text="I'm doing great, thanks for asking!",
        voice_id="AZnzlk1XvdvUeBnXmlld"  # Domi
    ),
    DialogueInput(
        text="That's wonderful to hear!",
        voice_id="21m00Tcm4TlvDq8ikWAM"  # Rachel
    )
]

# Generate the dialogue
audio = client.text_to_dialogue.convert(
    inputs=dialogue,
    model_id="eleven_multilingual_v2"
)

# Save to file
with open("dialogue.mp3", "wb") as f:
    for chunk in audio:
        f.write(chunk)

convert_with_timestamps()

Generate dialogue with timestamp information for each segment.

audio_with_timestamps = client.text_to_dialogue.convert_with_timestamps(
    inputs=dialogue,
    model_id="eleven_multilingual_v2"
)

Returns AudioWithTimestampsAndVoiceSegmentsResponseModel containing:

Audio data
Timestamps for each voice segment
Voice segment information

stream()

Stream dialogue generation in real-time.

audio_stream = client.text_to_dialogue.stream(
    inputs=dialogue,
    model_id="eleven_multilingual_v2"
)

# Process chunks as they arrive
for chunk in audio_stream:
    # Play or process chunk
    pass
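Streamed chunks can arrive in arbitrary sizes, while audio players and encoders often expect fixed-size blocks. As a hedged sketch of the "play or process chunk" step, the generator below regroups incoming chunks into blocks of a chosen size. rebuffer is a hypothetical helper, and the chunk list stands in for the stream returned by stream().

```python
from typing import Iterable, Iterator


def rebuffer(chunks: Iterable[bytes], block_size: int) -> Iterator[bytes]:
    """Regroup arbitrary-size chunks into blocks of block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        # Emit as many full blocks as the buffer currently holds
        while len(buffer) >= block_size:
            yield bytes(buffer[:block_size])
            del buffer[:block_size]
    if buffer:
        # The final block may be shorter than block_size
        yield bytes(buffer)


# Stand-in for the chunks yielded by stream(); not real audio
fake_stream = iter([b"ab", b"cdefg", b"h", b"ij"])
blocks = list(rebuffer(fake_stream, block_size=4))
```

Because rebuffer is itself a generator, blocks are produced as soon as enough data has arrived, without waiting for the stream to finish.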

Async Usage

from elevenlabs import AsyncElevenLabs
import asyncio

async def generate_dialogue():
    client = AsyncElevenLabs(api_key="YOUR_API_KEY")
    
    dialogue = [
        {"text": "Hi there!", "voice_id": "voice_1"},
        {"text": "Hello!", "voice_id": "voice_2"}
    ]
    
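    # In current SDK versions the async client's convert() is an async
    # generator, so it is iterated directly rather than awaited
    audio = client.text_to_dialogue.convert(inputs=dialogue)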
    
    chunks = []
    async for chunk in audio:
        chunks.append(chunk)
    
    return b"".join(chunks)

audio_data = asyncio.run(generate_dialogue())
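The collection loop in the async example can be factored into a small helper and exercised without network access. This is a sketch under assumptions: collect is a hypothetical helper, and fake_async_audio is a stand-in async generator, not the SDK's response.

```python
import asyncio
from typing import AsyncIterator


async def collect(chunks: AsyncIterator[bytes]) -> bytes:
    """Gather all chunks from an async iterator into one bytes object."""
    parts = []
    async for chunk in chunks:
        parts.append(chunk)
    return b"".join(parts)


async def fake_async_audio() -> AsyncIterator[bytes]:
    """Stand-in for an async audio stream; yields placeholder bytes."""
    for chunk in (b"abc", b"def"):
        await asyncio.sleep(0)  # yield control, as a network read would
        yield chunk


collected = asyncio.run(collect(fake_async_audio()))
```

Collecting everything in memory is convenient for short dialogues. For long ones, writing each chunk to a file as it arrives keeps memory use flat.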

Use Cases

Audiobooks

Create multi-character audiobooks with distinct voices for each character

Podcasts

Generate podcast episodes with multiple hosts or guests

Training Materials

Create engaging training content with conversational formats

Interactive Stories

Build interactive narratives with character dialogue

Related Methods

Text to Speech - Single voice generation
Voices - Browse available voices
