Streaming allows you to receive and play audio as it’s being generated, reducing perceived latency and enabling real-time applications.

Basic Streaming

Use the text_to_speech.stream() method to get an iterator of audio chunks:
from elevenlabs import stream
from elevenlabs.client import ElevenLabs

client = ElevenLabs(
    api_key="YOUR_API_KEY"
)

audio_stream = client.text_to_speech.stream(
    text="This is a test of real-time streaming.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_multilingual_v2"
)

# Play the stream directly
stream(audio_stream)

Stream Processing Options

You have two main ways to handle streamed audio:
from elevenlabs import stream

# Option 1: Use the stream() helper to play audio as it arrives
audio_stream = client.text_to_speech.stream(
    text="Play audio in real-time.",
    voice_id="JBFqnCBsd6RMkjVDRZzb"
)

stream(audio_stream)

# Option 2: Iterate over the chunks yourself for custom handling.
# A stream can only be consumed once, so pick one option per stream.
audio_stream = client.text_to_speech.stream(
    text="Process audio chunks manually.",
    voice_id="JBFqnCBsd6RMkjVDRZzb"
)

for chunk in audio_stream:
    if isinstance(chunk, bytes):
        ...  # write to a file, forward over a socket, etc.

Optimizing Streaming Latency

The optimize_streaming_latency parameter trades some quality for lower latency:
audio_stream = client.text_to_speech.stream(
    text="Ultra-low latency streaming.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_turbo_v2_5",
    optimize_streaming_latency=4  # Maximum optimization
)

stream(audio_stream)
optimize_streaming_latency (int):
  • 0 - Default (no optimizations)
  • 1 - Normal (~50% latency reduction)
  • 2 - Strong (~75% latency reduction)
  • 3 - Maximum latency optimization
  • 4 - Maximum + text normalizer off (lowest latency, may affect pronunciation)
Combine eleven_turbo_v2_5 or eleven_flash_v2_5 models with high optimization levels for the best streaming performance.
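If your application has an explicit latency budget, you can map it to one of these levels up front. A minimal sketch; the millisecond thresholds here are illustrative assumptions, not official ElevenLabs figures:

```python
def choose_latency_level(max_latency_ms: int) -> int:
    """Map a latency budget to an optimize_streaming_latency level.
    Thresholds are illustrative assumptions, not official guidance."""
    if max_latency_ms < 300:
        return 4  # maximum optimization, text normalizer off
    if max_latency_ms < 500:
        return 3
    if max_latency_ms < 800:
        return 2
    if max_latency_ms < 1200:
        return 1
    return 0  # no optimization needed
```

You would then pass the result as `optimize_streaming_latency=choose_latency_level(250)`.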

Streaming with Output Formats

Specify the audio format for your stream:
audio_stream = client.text_to_speech.stream(
    text="Streaming in different formats.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    output_format="mp3_22050_32"  # Lower quality for faster streaming
)

stream(audio_stream)
Lower sample rates and bitrates reduce bandwidth and improve streaming speed but decrease audio quality.
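Output format strings follow a `codec_sampleRate[_bitrate]` pattern (e.g. `mp3_44100_128`, `pcm_16000`). A small helper to break one apart, assuming that naming convention holds across formats:

```python
def parse_output_format(fmt: str) -> dict:
    """Split an output_format string like 'mp3_22050_32' into codec,
    sample rate (Hz), and bitrate (kbps, MP3 formats only)."""
    parts = fmt.split("_")
    info = {"codec": parts[0], "sample_rate_hz": int(parts[1])}
    if parts[0] == "mp3":
        info["bitrate_kbps"] = int(parts[2])
    return info

print(parse_output_format("mp3_22050_32"))
# {'codec': 'mp3', 'sample_rate_hz': 22050, 'bitrate_kbps': 32}
```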

Collecting Streamed Audio

Save the entire audio while streaming:
from elevenlabs import save

audio_stream = client.text_to_speech.stream(
    text="Stream and save simultaneously.",
    voice_id="JBFqnCBsd6RMkjVDRZzb"
)

# The stream() function returns the complete audio
complete_audio = stream(audio_stream)

# Save the complete audio
save(complete_audio, "streamed_output.mp3")
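Because a stream can only be consumed once, another option is to tee chunks to disk while still yielding them for playback. A sketch using only the standard library; `tee_to_file` is a hypothetical helper, not part of the SDK:

```python
from typing import Iterator

def tee_to_file(audio_stream: Iterator[bytes], path: str) -> Iterator[bytes]:
    """Write each chunk to disk while re-yielding it, so one pass
    through the stream both saves and feeds playback."""
    with open(path, "wb") as f:
        for chunk in audio_stream:
            if isinstance(chunk, bytes):
                f.write(chunk)
            yield chunk
```

You could then call `stream(tee_to_file(audio_stream, "streamed_output.mp3"))` to play and save in a single pass.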

Custom Stream Handling

Implement custom logic for each audio chunk:
import io

audio_stream = client.text_to_speech.stream(
    text="Custom stream processing example.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_flash_v2_5"
)

audio_buffer = io.BytesIO()

for chunk in audio_stream:
    if isinstance(chunk, bytes):
        # Write to buffer
        audio_buffer.write(chunk)
        
        # Custom processing: send to websocket, analyze, etc.
        # websocket.send(chunk)
        
        # Monitor progress
        print(f"Buffer size: {audio_buffer.tell()} bytes")

# Get complete audio from buffer
audio_buffer.seek(0)
complete_audio = audio_buffer.read()

Streaming with Timestamps

Get timing information while streaming:
audio_stream = client.text_to_speech.stream_with_timestamps(
    text="The first move is what sets everything in motion.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_multilingual_v2",
    output_format="mp3_44100_128"
)

for chunk in audio_stream:
    # Each chunk contains audio and character alignment data
    if hasattr(chunk, 'audio') and chunk.alignment is not None:
        audio_bytes = chunk.audio
        alignment = chunk.alignment
        print(f"Audio chunk with {len(alignment.characters)} aligned characters")
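Character-level alignment can be grouped into word-level timings, which is handy for captions or highlighting. A sketch that assumes `alignment.characters` and `alignment.character_start_times_seconds` are parallel lists (verify the field names against your SDK version):

```python
def word_timings(characters, start_times):
    """Group per-character timestamps into (word, start_time) pairs.
    Assumes `characters` and `start_times` are parallel lists, as in
    the alignment data returned by the timestamps endpoints."""
    words, current, word_start = [], "", None
    for ch, t in zip(characters, start_times):
        if ch.isspace():
            if current:
                words.append((current, word_start))
                current, word_start = "", None
        else:
            if not current:
                word_start = t
            current += ch
    if current:
        words.append((current, word_start))
    return words

print(word_timings(list("hi there"), [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]))
# [('hi', 0.0), ('there', 0.3)]
```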

Async Streaming

Stream audio asynchronously for better concurrency:
import asyncio
from elevenlabs.client import AsyncElevenLabs

client = AsyncElevenLabs(
    api_key="YOUR_API_KEY"
)

async def stream_audio():
    audio_stream = await client.text_to_speech.stream(
        text="Async streaming for concurrent operations.",
        voice_id="JBFqnCBsd6RMkjVDRZzb",
        model_id="eleven_turbo_v2_5"
    )
    
    async for chunk in audio_stream:
        if isinstance(chunk, bytes):
            # Process chunk asynchronously
            await process_audio_chunk(chunk)

async def process_audio_chunk(chunk: bytes):
    # Your async processing logic
    print(f"Processing {len(chunk)} bytes")

asyncio.run(stream_audio())
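The same pattern scales to several requests at once with `asyncio.gather`. The sketch below substitutes a fake async generator for real TTS streams so it runs standalone:

```python
import asyncio

async def fake_stream(label: str, n: int):
    """Stand-in for a TTS audio stream: yields n dummy byte chunks."""
    for i in range(n):
        await asyncio.sleep(0)  # simulate waiting on network I/O
        yield f"{label}-{i}".encode()

async def drain(stream) -> int:
    """Consume a stream and return the total bytes received."""
    total = 0
    async for chunk in stream:
        total += len(chunk)
    return total

async def main():
    # Consume two streams concurrently, as you might with two TTS requests
    return await asyncio.gather(
        drain(fake_stream("a", 3)),
        drain(fake_stream("b", 2)),
    )

print(asyncio.run(main()))  # [9, 6]
```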

Playing Streamed Audio

The SDK's stream() helper plays streamed audio through mpv as it arrives:
from elevenlabs import stream

# Requires mpv installed (brew install mpv)
audio_stream = client.text_to_speech.stream(
    text="Playing with mpv.",
    voice_id="JBFqnCBsd6RMkjVDRZzb"
)

stream(audio_stream)
The stream() function requires mpv to be installed. Install it with:
  • macOS: brew install mpv
  • Linux/Windows: Download from mpv.io
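You can check for mpv up front rather than failing mid-playback; `shutil.which` in the standard library does the lookup:

```python
import shutil

def mpv_available() -> bool:
    """True if the mpv binary is on PATH, which stream() needs for playback."""
    return shutil.which("mpv") is not None

if not mpv_available():
    print("mpv not found; install it before calling stream().")
```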

Best Practices

  • Use eleven_flash_v2_5 for the fastest streaming
  • Use eleven_turbo_v2_5 for balanced quality and speed
  • Avoid eleven_v3 for streaming if latency is critical
  • Lower sample rates (22050Hz) reduce latency
  • Lower bitrates (32kbps) improve streaming speed
  • Balance quality needs with performance requirements
Wrap streaming calls in a try/except so network or API failures don't crash playback:
try:
    audio_stream = client.text_to_speech.stream(
        text="Handle streaming errors.",
        voice_id="JBFqnCBsd6RMkjVDRZzb"
    )
    stream(audio_stream)
except Exception as e:
    print(f"Streaming error: {e}")
For long streams, consider buffering to prevent memory issues:
chunks_buffer = []
max_buffer_size = 100

for chunk in audio_stream:
    chunks_buffer.append(chunk)
    if len(chunks_buffer) >= max_buffer_size:
        # Process or save buffered chunks
        process_chunks(chunks_buffer)
        chunks_buffer = []

# Flush the final partial batch
if chunks_buffer:
    process_chunks(chunks_buffer)
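The buffering loop above can be factored into a reusable generator; `batched_chunks` here is a hypothetical helper (Python 3.12's `itertools.batched` does much the same thing natively):

```python
from itertools import islice
from typing import Iterable, Iterator, List

def batched_chunks(chunks: Iterable[bytes], size: int) -> Iterator[List[bytes]]:
    """Yield lists of at most `size` chunks, so a long stream never
    accumulates fully in memory."""
    it = iter(chunks)
    while batch := list(islice(it, size)):
        yield batch
```

You would then write `for batch in batched_chunks(audio_stream, 100): process_chunks(batch)`.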

Next Steps

Learn how to manage and customize voices
