Streaming allows you to receive and play audio as it’s being generated, reducing perceived latency and enabling real-time applications.

Basic Streaming

Use the text_to_speech.stream() method to get an iterator of audio chunks:
from elevenlabs import stream
from elevenlabs.client import ElevenLabs

client = ElevenLabs(
    api_key="YOUR_API_KEY"
)

audio_stream = client.text_to_speech.stream(
    text="This is a test of real-time streaming.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_multilingual_v2"
)

# Play the stream directly
stream(audio_stream)

Stream Processing Options

You have two main ways to handle streamed audio:
from elevenlabs import stream

# Option 1: Use the stream() helper to play audio as it arrives
audio_stream = client.text_to_speech.stream(
    text="Play audio in real-time.",
    voice_id="JBFqnCBsd6RMkjVDRZzb"
)

stream(audio_stream)

# Option 2: Iterate over the chunks yourself for custom handling.
# A stream can only be consumed once, so pick one option per stream.
audio_stream = client.text_to_speech.stream(
    text="Process audio chunks manually.",
    voice_id="JBFqnCBsd6RMkjVDRZzb"
)

for chunk in audio_stream:
    if isinstance(chunk, bytes):
        ...  # write to a file, forward over a socket, etc.

Optimizing Streaming Latency

The optimize_streaming_latency parameter trades some quality for lower latency:
audio_stream = client.text_to_speech.stream(
    text="Ultra-low latency streaming.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_turbo_v2_5",
    optimize_streaming_latency=4  # Maximum optimization
)

stream(audio_stream)
optimize_streaming_latency (int):
  • 0 - Default (no optimizations)
  • 1 - Normal (~50% latency reduction)
  • 2 - Strong (~75% latency reduction)
  • 3 - Maximum latency optimization
  • 4 - Maximum + text normalizer off (lowest latency, may affect pronunciation)
Combine eleven_turbo_v2_5 or eleven_flash_v2_5 models with high optimization levels for the best streaming performance.
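If your application has an explicit latency budget, you can map it to one of these levels up front. A minimal sketch; the millisecond thresholds here are illustrative assumptions, not official ElevenLabs figures:

```python
def choose_latency_level(max_latency_ms: int) -> int:
    """Map a latency budget to an optimize_streaming_latency level.
    Thresholds are illustrative assumptions, not official guidance."""
    if max_latency_ms < 300:
        return 4  # maximum optimization, text normalizer off
    if max_latency_ms < 500:
        return 3
    if max_latency_ms < 800:
        return 2
    if max_latency_ms < 1200:
        return 1
    return 0  # no optimization needed
```

You would then pass the result as `optimize_streaming_latency=choose_latency_level(250)`.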

Streaming with Output Formats

Specify the audio format for your stream:
audio_stream = client.text_to_speech.stream(
    text="Streaming in different formats.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    output_format="mp3_22050_32"  # Lower quality for faster streaming
)

stream(audio_stream)
Lower sample rates and bitrates reduce bandwidth and improve streaming speed but decrease audio quality.
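Output format strings follow a `codec_sampleRate[_bitrate]` pattern (e.g. `mp3_44100_128`, `pcm_16000`). A small helper to break one apart, assuming that naming convention holds across formats:

```python
def parse_output_format(fmt: str) -> dict:
    """Split an output_format string like 'mp3_22050_32' into codec,
    sample rate (Hz), and bitrate (kbps, MP3 formats only)."""
    parts = fmt.split("_")
    info = {"codec": parts[0], "sample_rate_hz": int(parts[1])}
    if parts[0] == "mp3":
        info["bitrate_kbps"] = int(parts[2])
    return info

print(parse_output_format("mp3_22050_32"))
# {'codec': 'mp3', 'sample_rate_hz': 22050, 'bitrate_kbps': 32}
```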

Collecting Streamed Audio

Save the entire audio while streaming:
from elevenlabs import save

audio_stream = client.text_to_speech.stream(
    text="Stream and save simultaneously.",
    voice_id="JBFqnCBsd6RMkjVDRZzb"
)

# The stream() function returns the complete audio
complete_audio = stream(audio_stream)

# Save the complete audio
save(complete_audio, "streamed_output.mp3")
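Because a stream can only be consumed once, another option is to tee chunks to disk while still yielding them for playback. A sketch using only the standard library; `tee_to_file` is a hypothetical helper, not part of the SDK:

```python
from typing import Iterator

def tee_to_file(audio_stream: Iterator[bytes], path: str) -> Iterator[bytes]:
    """Write each chunk to disk while re-yielding it, so one pass
    through the stream both saves and feeds playback."""
    with open(path, "wb") as f:
        for chunk in audio_stream:
            if isinstance(chunk, bytes):
                f.write(chunk)
            yield chunk
```

You could then call `stream(tee_to_file(audio_stream, "streamed_output.mp3"))` to play and save in a single pass.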

Custom Stream Handling

Implement custom logic for each audio chunk:
import io

audio_stream = client.text_to_speech.stream(
    text="Custom stream processing example.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_flash_v2_5"
)

audio_buffer = io.BytesIO()

for chunk in audio_stream:
    if isinstance(chunk, bytes):
        # Write to buffer
        audio_buffer.write(chunk)
        
        # Custom processing: send to websocket, analyze, etc.
        # websocket.send(chunk)
        
        # Monitor progress
        print(f"Buffer size: {audio_buffer.tell()} bytes")

# Get complete audio from buffer
audio_buffer.seek(0)
complete_audio = audio_buffer.read()

Streaming with Timestamps

Get timing information while streaming:
audio_stream = client.text_to_speech.stream_with_timestamps(
    text="The first move is what sets everything in motion.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_multilingual_v2",
    output_format="mp3_44100_128"
)

for chunk in audio_stream:
    # Each chunk contains audio and character alignment data
    if hasattr(chunk, 'audio') and chunk.alignment is not None:
        audio_bytes = chunk.audio
        alignment = chunk.alignment
        print(f"Audio chunk with {len(alignment.characters)} aligned characters")
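Character-level alignment can be grouped into word-level timings, which is handy for captions or highlighting. A sketch that assumes `alignment.characters` and `alignment.character_start_times_seconds` are parallel lists (verify the field names against your SDK version):

```python
def word_timings(characters, start_times):
    """Group per-character timestamps into (word, start_time) pairs.
    Assumes `characters` and `start_times` are parallel lists, as in
    the alignment data returned by the timestamps endpoints."""
    words, current, word_start = [], "", None
    for ch, t in zip(characters, start_times):
        if ch.isspace():
            if current:
                words.append((current, word_start))
                current, word_start = "", None
        else:
            if not current:
                word_start = t
            current += ch
    if current:
        words.append((current, word_start))
    return words

print(word_timings(list("hi there"), [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]))
# [('hi', 0.0), ('there', 0.3)]
```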

Async Streaming

Stream audio asynchronously for better concurrency:
import asyncio
from elevenlabs.client import AsyncElevenLabs

client = AsyncElevenLabs(
    api_key="YOUR_API_KEY"
)

async def stream_audio():
    audio_stream = await client.text_to_speech.stream(
        text="Async streaming for concurrent operations.",
        voice_id="JBFqnCBsd6RMkjVDRZzb",
        model_id="eleven_turbo_v2_5"
    )
    
    async for chunk in audio_stream:
        if isinstance(chunk, bytes):
            # Process chunk asynchronously
            await process_audio_chunk(chunk)

async def process_audio_chunk(chunk: bytes):
    # Your async processing logic
    print(f"Processing {len(chunk)} bytes")

asyncio.run(stream_audio())
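The same pattern scales to several requests at once with `asyncio.gather`. The sketch below substitutes a fake async generator for real TTS streams so it runs standalone:

```python
import asyncio

async def fake_stream(label: str, n: int):
    """Stand-in for a TTS audio stream: yields n dummy byte chunks."""
    for i in range(n):
        await asyncio.sleep(0)  # simulate waiting on network I/O
        yield f"{label}-{i}".encode()

async def drain(stream) -> int:
    """Consume a stream and return the total bytes received."""
    total = 0
    async for chunk in stream:
        total += len(chunk)
    return total

async def main():
    # Consume two streams concurrently, as you might with two TTS requests
    return await asyncio.gather(
        drain(fake_stream("a", 3)),
        drain(fake_stream("b", 2)),
    )

print(asyncio.run(main()))  # [9, 6]
```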

Playing Streamed Audio

The SDK's stream() helper plays streamed audio through mpv as it arrives:
from elevenlabs import stream

# Requires mpv installed (brew install mpv)
audio_stream = client.text_to_speech.stream(
    text="Playing with mpv.",
    voice_id="JBFqnCBsd6RMkjVDRZzb"
)

stream(audio_stream)
The stream() function requires mpv to be installed. Install it with:
  • macOS: brew install mpv
  • Linux/Windows: Download from mpv.io
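You can check for mpv up front rather than failing mid-playback; `shutil.which` in the standard library does the lookup:

```python
import shutil

def mpv_available() -> bool:
    """True if the mpv binary is on PATH, which stream() needs for playback."""
    return shutil.which("mpv") is not None

if not mpv_available():
    print("mpv not found; install it before calling stream().")
```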

Best Practices

  • Use eleven_flash_v2_5 for the fastest streaming
  • Use eleven_turbo_v2_5 for balanced quality and speed
  • Avoid eleven_v3 for streaming if latency is critical
  • Lower sample rates (22050Hz) reduce latency
  • Lower bitrates (32kbps) improve streaming speed
  • Balance quality needs with performance requirements
Wrap streaming calls in a try/except so network or API failures don't crash playback:
try:
    audio_stream = client.text_to_speech.stream(
        text="Handle streaming errors.",
        voice_id="JBFqnCBsd6RMkjVDRZzb"
    )
    stream(audio_stream)
except Exception as e:
    print(f"Streaming error: {e}")
For long streams, consider buffering to prevent memory issues:
chunks_buffer = []
max_buffer_size = 100

for chunk in audio_stream:
    chunks_buffer.append(chunk)
    if len(chunks_buffer) >= max_buffer_size:
        # Process or save buffered chunks
        process_chunks(chunks_buffer)
        chunks_buffer = []

# Flush the final partial batch
if chunks_buffer:
    process_chunks(chunks_buffer)
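The buffering loop above can be factored into a reusable generator; `batched_chunks` here is a hypothetical helper (Python 3.12's `itertools.batched` does much the same thing natively):

```python
from itertools import islice
from typing import Iterable, Iterator, List

def batched_chunks(chunks: Iterable[bytes], size: int) -> Iterator[List[bytes]]:
    """Yield lists of at most `size` chunks, so a long stream never
    accumulates fully in memory."""
    it = iter(chunks)
    while batch := list(islice(it, size)):
        yield batch
```

You would then write `for batch in batched_chunks(audio_stream, 100): process_chunks(batch)`.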

Next Steps

Learn how to manage and customize voices
