ElevenLabs offers multiple state-of-the-art TTS models, each optimized for different use cases, languages, and performance requirements.

Listing Available Models

Retrieve all available models for your account:
from elevenlabs.client import ElevenLabs

client = ElevenLabs(
    api_key="YOUR_API_KEY"
)

models = client.models.list()

for model in models:
    print(f"Model: {model.model_id}")
    print(f"Name: {model.name}")
    print(f"Languages: {len(model.languages)} supported")
    print(f"Can do TTS: {model.can_do_text_to_speech}")
    print("---")

Main Models Overview

ElevenLabs provides four main TTS models, each with unique characteristics:

Eleven v3

Model ID: eleven_v3
Dramatic delivery and performances with support for 70+ languages and natural multi-speaker dialogue.

Eleven Multilingual v2

Model ID: eleven_multilingual_v2
Excels in stability, language diversity, and accent accuracy across 29 languages. Recommended for most use cases.

Eleven Flash v2.5

Model ID: eleven_flash_v2_5
Ultra-low latency with support for 32 languages, at 50% lower price per character.

Eleven Turbo v2.5

Model ID: eleven_turbo_v2_5
Good balance of quality and latency, ideal for developer use cases where speed is crucial. Supports 32 languages.

Eleven v3

The latest generation model with dramatic performances and extensive language support.
audio = client.text_to_speech.convert(
    text="Experience the most dramatic and lifelike performances.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_v3",
    output_format="mp3_44100_128"
)

Key Features

Eleven v3 excels at emotional expression and dramatic performances, making it ideal for:
  • Audiobook narration
  • Character voices
  • Storytelling
  • Expressive dialogue

It is also the most multilingual model, supporting over 70 languages, and offers natural multi-speaker dialogue for:
  • Conversational AI
  • Podcast generation
  • Interview simulations
  • Interactive narratives

Eleven v3 may have higher latency than the Flash or Turbo models, so it is not recommended for real-time streaming applications where low latency is critical.
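As a sketch of how expressive, multi-speaker text might be authored for v3, the snippet below builds a dialogue prompt with bracketed audio tags. The specific tag names and the "Speaker: line" formatting are illustrative assumptions based on v3's expressive-tag support, not a confirmed API contract; check the ElevenLabs docs for the exact tags your account supports.

```python
# Hypothetical sketch: composing expressive dialogue text for Eleven v3.
# The [excited] / [whispers] tags and speaker prefixes are assumptions.

def build_dialogue(turns):
    """Join (speaker, line) turns into a single prompt string."""
    return "\n".join(f"{speaker}: {line}" for speaker, line in turns)

dialogue = build_dialogue([
    ("Narrator", "[excited] The storm rolled in faster than anyone expected."),
    ("Captain", "[whispers] Hold the line. We wait for my signal."),
])

# With a configured client, the prompt could then be synthesized:
# audio = client.text_to_speech.convert(
#     text=dialogue,
#     voice_id="JBFqnCBsd6RMkjVDRZzb",
#     model_id="eleven_v3",
# )
```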

Eleven Multilingual v2

The recommended model for most production use cases, balancing quality, stability, and language support.
audio = client.text_to_speech.convert(
    text="The first move is what sets everything in motion.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_multilingual_v2",
    output_format="mp3_44100_128"
)

Key Features

  • Stability: Consistent voice quality across generations
  • Accuracy: Excellent accent and pronunciation accuracy
  • 29 Languages: Broad language support for global applications
  • Reliability: Proven performance in production environments

Use Cases

Content Creation

  • Video voiceovers
  • E-learning modules
  • Marketing materials
  • Product demos

Applications

  • Mobile apps
  • Web applications
  • IVR systems
  • Accessibility tools

Eleven Flash v2.5

Ultra-low latency model optimized for speed and cost efficiency.
audio = client.text_to_speech.convert(
    text="Fast generation with minimal latency.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_flash_v2_5",
    output_format="mp3_22050_32"  # Lower sample rate/bitrate reduces size and latency
)

Key Features

  • Ultra-Low Latency: Fastest generation times
  • Cost Effective: 50% lower price per character
  • 32 Languages: Broad language support
  • Optimized for Streaming: Ideal for real-time applications
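To make the 50% price difference concrete, here is a rough cost comparison for a batch job. The baseline per-character price below is a placeholder assumption, not a published ElevenLabs rate; only the 50% multiplier comes from the model overview above.

```python
# Rough cost comparison for a large batch job.
# BASELINE_PRICE is a placeholder, NOT ElevenLabs' actual rate.
BASELINE_PRICE = 0.30 / 1000   # assumed price per character (placeholder)
FLASH_MULTIPLIER = 0.5         # Flash v2.5: 50% lower price per character

def estimate_cost(characters, flash=False):
    """Estimate batch cost under the placeholder pricing above."""
    price = BASELINE_PRICE * (FLASH_MULTIPLIER if flash else 1.0)
    return characters * price

chars = 1_000_000  # e.g. a large narration backlog
standard = estimate_cost(chars)
flash = estimate_cost(chars, flash=True)
print(f"standard: ${standard:.2f}, flash: ${flash:.2f}, saved: ${standard - flash:.2f}")
```

Whatever the real rates are, the ratio is the point: at any volume, Flash v2.5 halves the per-character spend relative to standard-priced models.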

Best For

1. Real-Time Streaming: Low-latency audio streaming for conversational AI and live applications
2. High-Volume Processing: Batch processing large amounts of text with cost constraints
3. Prototyping: Rapid development and testing with quick iteration cycles
# Optimized for streaming
audio_stream = client.text_to_speech.stream(
    text="Real-time streaming with Flash v2.5",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_flash_v2_5",
    optimize_streaming_latency=4
)

Eleven Turbo v2.5

Balanced model providing excellent quality-to-speed ratio for developer applications.
audio = client.text_to_speech.convert(
    text="Great balance of quality and latency.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_turbo_v2_5",
    output_format="mp3_44100_128"
)

Key Features

  • Balanced Performance: Good quality with reduced latency
  • Developer-Focused: Optimized for common development scenarios
  • 32 Languages: Wide language coverage
  • Versatile: Suitable for most application types

Ideal Use Cases

  • Conversational AI: Chatbots and virtual assistants
  • Gaming: Character dialogue and narration
  • Notifications: Audio alerts and announcements
  • Interactive Systems: IVR and phone systems

Model Comparison

Model Characteristics

| Feature       | v3               | Multilingual v2 | Flash v2.5     | Turbo v2.5     |
|---------------|------------------|-----------------|----------------|----------------|
| Quality       | Highest          | High            | Good           | High           |
| Latency       | Higher           | Medium          | Lowest         | Low            |
| Languages     | 70+              | 29              | 32             | 32             |
| Price         | Standard         | Standard        | 50% lower      | Standard       |
| Best For      | Dramatic content | Production      | Real-time/Cost | Developer apps |
| Multi-speaker | Yes              | Limited         | No             | Limited        |

Choosing the Right Model

1. Assess Your Requirements: Decide whether quality, latency, cost, or language support matters most
2. Consider Your Use Case:
  • Content creation → Multilingual v2 or v3
  • Real-time streaming → Flash v2.5 or Turbo v2.5
  • High-volume processing → Flash v2.5
  • Dramatic narration → v3
3. Test and Compare:

from elevenlabs import save  # helper for writing generated audio to disk

models_to_test = [
    "eleven_v3",
    "eleven_multilingual_v2",
    "eleven_flash_v2_5",
    "eleven_turbo_v2_5"
]

for model_id in models_to_test:
    audio = client.text_to_speech.convert(
        text="Compare model outputs.",
        voice_id="JBFqnCBsd6RMkjVDRZzb",
        model_id=model_id
    )
    # Save and compare
    save(audio, f"output_{model_id}.mp3")
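The decision guidance above can also be encoded as a small lookup helper. The use-case labels below are this sketch's own informal taxonomy, not an official API concept:

```python
# Minimal sketch mapping the use cases above to suggested model IDs.
# The dictionary keys are informal labels from this guide.
MODEL_FOR_USE_CASE = {
    "content_creation": "eleven_multilingual_v2",
    "real_time_streaming": "eleven_flash_v2_5",
    "high_volume": "eleven_flash_v2_5",
    "dramatic_narration": "eleven_v3",
    "developer_app": "eleven_turbo_v2_5",
}

def suggest_model(use_case: str) -> str:
    """Return a suggested model ID, defaulting to Multilingual v2."""
    return MODEL_FOR_USE_CASE.get(use_case, "eleven_multilingual_v2")

print(suggest_model("dramatic_narration"))  # eleven_v3
```

Defaulting to Multilingual v2 mirrors this guide's recommendation of it for most use cases.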

Language Support

Check which languages a model supports:
models = client.models.list()

for model in models:
    if model.model_id == "eleven_v3":
        print(f"Languages supported by {model.name}:")
        for lang in model.languages:
            print(f"  - {lang.name} ({lang.language_id})")

Model-Specific Settings

Some models support additional parameters:
# Using language code enforcement
audio = client.text_to_speech.convert(
    text="Bonjour le monde",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_multilingual_v2",
    language_code="fr"  # Enforce French
)

# Optimizing for streaming with Turbo
audio_stream = client.text_to_speech.stream(
    text="Optimized streaming.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_turbo_v2_5",
    optimize_streaming_latency=3
)

Async Model Operations

List models asynchronously:
import asyncio
from elevenlabs.client import AsyncElevenLabs

client = AsyncElevenLabs(
    api_key="YOUR_API_KEY"
)

async def print_models():
    models = await client.models.list()
    for model in models:
        print(f"{model.name}: {model.model_id}")

asyncio.run(print_models())

Best Practices

For production stability:
  • Use Multilingual v2 for stable, production-ready applications
  • Test with your actual content before committing to a model
  • Monitor usage and costs across different models
  • Consider fallback strategies if a model is unavailable

For low latency:
  • Choose Flash v2.5 or Turbo v2.5 for low-latency requirements
  • Combine with the optimize_streaming_latency parameter
  • Use lower output formats (22050 Hz, 32 kbps) to reduce bandwidth
  • Implement proper error handling and retry logic

For cost optimization:
  • Use Flash v2.5 for high-volume processing (50% cost savings)
  • Cache generated audio when possible
  • Monitor character usage across models
  • Batch requests when real-time isn’t required

For maximum quality:
  • Use v3 for the highest quality and dramatic expression
  • Use higher output formats (44100 Hz, 128 kbps+)
  • Fine-tune voice settings for each model
  • Test different models with your specific content
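A fallback-with-retry strategy like the one suggested above might be sketched as follows. The fallback ordering, retry counts, and the generic `convert` callable are illustrative assumptions; in practice you would pass `client.text_to_speech.convert` and catch the SDK's specific error types rather than bare `Exception`:

```python
import time

# Sketch of a retry-with-fallback wrapper. The model order is an
# illustrative assumption; adjust it to your quality/latency priorities.
FALLBACK_ORDER = ["eleven_multilingual_v2", "eleven_turbo_v2_5", "eleven_flash_v2_5"]

def convert_with_fallback(convert, text, voice_id, retries=2, delay=1.0):
    """Try each model in order, retrying transient failures with backoff."""
    last_error = None
    for model_id in FALLBACK_ORDER:
        for attempt in range(retries):
            try:
                return convert(text=text, voice_id=voice_id, model_id=model_id)
            except Exception as exc:  # narrow this to the SDK's error types
                last_error = exc
                time.sleep(delay * (attempt + 1))
    raise RuntimeError("all models failed") from last_error

# Usage with the real client might look like:
# audio = convert_with_fallback(client.text_to_speech.convert,
#                               "Hello", "JBFqnCBsd6RMkjVDRZzb")
```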

Additional Resources

  • Models Documentation: Detailed information about all models and languages
  • Voice Lab: Try different models with various voices

For the most up-to-date information about model capabilities, pricing, and language support, visit the ElevenLabs Models documentation.
