Prerequisites

Before you begin, make sure you have:

Your First Text-to-Speech

Let’s generate high-quality speech audio from text in just a few lines of code.
1. Set Up Your Environment

Create a new Python file, import the SDK, and load your environment variables:
from dotenv import load_dotenv
from elevenlabs.client import ElevenLabs
from elevenlabs.play import play

# Load your API key from .env file
load_dotenv()
Make sure you have a .env file with your API key: ELEVENLABS_API_KEY=your_api_key_here
2. Initialize the Client

Create an ElevenLabs client instance:
elevenlabs = ElevenLabs()
The client will automatically use the ELEVENLABS_API_KEY from your environment variables.
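If the variable is missing, the problem typically only shows up later as an authentication error. The following is a minimal sketch of a fail-fast check using only the standard library. The helper name require_api_key is hypothetical and not part of the SDK.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return ELEVENLABS_API_KEY from env, or raise a clear error if unset or blank."""
    # Hypothetical helper, not an SDK function. Defaults to the process environment.
    env = os.environ if env is None else env
    key = env.get("ELEVENLABS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ELEVENLABS_API_KEY is not set; add it to your .env file "
            "and call load_dotenv() before creating the client"
        )
    return key
```

Call require_api_key() after load_dotenv() and before ElevenLabs() to surface a missing key immediately.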
3. Generate Audio

Convert text to speech using the text_to_speech.convert() method:
audio = elevenlabs.text_to_speech.convert(
    text="The first move is what sets everything in motion.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_v3",
    output_format="mp3_44100_128",
)
This generates audio using the Eleven v3 model with a pre-configured voice.
4. Play the Audio

Play the generated audio directly:
play(audio)
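The play() function requires the PyAudio package. Install it with: pip install "elevenlabs[pyaudio]" (the quotes stop shells such as zsh from interpreting the brackets). Depending on your SDK version, play() may instead rely on ffplay from FFmpeg being available on your PATH, so install FFmpeg if playback still fails.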

Complete Example

Here’s the complete working example:
from dotenv import load_dotenv
from elevenlabs.client import ElevenLabs
from elevenlabs.play import play

load_dotenv()

elevenlabs = ElevenLabs()

audio = elevenlabs.text_to_speech.convert(
    text="The first move is what sets everything in motion.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_v3",
    output_format="mp3_44100_128",
)

play(audio)

Understanding the Code

Model Selection

The SDK supports multiple models optimized for different use cases. Select one with the model_id parameter:
audio = elevenlabs.text_to_speech.convert(
    text="Your text here",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_v3",  # Dramatic delivery, 70+ languages
    output_format="mp3_44100_128",
)

Voice IDs

The voice_id parameter specifies which voice to use. The example uses "JBFqnCBsd6RMkjVDRZzb", which is a pre-made voice. You can:
  • Browse available voices in the Voice Lab
  • List all your voices programmatically (see below)
  • Create custom voice clones

Output Formats

Supported output formats include:
  • mp3_44100_128 - MP3 at 44.1kHz, 128kbps
  • mp3_44100_192 - MP3 at 44.1kHz, 192kbps
  • pcm_16000 - Raw PCM at 16kHz
  • pcm_22050 - Raw PCM at 22.05kHz
  • pcm_24000 - Raw PCM at 24kHz
  • pcm_44100 - Raw PCM at 44.1kHz
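The pcm_* formats are raw samples with no file header, so most players will not open them directly. The sketch below wraps raw PCM in a WAV container using the standard library. It assumes mono, 16-bit little-endian samples, which you should confirm against the API reference. The helper name pcm_to_wav is hypothetical.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, output_format: str) -> bytes:
    """Wrap raw PCM bytes in a WAV container, reading the rate from e.g. 'pcm_16000'."""
    codec, _, rate = output_format.partition("_")
    if codec != "pcm" or not rate.isdigit():
        raise ValueError(f"not a PCM output format: {output_format!r}")
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)   # assumption: mono audio
        wav.setsampwidth(2)   # assumption: 16-bit (2-byte) samples
        wav.setframerate(int(rate))
        wav.writeframes(pcm)
    return buffer.getvalue()
```

To use it, join the chunks returned by convert() into one bytes object, pass it in together with the output_format you requested, and write the result to a .wav file.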

Exploring Available Voices

List all available voices in your account:
from elevenlabs.client import ElevenLabs

elevenlabs = ElevenLabs()

response = elevenlabs.voices.search()
for voice in response.voices:
    print(f"Voice: {voice.name}, ID: {voice.voice_id}")
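Once you have the list, you will usually want the ID for one particular voice. Here is a minimal sketch of a lookup by name. It assumes only that each voice object exposes name and voice_id, as in the loop above. The helper find_voice_id is hypothetical.

```python
from typing import Iterable, Optional


def find_voice_id(voices: Iterable, name: str) -> Optional[str]:
    """Return the voice_id of the first voice whose name matches, ignoring case."""
    wanted = name.strip().casefold()
    for voice in voices:
        # Guard against voices with no name set.
        if (voice.name or "").strip().casefold() == wanted:
            return voice.voice_id
    return None  # no voice with that name
```

For example, find_voice_id(response.voices, "Your Voice Name") returns the matching ID, or None if the name is not in your account.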

Saving Audio to a File

Instead of playing audio, save it to a file:
from elevenlabs.client import ElevenLabs

elevenlabs = ElevenLabs()

audio = elevenlabs.text_to_speech.convert(
    text="The first move is what sets everything in motion.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_v3",
    output_format="mp3_44100_128",
)

# Save to file
with open("output.mp3", "wb") as f:
    for chunk in audio:
        if isinstance(chunk, bytes):
            f.write(chunk)

Streaming Audio

For real-time applications, stream audio as it’s being generated:
from elevenlabs import stream
from elevenlabs.client import ElevenLabs

elevenlabs = ElevenLabs()

audio_stream = elevenlabs.text_to_speech.stream(
    text="This is a streaming test.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_multilingual_v2"
)

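# audio_stream is a one-shot iterator, so use ONE of the options below.
# Running both would leave nothing for Option 2 to read.

# Option 1: Play the stream
stream(audio_stream)

# Option 2: Process chunks manually (uncomment and use instead of Option 1)
# for chunk in audio_stream:
#     if isinstance(chunk, bytes):
#         # Process audio chunk
#         print(f"Received {len(chunk)} bytes")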
Streaming requires a stable internet connection. Consider implementing retry logic for production applications.
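One possible shape for that retry logic is sketched below, using exponential backoff and only the standard library. The names collect_with_retry and make_stream are hypothetical. make_stream is any zero-argument callable that starts a fresh stream, for example a lambda wrapping text_to_speech.stream(...). The sketch buffers the whole response, so it trades low latency for reliability.

```python
import time
from typing import Callable, Iterable


def collect_with_retry(
    make_stream: Callable[[], Iterable[bytes]],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Consume a fresh stream per attempt; back off exponentially between failures."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            # A failed attempt discards its partial chunks and starts over.
            return b"".join(c for c in make_stream() if isinstance(c, bytes))
        except Exception as exc:  # narrow this to network errors in real code
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise RuntimeError(f"stream failed after {attempts} attempts") from last_error
```

Passing a callable rather than a stream matters here: a stream that failed midway cannot be resumed, so each attempt has to request a new one.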

Using Async Client

For async/await support, use AsyncElevenLabs:
import asyncio
from elevenlabs.client import AsyncElevenLabs

elevenlabs = AsyncElevenLabs()

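async def generate_speech() -> bytes:
    # In recent SDK releases the async convert() is an async generator:
    # iterate it with "async for" rather than awaiting it.
    chunks = []
    async for chunk in elevenlabs.text_to_speech.convert(
        text="Async text-to-speech example.",
        voice_id="JBFqnCBsd6RMkjVDRZzb",
        model_id="eleven_v3",
        output_format="mp3_44100_128",
    ):
        chunks.append(chunk)
    return b"".join(chunks)

# Keep the result so it can be saved or played afterwards
audio = asyncio.run(generate_speech())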
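The main benefit of an async client is running several requests concurrently. Below is a minimal sketch of bounded concurrency with asyncio.gather and a semaphore. The synthesize argument is a hypothetical coroutine you would write around the async client, and the default limit of 3 is an arbitrary placeholder, not a documented quota.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def synthesize_all(
    texts: Sequence[str],
    synthesize: Callable[[str], Awaitable[bytes]],
    max_concurrent: int = 3,
) -> List[bytes]:
    """Run synthesize(text) for every text, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def limited(text: str) -> bytes:
        # The semaphore caps how many requests are in flight at once.
        async with semaphore:
            return await synthesize(text)

    # gather() returns results in the same order as the input texts.
    return await asyncio.gather(*(limited(t) for t in texts))
```

Capping concurrency keeps a large batch from opening every request at once, which is gentler on rate limits than an unbounded gather.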

Next Steps

Now that you’ve generated your first audio, explore more features:

  • Voice Cloning - Clone voices from audio samples
  • Conversational AI - Build interactive AI agents
  • Speech-to-Text - Transcribe audio with high accuracy
  • API Reference - Explore the complete API documentation

Troubleshooting

Audio Not Playing

If the play() function doesn’t work:
  1. Ensure PyAudio is installed: pip install "elevenlabs[pyaudio]"
  2. Check your system audio settings
  3. Try saving to a file first to verify the audio was generated
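For step 3, a quick way to confirm that a saved file really contains MP3 data, and not an error payload, is to inspect its first bytes. This is a heuristic sketch: MP3 files normally begin with either an ID3v2 tag or an MPEG frame sync pattern. The helper name looks_like_mp3 is hypothetical.

```python
def looks_like_mp3(data: bytes) -> bool:
    """Heuristic: True if data starts with an ID3v2 tag or an MPEG audio frame sync."""
    if data[:3] == b"ID3":
        return True
    # Frame sync: eleven set bits, i.e. 0xFF followed by a byte whose top 3 bits are set.
    return len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0
```

Read the first few bytes of output.mp3 in binary mode and pass them in. A False result suggests the request failed and something other than audio was written.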

Authentication Errors

If you see authentication errors:
  1. Verify your API key is correct
  2. Check that the .env file is in the correct directory
  3. Ensure python-dotenv is installed: pip install python-dotenv

Model Not Available

If a model isn’t available:
  1. Check your subscription plan
  2. Verify the model ID is correct
  3. See available models for your plan
