This guide walks you through generating speech with all three Chatterbox models: Turbo, Original, and Multilingual.

Generate Your First Audio

1. Import the library

Import the modules for the model you want to use. This example uses Turbo:
import torchaudio as ta
import torch
from chatterbox.tts_turbo import ChatterboxTurboTTS
2. Load the model

Initialize the model with automatic device detection:
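# Use the GPU when one is available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
# Load the Turbo model
model = ChatterboxTurboTTS.from_pretrained(device=device)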
3. Generate speech

Create audio from text:
# Generate with Paralinguistic Tags
text = "Oh, that's hilarious! [chuckle] Um anyway, we do have a new model in store."

# Generate audio
wav = model.generate(text)
Turbo supports paralinguistic tags such as [laugh], [chuckle], and [cough], which add natural vocal expressions to the generated speech.
4. Save the audio

Save the generated audio to a file:
ta.save("output-turbo.wav", wav, model.sr)
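Because the "Generate speech" step relies on bracketed tags embedded in the text, a misspelled tag is easy to miss. The sketch below is one possible way to list the tags in a prompt and flag unfamiliar ones before calling the model. The helper names `find_tags` and `unknown_tags` are hypothetical and not part of Chatterbox, and `KNOWN_TAGS` contains only the three tags named in this guide, so the set Turbo actually accepts may be larger.

```python
import re

# Only the tags named in this guide. Assumption: Turbo may accept more,
# so extend this set from the paralinguistic tags documentation.
KNOWN_TAGS = {"laugh", "chuckle", "cough"}

# Matches lowercase bracketed words such as [chuckle].
TAG_PATTERN = re.compile(r"\[([a-z][a-z _-]*)\]")


def find_tags(text: str) -> list[str]:
    """Return the bracketed tag names in the order they appear."""
    return TAG_PATTERN.findall(text)


def unknown_tags(text: str) -> list[str]:
    """Return tags that are not in KNOWN_TAGS, for example typos."""
    return [tag for tag in find_tags(text) if tag not in KNOWN_TAGS]


sample = "Oh, that's hilarious! [chuckle] Um anyway, [chukle] we have a new model."
print(find_tags(sample))     # every tag found in the prompt
print(unknown_tags(sample))  # tags worth double-checking before generating
```

Running a check like this before `model.generate(text)` catches typos early, since each generation call is comparatively slow.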

Voice Cloning

All models support zero-shot voice cloning using a reference audio file. Provide an audio prompt to clone any voice:
# Generate with voice cloning
text = "Hi there, Sarah here from MochaFone calling you back [chuckle]"
wav = model.generate(text, audio_prompt_path="your_10s_ref_clip.wav")
ta.save("cloned-voice.wav", wav, model.sr)
For best results, use a reference audio clip that is 5-10 seconds long with clear speech and minimal background noise.
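The 5-10 second recommendation above can be checked before a clip is passed as `audio_prompt_path`. The following is a minimal sketch using only the Python standard library. It assumes the reference clip is an uncompressed PCM WAV file, which is what the `wave` module reads, and the helper names and the `write_silent_wav` test utility are hypothetical and not part of Chatterbox. It checks length only, not speech clarity or background noise.

```python
import os
import tempfile
import wave

# Recommended reference clip length from this guide, in seconds.
MIN_SECONDS = 5.0
MAX_SECONDS = 10.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_good_reference_length(path: str) -> bool:
    """True when the clip length falls inside the recommended range."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS


def write_silent_wav(path: str, seconds: float, sample_rate: int = 16000) -> None:
    """Write a mono 16-bit silent WAV, useful only for trying the check."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * int(seconds * sample_rate))


# Try the check on a generated placeholder clip.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_ref.wav")
write_silent_wav(demo_path, 6.0)
print(wav_duration_seconds(demo_path), is_good_reference_length(demo_path))
```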

Complete Examples

Here is a complete working example for the Turbo model:
import torchaudio as ta
import torch
from chatterbox.tts_turbo import ChatterboxTurboTTS

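# Load the Turbo model, falling back to CPU when no GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model = ChatterboxTurboTTS.from_pretrained(device=device)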

# Generate with Paralinguistic Tags
text = "Oh, that's hilarious! [chuckle] Um anyway, we do have a new model in store. It's the SkyNet T-800 series and it's got basically everything. Including AI integration with ChatGPT and all that jazz. Would you like me to get some prices for you?"

# Generate audio
wav = model.generate(text)
ta.save("test-turbo.wav", wav, model.sr)

Supported Languages (Multilingual Model)

The Multilingual model supports 23+ languages: Arabic (ar) • Danish (da) • German (de) • Greek (el) • English (en) • Spanish (es) • Finnish (fi) • French (fr) • Hebrew (he) • Hindi (hi) • Italian (it) • Japanese (ja) • Korean (ko) • Malay (ms) • Dutch (nl) • Norwegian (no) • Polish (pl) • Portuguese (pt) • Russian (ru) • Swedish (sv) • Swahili (sw) • Turkish (tr) • Chinese (zh)
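When the language is chosen by user input or configuration, it can help to validate the code against the list above before loading or calling the Multilingual model. The lookup table below is transcribed from that list. The helper `normalize_language_code` is a hypothetical name, and this guide does not show how the Multilingual model receives the language, so the sketch stops at validation.

```python
# Language codes and names transcribed from the list in this guide.
SUPPORTED_LANGUAGES = {
    "ar": "Arabic", "da": "Danish", "de": "German", "el": "Greek",
    "en": "English", "es": "Spanish", "fi": "Finnish", "fr": "French",
    "he": "Hebrew", "hi": "Hindi", "it": "Italian", "ja": "Japanese",
    "ko": "Korean", "ms": "Malay", "nl": "Dutch", "no": "Norwegian",
    "pl": "Polish", "pt": "Portuguese", "ru": "Russian", "sv": "Swedish",
    "sw": "Swahili", "tr": "Turkish", "zh": "Chinese",
}


def normalize_language_code(code: str) -> str:
    """Return the trimmed, lowercase code, or raise ValueError if it is not listed."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        supported = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(f"Unsupported language {code!r}. Listed codes: {supported}")
    return normalized


code = normalize_language_code(" FR ")
print(code, SUPPORTED_LANGUAGES[code])
```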

Next Steps

  • Learn about advanced features and tuning parameters
  • Explore paralinguistic tags for Turbo
  • Understand voice cloning best practices
  • Check out the watermarking features for responsible AI
