Overview

The Conditionals dataclass stores voice and style conditioning information used by Chatterbox TTS models. It encapsulates both T3 (text-to-tokens) conditionals and S3Gen (tokens-to-audio) conditionals required for voice cloning and speech generation.

Class Signature

@dataclass
class Conditionals:
    t3: T3Cond
    gen: dict

Attributes

t3
T3Cond
required
T3 model conditionals containing:
  • speaker_emb: Voice encoder speaker embedding
  • clap_emb: Optional CLAP audio-text embedding
  • cond_prompt_speech_tokens: Speech tokens from reference audio
  • cond_prompt_speech_emb: Speech embeddings from reference audio
  • emotion_adv: Exaggeration level for expressive speech (0.0 to 1.0+)
gen
dict
required
S3Gen model conditionals dictionary containing:
  • prompt_token: Reference audio tokens
  • prompt_token_len: Length of reference tokens
  • prompt_feat: Reference audio features
  • prompt_feat_len: Length of reference features
  • embedding: Voice embedding for generation
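The two-part structure above can be sketched with plain dataclasses. This is an illustrative mock, not the real chatterbox classes: field names mirror the attribute lists in this page, and the tensor-valued fields are replaced with plain Python values.

```python
from dataclasses import dataclass

# Mock of the documented shape: T3Cond carries the text-to-token
# conditionals, while `gen` is a plain dict for the S3Gen stage.
@dataclass
class MockT3Cond:
    speaker_emb: list          # stand-in for the voice encoder embedding tensor
    emotion_adv: float = 0.5   # exaggeration level (0.0 to 1.0+)

@dataclass
class MockConditionals:
    t3: MockT3Cond
    gen: dict

conds = MockConditionals(
    t3=MockT3Cond(speaker_emb=[0.1, 0.2], emotion_adv=0.7),
    gen={"prompt_token": [1, 2, 3], "prompt_token_len": 3},
)

print(conds.t3.emotion_adv)           # 0.7
print(conds.gen["prompt_token_len"])  # 3
```

Attribute access on the real object follows the same pattern: `conds.t3.<field>` for T3 conditionals and `conds.gen["<key>"]` for S3Gen conditionals.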

Methods

to()

Move conditionals to a specified device.
def to(self, device: str) -> Conditionals

Parameters

device
str
required
Target device ("cuda", "cpu", or "mps")

Returns

conditionals
Conditionals
The conditionals object with all tensors moved to the specified device

Example

# Move conditionals to GPU
conds = conds.to("cuda")

# Move conditionals to CPU
conds = conds.to("cpu")
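Conceptually, `to()` walks the object's tensor-valued fields and calls `tensor.to(device)` on each. The sketch below illustrates that pattern with a fake tensor class; it is an assumption about the shape of the implementation, not the actual chatterbox source.

```python
from dataclasses import dataclass

# Stand-in for a torch.Tensor: all we need is a `.to(device)` method.
class FakeTensor:
    def __init__(self, device="cpu"):
        self.device = device
    def to(self, device):
        return FakeTensor(device)

@dataclass
class CondsSketch:
    t3: dict   # simplified: T3 conditionals as a dict of tensors
    gen: dict  # S3Gen conditionals: mix of tensors and plain values

    def to(self, device: str) -> "CondsSketch":
        # Move every tensor field; leave non-tensor values untouched.
        self.t3 = {k: v.to(device) for k, v in self.t3.items()}
        self.gen = {k: v.to(device) if isinstance(v, FakeTensor) else v
                    for k, v in self.gen.items()}
        return self

conds = CondsSketch(t3={"speaker_emb": FakeTensor()},
                    gen={"embedding": FakeTensor(), "prompt_token_len": 3})
conds = conds.to("cuda")
print(conds.t3["speaker_emb"].device)  # cuda
```

Because the method returns the object itself, calls can be chained or assigned back, as in the examples above.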

save()

Save conditionals to a file for later reuse.
def save(self, fpath: Path)

Parameters

fpath
Path
required
Path where the conditionals will be saved as a .pt file

Example

from pathlib import Path

# Prepare conditionals from audio
model.prepare_conditionals("voice_sample.wav")

# Save for reuse
model.conds.save(Path("my_voice_conds.pt"))

load()

Load conditionals from a saved file.
@classmethod
def load(cls, fpath: Path, map_location: str = "cpu") -> Conditionals

Parameters

fpath
Path
required
Path to the saved conditionals .pt file
map_location
str
default:"cpu"
Device to load the conditionals onto ("cuda", "cpu", or "mps")

Returns

conditionals
Conditionals
Loaded Conditionals object

Example

from pathlib import Path
from chatterbox.tts_turbo import Conditionals

# Load saved conditionals
conds = Conditionals.load(
    Path("my_voice_conds.pt"),
    map_location="cuda"
)

# Use with a model
model.conds = conds
audio = model.generate("Hello world!")
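The `save()`/`load()` pair is a straightforward serialization round trip. The sketch below demonstrates the pattern with `pickle` as a stand-in for `torch.save`/`torch.load` (the real methods write a `.pt` file and support `map_location`); the dataclass here is a simplified mock, not the real `Conditionals`.

```python
import pickle
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path

@dataclass
class CondsSketch:
    t3: dict
    gen: dict

    def save(self, fpath: Path):
        # Real implementation: torch.save(...) to a .pt file.
        fpath.write_bytes(pickle.dumps(asdict(self)))

    @classmethod
    def load(cls, fpath: Path) -> "CondsSketch":
        # Real implementation: torch.load(fpath, map_location=...).
        return cls(**pickle.loads(fpath.read_bytes()))

conds = CondsSketch(t3={"emotion_adv": 0.5}, gen={"prompt_token_len": 3})
with tempfile.TemporaryDirectory() as d:
    path = Path(d) / "conds.pt"
    conds.save(path)
    restored = CondsSketch.load(path)

print(restored == conds)  # True
```

The round trip is lossless: a loaded object is field-for-field equal to the one that was saved, which is why saved conditionals can be reused across sessions without re-processing the reference audio.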

Usage Examples

Save and Reuse Voice Conditionals

from chatterbox import ChatterboxTurboTTS
from pathlib import Path
import torchaudio

device = "cuda"
model = ChatterboxTurboTTS.from_pretrained(device)

# Prepare and save conditionals
model.prepare_conditionals(
    wav_fpath="celebrity_voice.wav",
    exaggeration=0.5
)
model.conds.save(Path("celebrity_conds.pt"))

# Later, load and reuse without re-processing audio
from chatterbox.tts_turbo import Conditionals

model = ChatterboxTurboTTS.from_pretrained(device)
model.conds = Conditionals.load(Path("celebrity_conds.pt"), map_location=device)

# Generate multiple outputs with the saved voice
for i, text in enumerate(["Hello there!", "How are you?", "Nice to meet you."]):
    audio = model.generate(text)
    torchaudio.save(f"output_{i}.wav", audio, model.sr)

Transfer Conditionals Between Models

from chatterbox import ChatterboxTTS, ChatterboxTurboTTS
from chatterbox.tts import Conditionals

device = "cuda"

# Prepare conditionals with standard model
standard_model = ChatterboxTTS.from_pretrained(device)
standard_model.prepare_conditionals("voice.wav", exaggeration=0.7)

# Save conditionals
standard_model.conds.save("voice_conds.pt")

# Load into turbo model
turbo_model = ChatterboxTurboTTS.from_pretrained(device)
turbo_model.conds = Conditionals.load("voice_conds.pt", map_location=device)

# Generate with turbo model using standard model's voice
audio = turbo_model.generate("This uses the same voice!")

Move Conditionals Between Devices

from chatterbox import ChatterboxMultilingualTTS
import torch

# Prepare on CPU
model = ChatterboxMultilingualTTS.from_pretrained("cpu")
model.prepare_conditionals("voice.wav")

# Save conditionals
model.conds.save("cpu_conds.pt")

# Load on GPU
if torch.cuda.is_available():
    from chatterbox.mtl_tts import Conditionals
    
    gpu_model = ChatterboxMultilingualTTS.from_pretrained("cuda")
    gpu_model.conds = Conditionals.load(
        "cpu_conds.pt",
        map_location="cuda"
    )
    
    audio = gpu_model.generate("Now running on GPU!", language_id="en")

Notes

  • Conditionals are automatically created when you call prepare_conditionals() on a TTS model
  • Saved conditionals are portable and can be shared or reused across sessions
  • The same conditionals can be used with different models in the Chatterbox family (TTS, TurboTTS, MultilingualTTS)
  • Moving conditionals to the model's device is necessary before inference, so that their tensors live on the same device as the model's weights
  • Conditionals files are typically small (a few MB) compared to model checkpoints
  • The emotion_adv parameter in T3Cond controls voice exaggeration and can be adjusted per generation
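Since `T3Cond` is a dataclass field, one way to vary exaggeration per generation is `dataclasses.replace`, which copies the conditionals with a new `emotion_adv` without re-running audio processing. The sketch below uses a simplified mock of `T3Cond` (field names follow this page; the real class may differ).

```python
from dataclasses import dataclass, replace

# Simplified mock of T3Cond, assuming it is a dataclass as documented.
@dataclass
class T3CondSketch:
    speaker_emb: list
    emotion_adv: float

base = T3CondSketch(speaker_emb=[0.1, 0.2], emotion_adv=0.5)

# Produce a more expressive variant of the same voice.
expressive = replace(base, emotion_adv=0.9)

print(base.emotion_adv, expressive.emotion_adv)  # 0.5 0.9
```

The speaker embedding is shared between both variants, so only the exaggeration level changes between generations.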