## Introduction

The Model API is the main interface for working with Qwen3-TTS models. The `Qwen3TTSModel` class wraps the underlying HuggingFace model and exposes three generation methods for different use cases:
- CustomVoice: Generate speech using predefined speaker IDs
- VoiceDesign: Generate speech with natural-language style instructions
- Base: Generate speech by cloning voices from reference audio
## Main Components

### Qwen3TTSModel

The primary class for model interaction. See Qwen3TTSModel for details.

### Model Loading

Load models using the `from_pretrained()` class method, which supports:
- HuggingFace Hub repository IDs
- Local model directories
- Standard HuggingFace loading options (device_map, dtype, etc.)
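As a rough sketch, loading might look like the following. The import path (`qwen_tts`), the repository ID, and the exact keyword arguments are assumptions for illustration, not verified against the package:

```python
# Sketch of model loading. The import path, repo ID, and keyword
# arguments below are assumptions, not a verified API.
def load_model(repo_id="Qwen/Qwen3-TTS", device_map="auto"):
    from qwen_tts import Qwen3TTSModel  # assumed import path
    import torch

    # from_pretrained() accepts Hub repo IDs or local directories,
    # plus standard HuggingFace loading options (device_map, dtype, ...).
    return Qwen3TTSModel.from_pretrained(
        repo_id,
        device_map=device_map,
        dtype=torch.bfloat16,
    )
```

Passing a local directory path instead of a Hub ID should work the same way, as with any HuggingFace `from_pretrained()` call.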
## Generation Methods

The Model API provides three generation methods, each designed for a specific model type:

- `generate_custom_voice`: Generate speech using predefined speaker IDs, with optional style instructions
- `generate_voice_design`: Generate speech from natural-language voice descriptions
- `generate_voice_clone`: Clone voices from reference audio samples
## Voice Cloning

For voice cloning workflows, use:

- `create_voice_clone_prompt()`: Build reusable voice prompts from reference audio
- `VoiceClonePromptItem`: Container for voice clone prompt data
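A minimal sketch of the prompt-building step, assuming the parameter names `ref_audio` and `ref_text` (hypothetical; the documented names may differ):

```python
# Sketch of building a reusable voice clone prompt.
# Parameter names (ref_audio, ref_text) are assumptions.
def build_clone_prompt(model, ref_audio_path, ref_text):
    # create_voice_clone_prompt() returns prompt data that can be cached
    # and reused across many generate_voice_clone() calls, so the
    # reference audio only needs to be processed once.
    return model.create_voice_clone_prompt(
        ref_audio=ref_audio_path,  # path to the reference recording
        ref_text=ref_text,         # transcript of the reference audio
    )
```

Building the prompt once and reusing it is the point of this API: repeated clone generations then skip re-encoding the reference audio.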
## Utility Methods

Query model capabilities.

## Quick Start
### CustomVoice Model
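A hedged sketch of CustomVoice generation. The import path, repo ID, speaker ID, and parameter names (`speaker`, `instruction`) are illustrative assumptions:

```python
# Sketch only: import path, repo ID, speaker ID, and parameter
# names are assumptions, not a verified API.
def custom_voice_demo(text="Hello from Qwen3-TTS!"):
    from qwen_tts import Qwen3TTSModel  # assumed import path

    model = Qwen3TTSModel.from_pretrained("Qwen/Qwen3-TTS-CustomVoice")  # assumed repo ID
    # Generate with a predefined speaker ID plus an optional style instruction.
    audio = model.generate_custom_voice(
        text,
        speaker="vivian",             # hypothetical predefined speaker ID
        instruction="Speak warmly.",  # optional style instruction
    )
    return audio
```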
### VoiceDesign Model
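A hedged sketch of VoiceDesign generation, where the voice is described in natural language rather than picked by ID. The import path, repo ID, and `instruction` parameter name are assumptions:

```python
# Sketch only: import path, repo ID, and parameter name are assumptions.
def voice_design_demo(text="Welcome aboard."):
    from qwen_tts import Qwen3TTSModel  # assumed import path

    model = Qwen3TTSModel.from_pretrained("Qwen/Qwen3-TTS-VoiceDesign")  # assumed repo ID
    # The voice is specified entirely by a natural-language description.
    audio = model.generate_voice_design(
        text,
        instruction="A calm, low-pitched narrator with a slight rasp.",
    )
    return audio
```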
### Base Model (Voice Cloning)
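A hedged sketch of the cloning flow with the Base model: build a prompt from reference audio, then generate with it. The import path, repo ID, and every parameter name (`ref_audio`, `ref_text`, `voice_clone_prompt`) are assumptions:

```python
# Sketch only: import path, repo ID, and all parameter names are assumptions.
def voice_clone_demo(ref_audio_path, ref_text, text="Cloned voice speaking."):
    from qwen_tts import Qwen3TTSModel  # assumed import path

    model = Qwen3TTSModel.from_pretrained("Qwen/Qwen3-TTS-Base")  # assumed repo ID
    # Step 1: build a reusable prompt from the reference recording.
    prompt = model.create_voice_clone_prompt(
        ref_audio=ref_audio_path,
        ref_text=ref_text,
    )
    # Step 2: generate new speech in the cloned voice.
    audio = model.generate_voice_clone(text, voice_clone_prompt=prompt)
    return audio
```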
## Next Steps

- Qwen3TTSModel: Class documentation and initialization methods
- Generation Methods: Complete parameter reference for all generation methods
- Voice Clone Prompt: Voice cloning workflow and prompt management
- Examples: Complete usage examples and tutorials