Class

Qwen3TTSModel is a HuggingFace-style wrapper for Qwen3-TTS models that provides:

- from_pretrained() initialization via AutoModel/AutoProcessor
- Generation APIs for CustomVoice, VoiceDesign, and Base models
- Consistent output format: (wavs: List[np.ndarray], sample_rate: int)
- Language and speaker validation

Source: qwen_tts/inference/qwen3_tts_model.py:54-878
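Every generation method returns the same (wavs, sample_rate) pair, so downstream handling is uniform. A minimal sketch with placeholder data (the array below is a silent stand-in, not real model output):

```python
import numpy as np

# Stand-in for the documented return format:
# (wavs: List[np.ndarray], sample_rate: int).
# A one-second silent waveform serves as placeholder model output.
wavs, sample_rate = [np.zeros(24000, dtype=np.float32)], 24000

# Duration in seconds of each returned waveform.
durations = [len(w) / sample_rate for w in wavs]
print(durations)  # → [1.0]
```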
Constructor

Do not construct instances directly; use from_pretrained() instead. A constructed instance holds:

- The underlying HuggingFace TTS model instance
- The model's text processor
- Default generation parameters loaded from generate_config.json

Class Methods
from_pretrained

Loads a model in the standard HuggingFace from_pretrained style.

This method:

- Loads the config via AutoConfig (registers the qwen3_tts model type)
- Loads the model via AutoModel.from_pretrained(...), forwarding kwargs unchanged
- Loads the processor via AutoProcessor.from_pretrained(...)
- Loads the optional generate_config.json from the model directory if present
Source: qwen_tts/inference/qwen3_tts_model.py:82-121
Parameters

HuggingFace repository ID or local directory path containing the model. Examples:

- "Qwen3-TTS-CustomVoice-2B"
- "./local/model/path"

Additional keyword arguments are forwarded as-is into AutoModel.from_pretrained(...). Common examples:

- device_map="cuda:0" - Load the model on a specific GPU
- torch_dtype=torch.bfloat16 - Use bfloat16 precision
- attn_implementation="flash_attention_2" - Use FlashAttention 2
- trust_remote_code=True - Trust remote code (if required)
Returns
Wrapper instance containing the loaded model, processor, and generation defaults.
Example
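A sketch of a typical loading call, combining the kwargs listed above. The repository ID is the example from the Parameters section, and the top-level import path is an assumption; adjust both to your installation:

```python
import torch
from qwen_tts import Qwen3TTSModel  # import path is an assumption

model = Qwen3TTSModel.from_pretrained(
    "Qwen3-TTS-CustomVoice-2B",   # HF repo ID or a local directory path
    device_map="cuda:0",          # load on a specific GPU
    torch_dtype=torch.bfloat16,   # bfloat16 precision
)
```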
Instance Methods
get_supported_languages
Returns the supported languages, called as model.get_supported_languages(). If the underlying model does not expose language constraints (returns None), this method also returns None.
Source: qwen_tts/inference/qwen3_tts_model.py:861-877
Returns
- A sorted list of supported language names (lowercased), if available
- None if the model does not provide supported languages
Example
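A sketch of the None check described above. Since a real model load is out of scope here, a stub with placeholder values stands in for a loaded Qwen3TTSModel:

```python
class _StubModel:
    """Stands in for a loaded Qwen3TTSModel; the values are placeholders."""
    def get_supported_languages(self):
        return ["chinese", "english"]

model = _StubModel()

languages = model.get_supported_languages()
if languages is None:
    # Some models expose no language constraints at all.
    print("No language constraints exposed.")
else:
    print("Supported languages:", ", ".join(languages))
```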
get_supported_speakers
Returns the supported speakers, called as model.get_supported_speakers(). If the underlying model does not expose speaker constraints (returns None), this method also returns None.
Note: This is primarily used with CustomVoice models.
Source: qwen_tts/inference/qwen3_tts_model.py:842-858
Returns
- A sorted list of supported speaker names (lowercased), if available
- None if the model does not provide supported speakers
Example
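The same None-handling pattern applies to speakers. Here a stub stands in for a loaded CustomVoice model, and the speaker names are placeholders, not the real roster:

```python
class _StubModel:
    """Stands in for a loaded CustomVoice Qwen3TTSModel; names are placeholders."""
    def get_supported_speakers(self):
        return ["ryan", "vivian"]

model = _StubModel()

speakers = model.get_supported_speakers()
requested = "vivian"
if speakers is not None and requested not in speakers:
    # Validate a requested speaker against the model's roster, if it has one.
    raise ValueError(f"Speaker {requested!r} not in {speakers}")
print(f"{requested} is available")
```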
Generation Methods
The following generation methods are available on Qwen3TTSModel instances:

- generate_custom_voice() - Generate speech using predefined speaker IDs
- generate_voice_design() - Generate speech with natural-language instructions
- generate_voice_clone() - Clone voices from reference audio
- create_voice_clone_prompt() - Build reusable voice prompts
Attributes

- The underlying HuggingFace conditional generation model
- The text processor for tokenization
- Default generation parameters loaded from generate_config.json
- The device where the model is loaded (e.g., cuda:0, cpu)

See Also
- Generation Methods - Detailed parameter documentation for all generation methods
- Voice Clone Prompt - Voice cloning workflow and prompt management
- Quickstart Guide - Getting started with Qwen3-TTS