Overview

The qwen-tts-demo command launches an interactive Gradio web interface for testing and using Qwen3-TTS models. It supports all three model types: CustomVoice, VoiceDesign, and Base (voice cloning).

Installation

The command is automatically available after installing the qwen-tts package:
pip install qwen-tts

Basic Usage

qwen-tts-demo MODEL_PATH [OPTIONS]

Quick Start Examples

# CustomVoice model
qwen-tts-demo Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice --ip 0.0.0.0 --port 8000

# VoiceDesign model
qwen-tts-demo Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign --ip 0.0.0.0 --port 8000

# Base model (voice cloning)
qwen-tts-demo Qwen/Qwen3-TTS-12Hz-1.7B-Base --ip 0.0.0.0 --port 8000

After launching, open http://<your-ip>:8000 in your browser, or access it through port forwarding in tools like VS Code.

Command Options

Model Checkpoint

checkpoint
string
required
Model checkpoint path or Hugging Face repo ID (positional argument). Can also be specified with the -c or --checkpoint flag. Examples:
  • Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice
  • ./local/path/to/model

Model Loading Options

--device
string
default:"cuda:0"
Device for model inference. Options: cpu, cuda, cuda:0, cuda:1, etc.
--dtype
string
default:"bfloat16"
Torch dtype for loading the model. Choices: bfloat16, bf16, float16, fp16, float32, fp32
--flash-attn / --no-flash-attn
boolean
default:"true"
Enable or disable FlashAttention-2, which reduces attention memory usage. FlashAttention-2 is recommended for better performance but requires compatible hardware.
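
As an illustration of how the three loading options combine, the following sketch launches the demo on a machine without a CUDA GPU. It assumes that FlashAttention-2 is unavailable on CPU and that float32 is a reasonable dtype there; neither is stated by the option reference above, so treat both as assumptions to verify on your hardware.

```shell
# Assumed scenario: no CUDA GPU is available, so run inference on the CPU.
MODEL="Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice"
DEVICE="cpu"
DTYPE="float32"   # assumption: full precision is the safe choice on CPU

if command -v qwen-tts-demo >/dev/null 2>&1; then
  # --no-flash-attn overrides the default (FlashAttention-2 enabled).
  qwen-tts-demo "$MODEL" --device "$DEVICE" --dtype "$DTYPE" --no-flash-attn
else
  echo "qwen-tts-demo not found; install it with: pip install qwen-tts" >&2
fi
```

Expect CPU inference to be considerably slower than the default cuda:0 setup.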

Server Configuration

--ip
string
default:"0.0.0.0"
Server bind IP address for Gradio. Use 0.0.0.0 to allow external connections, or 127.0.0.1 for local only.
--port
integer
default:"8000"
Server port for Gradio
--share / --no-share
boolean
default:"false"
Create a public Gradio link for sharing. When enabled, generates a temporary public URL.
--concurrency
integer
default:"16"
Gradio queue concurrency limit
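
The server options above can be combined to keep the demo private to the local machine. The sketch below binds to the loopback address, leaves public sharing off, and lowers the queue concurrency. The concurrency value of 4 is an illustrative assumption, not a recommended setting.

```shell
# Local-only launch: bind to loopback and do not create a public Gradio link.
MODEL="Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign"
IP="127.0.0.1"
PORT="8000"
CONCURRENCY="4"   # illustrative value; the documented default is 16
URL="http://${IP}:${PORT}"

echo "Once the server is up, open ${URL}"

if command -v qwen-tts-demo >/dev/null 2>&1; then
  qwen-tts-demo "$MODEL" --ip "$IP" --port "$PORT" \
    --no-share --concurrency "$CONCURRENCY"
else
  echo "qwen-tts-demo not found; install it with: pip install qwen-tts" >&2
fi
```

With this binding, other machines can reach the demo only through port forwarding.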

HTTPS/SSL Options

--ssl-certfile
string
Path to SSL certificate file for HTTPS (optional). Required for secure connections and for browser microphone access with the Base model.
--ssl-keyfile
string
Path to SSL private key file for HTTPS (optional)
--ssl-verify / --no-ssl-verify
boolean
default:"true"
Whether to verify the SSL certificate. Use --no-ssl-verify for self-signed certificates.

Generation Parameters

--max-new-tokens
integer
Maximum new tokens for generation (optional)
--temperature
float
Sampling temperature (optional)
--top-k
integer
Top-k sampling (optional)
--top-p
float
Top-p sampling (optional)
--repetition-penalty
float
Repetition penalty (optional)
--subtalker-top-k
integer
Subtalker top-k (optional, only for tokenizer v2)
--subtalker-top-p
float
Subtalker top-p (optional, only for tokenizer v2)
--subtalker-temperature
float
Subtalker temperature (optional, only for tokenizer v2)
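
To show how the generation flags are passed together, here is a minimal sketch that sets several sampling parameters at launch. Every numeric value is a placeholder chosen for illustration; the reference above does not state defaults or recommended ranges, so tune them for your own use.

```shell
# Placeholder sampling values (assumptions, not recommended settings).
MODEL="Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice"
TEMPERATURE="0.8"
TOP_K="50"
TOP_P="0.95"
REPETITION_PENALTY="1.05"
MAX_NEW_TOKENS="2048"

if command -v qwen-tts-demo >/dev/null 2>&1; then
  qwen-tts-demo "$MODEL" \
    --temperature "$TEMPERATURE" \
    --top-k "$TOP_K" \
    --top-p "$TOP_P" \
    --repetition-penalty "$REPETITION_PENALTY" \
    --max-new-tokens "$MAX_NEW_TOKENS"
else
  echo "qwen-tts-demo not found; install it with: pip install qwen-tts" >&2
fi
```

Any flag left out keeps whatever value the model or the tool uses by default.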

HTTPS Setup for Base Model

For Base model deployments, HTTPS is recommended to avoid browser microphone permission issues, since browsers generally allow microphone access only on secure origins (HTTPS or localhost):

Generate Self-Signed Certificate

openssl req -x509 -newkey rsa:2048 \
  -keyout key.pem -out cert.pem \
  -days 365 -nodes \
  -subj "/CN=localhost"
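
Before launching, it can help to confirm that the certificate was written as expected. The sketch below repeats the generation command in a temporary directory (the CERT_DIR variable is an assumption introduced for this example) and prints the certificate's subject and expiry date with openssl x509.

```shell
# Generate the key pair in a scratch directory so existing files are untouched.
CERT_DIR="$(mktemp -d)"

openssl req -x509 -newkey rsa:2048 \
  -keyout "$CERT_DIR/key.pem" -out "$CERT_DIR/cert.pem" \
  -days 365 -nodes \
  -subj "/CN=localhost" 2>/dev/null

# Print the subject (should mention CN=localhost) and the expiry date.
openssl x509 -in "$CERT_DIR/cert.pem" -noout -subject -enddate

echo "Certificate and key written to $CERT_DIR"
```

The paths under CERT_DIR can then be passed to --ssl-certfile and --ssl-keyfile.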

Launch with HTTPS

qwen-tts-demo Qwen/Qwen3-TTS-12Hz-1.7B-Base \
  --ip 0.0.0.0 --port 8000 \
  --ssl-certfile cert.pem \
  --ssl-keyfile key.pem \
  --no-ssl-verify

Then open https://<your-ip>:8000 in your browser.

Your browser may show a security warning for self-signed certificates. This is expected. For production deployments, use a certificate issued by a trusted Certificate Authority.

Model-Specific Features

CustomVoice Demo

Provides an interface for:
  • Text input with language selection
  • Speaker selection (9 premium voices)
  • Optional instruction control (e.g., “Say it in a very angry tone”)

VoiceDesign Demo

Provides an interface for:
  • Text input with language selection
  • Natural language voice design instructions
  • Dynamic voice generation based on descriptions

Base Model Demo

Provides two tabs:
  1. Clone & Generate: Upload reference audio with transcript, then synthesize new text
  2. Save/Load Voice: Save voice clones as reusable prompt files for consistent character voices

Help Command

View all available options:
qwen-tts-demo --help