Speech-to-Text (STT) Models
For real-time (streaming) recognition from a microphone or audio stream, use a streaming-capable model type: transducer, paraformer, zipformer2_ctc, nemo_ctc, or tone_ctc. See the Streaming STT documentation for details.
Zipformer/Transducer
Model Type: transducer
Required Files: encoder.onnx, decoder.onnx, joiner.onnx, tokens.txt
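As an illustration, a pre-flight check for a transducer model directory might look like the sketch below. It is not part of the SDK: `missingFiles` is a hypothetical helper, and obtaining the directory listing (for example through a file-system module) is left to the app.

```typescript
// Files the transducer model type requires, as listed above.
const TRANSDUCER_FILES: readonly string[] = [
  "encoder.onnx",
  "decoder.onnx",
  "joiner.onnx",
  "tokens.txt",
];

// Hypothetical helper (not an SDK function): returns the required files
// that do not appear in a directory listing.
function missingFiles(
  required: readonly string[],
  present: readonly string[],
): string[] {
  return required.filter((name) => present.indexOf(name) === -1);
}

// Example: a directory that lacks the joiner.
const missing = missingFiles(TRANSDUCER_FILES, [
  "encoder.onnx",
  "decoder.onnx",
  "tokens.txt",
]);
console.log(missing); // ["joiner.onnx"]
```

Running a check like this before initialization makes it possible to report an incomplete download by file name instead of failing later at load time.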
Paraformer
Model Type: paraformer
Required Files: model.onnx (or model.int8.onnx), tokens.txt
NeMo CTC
Model Type: nemo_ctc
Required Files: model.onnx (or model.int8.onnx), tokens.txt
Whisper
WeNet CTC
Model Type: wenet_ctc
Required Files: model.onnx (or model.int8.onnx), tokens.txt
SenseVoice
Model Type: sense_voice
Required Files: model.onnx (or model.int8.onnx), tokens.txt
FunASR Nano
Model Type: funasr_nano
Required Files: encoder_adaptor.onnx, llm.onnx, embedding.onnx, tokenizer/ directory
Tone CTC (t-one)
Model Type: tone_ctc
Required Files: model.onnx, tokens.txt
t-one, t_one or tone
Download: Online CTC Models
Text-to-Speech (TTS) Models
For streaming TTS (incremental generation, low latency), use createStreamingTTS() with a supported model type. See the Streaming TTS documentation for details.
VITS
Model Type: vits
Description: Fast, high-quality TTS. Includes Piper, Coqui, MeloTTS, and MMS variants.
Required Files: model.onnx, tokens.txt
Matcha
Model Type: matcha
Description: High-quality acoustic model + vocoder
Required Files: acoustic_model.onnx, vocoder.onnx, tokens.txt
Kokoro
Model Type: kokoro
Description: Multi-speaker, multi-language
Required Files: model.onnx, voices.bin, tokens.txt, espeak-ng-data/ directory
KittenTTS
Model Type: kitten
Description: Lightweight, multi-speaker
Required Files: model.onnx, voices.bin, tokens.txt, espeak-ng-data/ directory
Zipvoice
Model Type: zipvoice
Description: Voice cloning capable
Required Files: encoder.onnx, decoder.onnx, vocoder.onnx, tokens.txt
Pocket
Model Type: pocket
Description: Flow-matching TTS
Required Files: lm_flow.onnx, lm_main.onnx, encoder.onnx, decoder.onnx, text_conditioner.onnx, vocab.json, token_scores.json
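The required-file lists for the TTS model types above can be collected into a lookup table, which is convenient when an app validates a model directory before loading it. The following is a minimal sketch, not SDK code: `TTS_REQUIRED_FILES` and `missingTtsFiles` are hypothetical names, and the convention of writing directories with a trailing `/` in the listing is an assumption made for this example.

```typescript
// Required files per TTS model type, copied from the entries above.
// Assumption: directories are written with a trailing "/" in both the
// table and the caller's listing.
const TTS_REQUIRED_FILES: Record<string, readonly string[]> = {
  vits: ["model.onnx", "tokens.txt"],
  matcha: ["acoustic_model.onnx", "vocoder.onnx", "tokens.txt"],
  kokoro: ["model.onnx", "voices.bin", "tokens.txt", "espeak-ng-data/"],
  kitten: ["model.onnx", "voices.bin", "tokens.txt", "espeak-ng-data/"],
  zipvoice: ["encoder.onnx", "decoder.onnx", "vocoder.onnx", "tokens.txt"],
  pocket: [
    "lm_flow.onnx",
    "lm_main.onnx",
    "encoder.onnx",
    "decoder.onnx",
    "text_conditioner.onnx",
    "vocab.json",
    "token_scores.json",
  ],
};

// Hypothetical helper: lists the required entries that are absent from
// `present` for the given model type; throws on an unknown type.
function missingTtsFiles(
  modelType: string,
  present: readonly string[],
): string[] {
  const required = TTS_REQUIRED_FILES[modelType];
  if (required === undefined) {
    throw new Error(`Unknown TTS model type: ${modelType}`);
  }
  return required.filter((name) => present.indexOf(name) === -1);
}
```

For example, `missingTtsFiles("matcha", ["acoustic_model.onnx", "tokens.txt"])` returns `["vocoder.onnx"]`.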
Model Quantization
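When a model directory holds both model.onnx and model.int8.onnx, one of the two has to be selected. The sketch below shows one way such a choice could be derived from a preferInt8 flag. `pickModelFile` is a hypothetical helper, not an SDK function, and the fallback to whichever variant exists when only one is present is an assumption; the SDK's actual resolution logic may differ.

```typescript
// Hypothetical helper: choose between model.onnx and model.int8.onnx.
// `present` is the list of file names found in the model directory.
function pickModelFile(
  present: readonly string[],
  preferInt8: boolean,
): string | undefined {
  const hasFull = present.indexOf("model.onnx") !== -1;
  const hasInt8 = present.indexOf("model.int8.onnx") !== -1;

  if (hasFull && hasInt8) {
    // Both variants exist: the flag decides.
    return preferInt8 ? "model.int8.onnx" : "model.onnx";
  }
  // Assumption: with only one variant present, use it regardless of the flag.
  if (hasInt8) return "model.int8.onnx";
  if (hasFull) return "model.onnx";
  return undefined; // neither variant is present
}
```

With both files present, `pickModelFile(files, true)` yields `"model.int8.onnx"` and `pickModelFile(files, false)` yields `"model.onnx"`.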
The SDK automatically detects and prefers quantized (int8) models when available. For example, if both model.onnx and model.int8.onnx exist, the library chooses according to the preferInt8 option in init.
Auto-Detection
The library detects model types from the files present in each model directory. Folder and file names do not need to follow any fixed convention. To use auto-detection:
See Also
Model Setup
Learn how to bundle, download, and manage models
Download Manager
Download models in-app with progress tracking
STT API
Speech-to-Text API reference
TTS API
Text-to-Speech API reference