Supported Models

This page lists all supported model types for Speech-to-Text (STT) and Text-to-Speech (TTS), including required files and download links.

Speech-to-Text (STT) Models

For real-time (streaming) recognition from a microphone or audio stream, use a streaming-capable model type: transducer, paraformer, zipformer2_ctc, nemo_ctc, or tone_ctc. See the Streaming STT documentation for details.
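
The list of streaming-capable types above can be encoded as a small guard that runs before a streaming session is opened. This is a minimal sketch: STREAMING_STT_TYPES and isStreamingSttType are hypothetical names introduced for illustration and are not part of the SDK.

```typescript
// Streaming-capable STT model types, copied from the sentence above.
const STREAMING_STT_TYPES = [
  'transducer',
  'paraformer',
  'zipformer2_ctc',
  'nemo_ctc',
  'tone_ctc',
] as const;

type StreamingSttType = (typeof STREAMING_STT_TYPES)[number];

// Hypothetical helper (not an SDK function): narrows a plain string
// to one of the streaming-capable model types.
function isStreamingSttType(modelType: string): modelType is StreamingSttType {
  return (STREAMING_STT_TYPES as readonly string[]).indexOf(modelType) !== -1;
}
```

A caller could use the guard to reject, for example, a whisper model before attempting real-time recognition.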

Zipformer/Transducer

Model Type: transducer
Required Files:

encoder.onnx
decoder.onnx
joiner.onnx
tokens.txt

Download: Offline Transducer Models

Paraformer

Model Type: paraformer
Required Files:

model.onnx (or model.int8.onnx)
tokens.txt

Download: Offline Paraformer Models

NeMo CTC

Model Type: nemo_ctc
Required Files:

model.onnx (or model.int8.onnx)
tokens.txt

Download: NeMo CTC Models

Whisper

Model Type: whisper
Required Files:

encoder.onnx
decoder.onnx
tokens.txt

Download: Whisper Models

WeNet CTC

Model Type: wenet_ctc
Required Files:

model.onnx (or model.int8.onnx)
tokens.txt

Download: WeNet CTC Models

SenseVoice

Model Type: sense_voice
Required Files:

model.onnx (or model.int8.onnx)
tokens.txt

Download: SenseVoice Models

FunASR Nano

Model Type: funasr_nano
Required Files:

encoder_adaptor.onnx
llm.onnx
embedding.onnx
tokenizer/ directory

Download: FunASR Nano Models

Tone CTC (t-one)

Model Type: tone_ctc
Required Files:

model.onnx
tokens.txt

Note: The folder name usually contains t-one, t_one, or tone.
Download: Online CTC Models
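
The required-file lists for the STT model types in this section can be collected into a lookup table and used to check a model folder before initialization. The sketch below is illustrative only: STT_REQUIRED_FILES and missingSttFiles are hypothetical names rather than SDK functions, and the caller is assumed to supply the folder's file names. zipformer2_ctc is omitted because this page does not list its files.

```typescript
// Required files per STT model type, transcribed from the lists above.
// Each inner array holds acceptable alternatives for one required entry,
// e.g. model.onnx or model.int8.onnx.
const STT_REQUIRED_FILES: Record<string, string[][]> = {
  transducer: [['encoder.onnx'], ['decoder.onnx'], ['joiner.onnx'], ['tokens.txt']],
  paraformer: [['model.onnx', 'model.int8.onnx'], ['tokens.txt']],
  nemo_ctc: [['model.onnx', 'model.int8.onnx'], ['tokens.txt']],
  whisper: [['encoder.onnx'], ['decoder.onnx'], ['tokens.txt']],
  wenet_ctc: [['model.onnx', 'model.int8.onnx'], ['tokens.txt']],
  sense_voice: [['model.onnx', 'model.int8.onnx'], ['tokens.txt']],
  funasr_nano: [['encoder_adaptor.onnx'], ['llm.onnx'], ['embedding.onnx'], ['tokenizer/']],
  tone_ctc: [['model.onnx'], ['tokens.txt']],
};

// Hypothetical helper: returns the required entries that are absent
// from `present` (the names found in the model folder).
function missingSttFiles(modelType: string, present: string[]): string[] {
  const required = STT_REQUIRED_FILES[modelType];
  if (required === undefined) {
    throw new Error(`Unknown STT model type: ${modelType}`);
  }
  return required
    .filter((alternatives) => !alternatives.some((name) => present.indexOf(name) !== -1))
    .map((alternatives) => alternatives.join(' or '));
}
```

An empty result means every required entry was found; a non-empty result names what still has to be downloaded or bundled.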

Text-to-Speech (TTS) Models

For streaming TTS (incremental generation, low latency), use createStreamingTTS() with a supported model type. See the Streaming TTS documentation for details.

VITS

Model Type: vits
Description: Fast, high-quality TTS. Includes Piper, Coqui, MeloTTS, and MMS variants.
Required Files:

model.onnx
tokens.txt

Download: TTS Models Release

Matcha

Model Type: matcha
Description: High-quality acoustic model + vocoder
Required Files:

acoustic_model.onnx
vocoder.onnx
tokens.txt

Download: Matcha Models

Kokoro

Model Type: kokoro
Description: Multi-speaker, multi-language
Required Files:

model.onnx
voices.bin
tokens.txt
espeak-ng-data/ directory

Download: TTS Models Release

KittenTTS

Model Type: kitten
Description: Lightweight, multi-speaker
Required Files:

model.onnx
voices.bin
tokens.txt
espeak-ng-data/ directory

Download: TTS Models Release

Zipvoice

Model Type: zipvoice
Description: Voice cloning capable
Required Files:

encoder.onnx
decoder.onnx
vocoder.onnx
tokens.txt

Download: Zipvoice Models

Pocket

Model Type: pocket
Description: Flow-matching TTS
Required Files:

lm_flow.onnx
lm_main.onnx
encoder.onnx
decoder.onnx
text_conditioner.onnx
vocab.json
token_scores.json

Download: TTS Models Release
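
Some TTS model types in this section require a directory (espeak-ng-data/) in addition to single files, which matters when deciding what to bundle. The sketch below records the file lists above in a table and finds those types. TTS_MODELS and ttsTypesNeedingDirectory are hypothetical names used for illustration, not SDK APIs, and a trailing slash is assumed to mark a directory entry.

```typescript
interface TtsModelInfo {
  description: string;
  requiredFiles: string[]; // a trailing '/' marks a directory
}

// Transcribed from the TTS sections above.
const TTS_MODELS: Record<string, TtsModelInfo> = {
  vits: {
    description: 'Fast, high-quality TTS (Piper, Coqui, MeloTTS, MMS variants)',
    requiredFiles: ['model.onnx', 'tokens.txt'],
  },
  matcha: {
    description: 'High-quality acoustic model + vocoder',
    requiredFiles: ['acoustic_model.onnx', 'vocoder.onnx', 'tokens.txt'],
  },
  kokoro: {
    description: 'Multi-speaker, multi-language',
    requiredFiles: ['model.onnx', 'voices.bin', 'tokens.txt', 'espeak-ng-data/'],
  },
  kitten: {
    description: 'Lightweight, multi-speaker',
    requiredFiles: ['model.onnx', 'voices.bin', 'tokens.txt', 'espeak-ng-data/'],
  },
  zipvoice: {
    description: 'Voice cloning capable',
    requiredFiles: ['encoder.onnx', 'decoder.onnx', 'vocoder.onnx', 'tokens.txt'],
  },
  pocket: {
    description: 'Flow-matching TTS',
    requiredFiles: [
      'lm_flow.onnx', 'lm_main.onnx', 'encoder.onnx', 'decoder.onnx',
      'text_conditioner.onnx', 'vocab.json', 'token_scores.json',
    ],
  },
};

// Hypothetical helper: lists model types whose required entries include a directory.
function ttsTypesNeedingDirectory(): string[] {
  return Object.keys(TTS_MODELS).filter((type) => {
    const info = TTS_MODELS[type];
    return info !== undefined && info.requiredFiles.some((f) => f.slice(-1) === '/');
  });
}
```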

Model Quantization

The SDK detects quantized (int8) models automatically and prefers them when available. When both model.onnx and model.int8.onnx exist in the same folder, the preferInt8 option passed at initialization determines which one the library loads.
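
The selection rule described above can be sketched as a small function. The SDK's actual logic is not shown on this page, so pickModelFile is a hypothetical helper that only illustrates the described behavior, under the assumption that a folder with a single variant uses that variant regardless of preferInt8.

```typescript
// Hypothetical helper (not an SDK function): decides which model file to load
// given the file names found in a model folder.
function pickModelFile(files: string[], preferInt8: boolean): string | undefined {
  const hasFull = files.indexOf('model.onnx') !== -1;
  const hasInt8 = files.indexOf('model.int8.onnx') !== -1;

  // Both variants present: the preferInt8 flag decides.
  if (hasFull && hasInt8) {
    return preferInt8 ? 'model.int8.onnx' : 'model.onnx';
  }
  // Only one variant present: use it (assumption, see lead-in).
  if (hasInt8) return 'model.int8.onnx';
  if (hasFull) return 'model.onnx';
  // Neither variant present.
  return undefined;
}
```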

Auto-Detection

The library detects model types from the files present in each model directory. Folder and file names do not need to follow any fixed convention. To use auto-detection:

const stt = await createSTT({
  modelPath: { type: 'asset', path: 'models/your-model-folder' },
  modelType: 'auto', // Auto-detect from files
});
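
To keep 'auto' as the default while still allowing an explicit type, the options object can be built by a typed helper. This is a sketch under assumptions: the types and sttOptionsForAsset are hypothetical and only mirror the shape of the call above, only the 'asset' path type appears on this page, and passing an explicit model type string to modelType is assumed from the "Model Type" values listed earlier.

```typescript
// Hypothetical types mirroring the createSTT() options shown above.
type AssetModelPath = { type: 'asset'; path: string };

// 'auto' plus the STT model types named on this page.
type SttModelTypeOption =
  | 'auto'
  | 'transducer'
  | 'paraformer'
  | 'zipformer2_ctc'
  | 'nemo_ctc'
  | 'whisper'
  | 'wenet_ctc'
  | 'sense_voice'
  | 'funasr_nano'
  | 'tone_ctc';

interface SttInitOptions {
  modelPath: AssetModelPath;
  modelType: SttModelTypeOption;
}

// Hypothetical helper: uses auto-detection unless a type is given explicitly.
function sttOptionsForAsset(
  folder: string,
  modelType: SttModelTypeOption = 'auto',
): SttInitOptions {
  return { modelPath: { type: 'asset', path: folder }, modelType };
}
```

The returned object has the same shape as the argument passed to createSTT in the snippet above.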

See Also

Model Setup: Learn how to bundle, download, and manage models
Download Manager: Download models in-app with progress tracking
STT API: Speech-to-Text API reference
TTS API: Text-to-Speech API reference
