
Other STT Models

This page covers additional STT model types supported by react-native-sherpa-onnx, including specialized and emerging architectures.

Overview

  • WeNet CTC – Compact CTC models from the WeNet framework
  • SenseVoice – Multilingual with emotion detection and punctuation
  • FunASR Nano – LLM-based ASR with prompt customization
  • Moonshine – Modern streaming-capable lightweight ASR
  • Fire Red ASR – Encoder-decoder ASR models
  • Dolphin – Single-model CTC for compact deployment
  • Canary – NeMo multilingual model
  • Omnilingual – Wide language coverage CTC model
  • MedASR – Medical ASR for healthcare applications
  • Telespeech CTC – Telephony-optimized CTC model
  • Tone CTC – Ultra-lightweight streaming CTC (t-one)

WeNet CTC

modelType: 'wenet_ctc'

Description

CTC models from the WeNet framework, designed for compact deployment.

Characteristics

  • Streaming: ❌ No (offline only)
  • Speed: ⭐⭐⭐⭐⭐ Very Fast
  • Size: Small (compact models)
  • Languages: Limited (depends on model variant)

Configuration

import { createSTT } from 'react-native-sherpa-onnx/stt';

const stt = await createSTT({
  modelPath: { type: 'asset', path: 'models/sherpa-onnx-wenet-chinese' },
  modelType: 'wenet_ctc',
  preferInt8: true,
});

Download

WeNet CTC Models

Model Detection

  • Folder name should contain wenet
  • Files: model.onnx, tokens.txt

SenseVoice

modelType: 'sense_voice'

Description

Multilingual model with emotion detection and automatic punctuation. Well suited to applications that need the speaker's emotion alongside the transcript.

Characteristics

  • Streaming: ❌ No
  • Accuracy: ⭐⭐⭐⭐
  • Languages: Chinese, English, Cantonese, Japanese, Korean
  • Special: Emotion labels + punctuation

Configuration

import { createSTT, getSenseVoiceLanguages } from 'react-native-sherpa-onnx/stt';

const stt = await createSTT({
  modelPath: { type: 'asset', path: 'models/sherpa-onnx-sense-voice-zh-en' },
  modelType: 'sense_voice',
  modelOptions: {
    senseVoice: {
      language: 'auto', // 'auto', 'zh', 'en', 'yue', 'ja', 'ko'
      useItn: true,     // Inverse text normalization
    }
  },
});

const result = await stt.transcribeFile('/path/to/audio.wav');
console.log('Text:', result.text);
console.log('Emotion:', result.emotion); // e.g. 'happy', 'neutral'

Language Helpers

const languages = getSenseVoiceLanguages();
// [{ id: 'auto', name: 'Auto' }, { id: 'zh', name: 'Chinese' }, ...]

Download

SenseVoice Models

Model Detection

  • Folder name should contain sense or sensevoice

FunASR Nano

modelType: 'funasr_nano'

Description

Lightweight LLM-based ASR with customizable system and user prompts. It also exposes decoding options such as hotwords, a maximum number of new tokens, temperature, top-p, and a random seed.

Characteristics

  • Streaming: ❌ No
  • Special: LLM-based with prompt engineering
  • Languages: Chinese, English, Japanese (depends on variant)

Configuration

import { createSTT, getFunasrNanoLanguages } from 'react-native-sherpa-onnx/stt';

const stt = await createSTT({
  modelPath: { type: 'asset', path: 'models/sherpa-onnx-funasr-nano-zh' },
  modelType: 'funasr_nano',
  modelOptions: {
    funasrNano: {
      systemPrompt: 'You are a speech recognition system.',
      userPrompt: 'Transcribe the following audio.',
      language: '中文',        // Label written in Chinese: '中文' (Chinese), '英文' (English), '日文' (Japanese)
      itn: true,              // Inverse text normalization
      hotwords: 'React Native:2.5,Sherpa ONNX:3.0',
      maxNewTokens: 512,
      temperature: 0.8,
      topP: 0.95,
      seed: 42,
    }
  },
});
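
The `hotwords` value in the configuration above is a comma-separated list of `phrase:boost` pairs. The following is a minimal sketch of building that string from structured data, assuming the format is exactly what the example shows. `HotwordEntry` and `formatHotwords` are hypothetical helper names for illustration and are not part of react-native-sherpa-onnx.

```typescript
// Hypothetical helper types and functions; not part of react-native-sherpa-onnx.
interface HotwordEntry {
  phrase: string;
  boost: number;
}

// Builds a string such as 'React Native:2.5,Sherpa ONNX:3.0'.
// Assumes the comma-separated 'phrase:boost' format shown in the example above.
function formatHotwords(entries: HotwordEntry[]): string {
  return entries
    .map(({ phrase, boost }) => {
      const trimmed = phrase.trim();
      // Reject empty phrases and separators that would make the result ambiguous.
      if (trimmed.length === 0 || /[:,]/.test(trimmed)) {
        throw new Error(`Invalid hotword phrase: "${phrase}"`);
      }
      if (!isFinite(boost)) {
        throw new Error(`Invalid boost for "${phrase}": ${boost}`);
      }
      // One decimal place, to match the style of the example string.
      return `${trimmed}:${boost.toFixed(1)}`;
    })
    .join(',');
}

const hotwordString = formatHotwords([
  { phrase: 'React Native', boost: 2.5 },
  { phrase: 'Sherpa ONNX', boost: 3.0 },
]);
```

The resulting string could then be passed as `hotwords` inside `modelOptions.funasrNano`. Keeping the list structured makes it easier to load domain terms from app settings.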

Language Helpers

const languages = getFunasrNanoLanguages();
// [{ id: '中文', name: 'Chinese' }, { id: '英文', name: 'English' }, ...]

Download

FunASR Nano Models

Model Detection

  • Folder name should contain funasr or funasr-nano
  • Files: encoder_adaptor, llm, embedding, tokenizer directory

Moonshine

modelType: 'moonshine' (v1) or 'moonshine_v2' (v2)

Description

Modern streaming-capable ASR with two architecture versions:

  • Moonshine v1: Four-part architecture (preprocess, encode, uncached decode, cached decode)
  • Moonshine v2: Two-part architecture (encoder + merged decoder)

Characteristics

  • Streaming: ✅ Yes (both v1 and v2)
  • Speed: ⭐⭐⭐⭐
  • Languages: Limited (check model variant)

Configuration

import { createStreamingSTT } from 'react-native-sherpa-onnx/stt';

// Moonshine v2 (recommended)
const engine = await createStreamingSTT({
  modelPath: { type: 'asset', path: 'models/sherpa-onnx-moonshine-v2' },
  modelType: 'auto', // Selects v2 when both v1 and v2 files are present; 'moonshine_v2' forces v2
});

// Moonshine v1
const engineV1 = await createStreamingSTT({
  modelPath: { type: 'asset', path: 'models/sherpa-onnx-moonshine-v1' },
  modelType: 'moonshine',
});

Download

Moonshine Models

Model Detection

  • Folder name should contain moonshine
  • V1: preprocess.onnx, encode.onnx, uncached_decode.onnx, cached_decode.onnx
  • V2: encoder.onnx or encoder.ort, merged decoder
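
The file lists above are enough to tell the two versions apart before creating an engine. The following is a minimal sketch, assuming you already have the file names found in the model folder (for example from a file-system listing). `detectMoonshineVersion` is a hypothetical helper, not the library's own detection code. It checks only the file names documented here, so other naming variants are not handled. When both layouts are present it returns `'moonshine_v2'`, mirroring the behavior described for `'auto'`.

```typescript
// Hypothetical helper; not part of react-native-sherpa-onnx.
type MoonshineModelType = 'moonshine' | 'moonshine_v2';

// File names taken from the Model Detection list above.
const MOONSHINE_V1_FILES: string[] = [
  'preprocess.onnx',
  'encode.onnx',
  'uncached_decode.onnx',
  'cached_decode.onnx',
];
const MOONSHINE_V2_ENCODERS: string[] = ['encoder.onnx', 'encoder.ort'];

function detectMoonshineVersion(fileNames: string[]): MoonshineModelType | null {
  const has = (file: string): boolean => fileNames.indexOf(file) !== -1;

  // v2 needs at least one of the documented encoder files.
  const hasV2 = MOONSHINE_V2_ENCODERS.some(has);
  // v1 needs all four of its parts.
  const hasV1 = MOONSHINE_V1_FILES.every(has);

  if (hasV2) return 'moonshine_v2'; // v2 wins when both layouts are present
  if (hasV1) return 'moonshine';
  return null; // not a recognizable Moonshine folder
}
```

The returned value matches the `modelType` strings listed at the top of this section, so it could be passed to `createStreamingSTT` instead of `'auto'` when you want the choice to be explicit.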

Fire Red ASR

modelType: 'fire_red_asr'

Description

Encoder-decoder ASR models from the Fire Red project.

Characteristics

  • Streaming: ❌ No
  • Speed: ⭐⭐⭐
  • Languages: Limited (depends on variant)

Configuration

import { createSTT } from 'react-native-sherpa-onnx/stt';

const stt = await createSTT({
  modelPath: { type: 'asset', path: 'models/sherpa-onnx-fire-red-asr' },
  modelType: 'fire_red_asr',
});

Download

Fire Red ASR Models

Model Detection

  • Folder name should contain fire_red or fire-red
  • Files: encoder, decoder directories

Dolphin

modelType: 'dolphin'

Description

Single-model CTC for compact deployment.

Characteristics

  • Streaming: ❌ No
  • Speed: ⭐⭐⭐⭐⭐
  • Size: Very Small
  • Languages: Limited

Configuration

import { createSTT } from 'react-native-sherpa-onnx/stt';

const stt = await createSTT({
  modelPath: { type: 'asset', path: 'models/sherpa-onnx-dolphin' },
  modelType: 'dolphin',
  preferInt8: true,
});

Download

Dolphin Models

Model Detection

  • Folder name should contain dolphin
  • Files: model.onnx, tokens.txt

Canary

modelType: 'canary'

Description

NeMo Canary multilingual model with source/target language configuration.

Characteristics

  • Streaming: ❌ No
  • Multilingual: ✅ Yes (English, Spanish, German, French)
  • Accuracy: ⭐⭐⭐⭐

Configuration

import { createSTT, getCanaryLanguages } from 'react-native-sherpa-onnx/stt';

const stt = await createSTT({
  modelPath: { type: 'asset', path: 'models/sherpa-onnx-nemo-canary' },
  modelType: 'canary',
  modelOptions: {
    canary: {
      srcLang: 'en',    // Source: English, Spanish, German, French
      tgtLang: 'en',    // Target (typically 'en')
      usePnc: true,     // Use punctuation
    }
  },
});

Language Helpers

const languages = getCanaryLanguages();
// [{ id: 'en', name: 'English' }, { id: 'es', name: 'Spanish' }, ...]
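
Because Canary takes both a source and a target language, it can help to validate the pair before creating the engine. The following is a minimal sketch under stated assumptions: `buildCanaryOptions` and the related types are hypothetical, and the codes `'de'` and `'fr'` are assumed from the language names listed above (this page only shows `'en'` and `'es'` explicitly).

```typescript
// Hypothetical helper; not part of react-native-sherpa-onnx.
// 'de' and 'fr' are assumed codes for German and French.
const CANARY_LANGUAGE_IDS = ['en', 'es', 'de', 'fr'] as const;
type CanaryLanguageId = (typeof CANARY_LANGUAGE_IDS)[number];

interface CanaryOptionsSketch {
  srcLang: CanaryLanguageId;
  tgtLang: CanaryLanguageId;
  usePnc: boolean;
}

function isCanaryLanguageId(value: string): value is CanaryLanguageId {
  return (CANARY_LANGUAGE_IDS as readonly string[]).indexOf(value) !== -1;
}

// Validates the language pair and returns an object shaped like the
// `canary` entry of `modelOptions` in the configuration above.
function buildCanaryOptions(
  srcLang: string,
  tgtLang: string = 'en',
  usePnc: boolean = true,
): CanaryOptionsSketch {
  if (!isCanaryLanguageId(srcLang)) {
    throw new Error(`Unsupported source language: ${srcLang}`);
  }
  if (!isCanaryLanguageId(tgtLang)) {
    throw new Error(`Unsupported target language: ${tgtLang}`);
  }
  return { srcLang, tgtLang, usePnc };
}

// Example: Spanish audio with the default target language and punctuation enabled.
const canaryOptionsExample = buildCanaryOptions('es');
```

In an app, the list returned by `getCanaryLanguages()` would be a better source of valid ids than a hard-coded array; the constant here only keeps the sketch self-contained.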

Download

Canary Models

Model Detection

  • Folder name should contain canary

Omnilingual

modelType: 'omnilingual'

Description

Omnilingual CTC model with wide language coverage.

Characteristics

  • Streaming: ❌ No
  • Multilingual: ✅ Yes (many languages)
  • Speed: ⭐⭐⭐

Configuration

import { createSTT } from 'react-native-sherpa-onnx/stt';

const stt = await createSTT({
  modelPath: { type: 'asset', path: 'models/sherpa-onnx-omnilingual' },
  modelType: 'omnilingual',
});

Download

Omnilingual Models

Model Detection

  • Folder name should contain omnilingual

MedASR

modelType: 'medasr'

Description

Medical ASR CTC model optimized for healthcare terminology.

Characteristics

  • Streaming: ❌ No
  • Domain: Medical/Healthcare
  • Speed: ⭐⭐⭐⭐

Configuration

import { createSTT } from 'react-native-sherpa-onnx/stt';

const stt = await createSTT({
  modelPath: { type: 'asset', path: 'models/sherpa-onnx-medasr' },
  modelType: 'medasr',
});

Model Detection

  • Folder name should contain medasr

Telespeech CTC

modelType: 'telespeech_ctc'

Description

Telespeech CTC model optimized for telephony audio.

Characteristics

  • Streaming: ❌ No
  • Domain: Telephony (8 kHz audio)
  • Speed: ⭐⭐⭐⭐

Configuration

import { createSTT } from 'react-native-sherpa-onnx/stt';

const stt = await createSTT({
  modelPath: { type: 'asset', path: 'models/sherpa-onnx-telespeech' },
  modelType: 'telespeech_ctc',
});

Download

Telespeech Models

Model Detection

  • Folder name should contain telespeech

Tone CTC (t-one)

modelType: 'tone_ctc'

Description

Ultra-lightweight streaming CTC model (t-one). Excellent for resource-constrained devices.

Characteristics

  • Streaming: ✅ Yes
  • Speed: ⭐⭐⭐⭐⭐ Very Fast
  • Size: Very Small
  • Memory: ⭐⭐⭐⭐⭐ Very Low

Configuration

import { createStreamingSTT } from 'react-native-sherpa-onnx/stt';

const engine = await createStreamingSTT({
  modelPath: { type: 'asset', path: 'models/sherpa-onnx-streaming-t-one-russian' },
  modelType: 'tone_ctc',
  numThreads: 2,
});

Download

Tone CTC Models

Model Detection

  • Folder name should contain t-one, t_one, or tone as a standalone word
  • Files: model.onnx, tokens.txt
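
The folder-name rules from the Model Detection lists on this page can be summarized in a single lookup. The following is a minimal sketch for checking a folder name before bundling a model. `guessModelTypeFromFolder` is a hypothetical helper and not the library's actual detection logic; in particular, the order in which the rules are tried and the lowercase matching are assumptions.

```typescript
// Hypothetical helper; not part of react-native-sherpa-onnx.
type OtherSttModelType =
  | 'wenet_ctc' | 'sense_voice' | 'funasr_nano' | 'moonshine'
  | 'fire_red_asr' | 'dolphin' | 'canary' | 'omnilingual'
  | 'medasr' | 'telespeech_ctc' | 'tone_ctc';

// One pattern per model type, following the folder-name rules on this page.
const FOLDER_NAME_RULES: Array<{ modelType: OtherSttModelType; pattern: RegExp }> = [
  { modelType: 'wenet_ctc', pattern: /wenet/ },
  { modelType: 'sense_voice', pattern: /sense/ }, // also covers 'sensevoice'
  { modelType: 'funasr_nano', pattern: /funasr/ }, // also covers 'funasr-nano'
  // The folder name alone does not distinguish Moonshine v1 from v2.
  { modelType: 'moonshine', pattern: /moonshine/ },
  { modelType: 'fire_red_asr', pattern: /fire[_-]red/ },
  { modelType: 'dolphin', pattern: /dolphin/ },
  { modelType: 'canary', pattern: /canary/ },
  { modelType: 'omnilingual', pattern: /omnilingual/ },
  { modelType: 'medasr', pattern: /medasr/ },
  { modelType: 'telespeech_ctc', pattern: /telespeech/ },
  // 't-one', 't_one', or 'tone' not embedded in a longer word (e.g. not 'stone').
  { modelType: 'tone_ctc', pattern: /t[-_]one|(^|[^a-z0-9])tone([^a-z0-9]|$)/ },
];

function guessModelTypeFromFolder(folderName: string): OtherSttModelType | null {
  const lower = folderName.toLowerCase();
  for (const rule of FOLDER_NAME_RULES) {
    if (rule.pattern.test(lower)) {
      return rule.modelType;
    }
  }
  return null; // no rule matched
}
```

For example, the folder `sherpa-onnx-streaming-t-one-russian` used above maps to `'tone_ctc'`, while a name such as `stone-model` matches no rule.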

Comparison Table

Model | Streaming | Multilingual | Speed | Special Feature
WeNet CTC | ❌ | Limited | Very Fast | Compact
SenseVoice | ❌ | 5 langs | Medium | Emotion + punctuation
FunASR Nano | ❌ | Limited | Medium | LLM-based with prompts
Moonshine | ✅ | Limited | Fast | Modern streaming
Fire Red ASR | ❌ | Limited | Medium | Encoder-decoder
Dolphin | ❌ | Limited | Very Fast | Ultra-compact
Canary | ❌ | 4 langs | Medium | NeMo multilingual
Omnilingual | ❌ | Many | Medium | Wide coverage
MedASR | ❌ | English | Fast | Medical domain
Telespeech | ❌ | Limited | Fast | Telephony (8 kHz)
Tone CTC | ✅ | Limited | Very Fast | Ultra-lightweight

Choosing a Specialized Model

For Emotion Detection

  • SenseVoice – Provides emotion labels in result

For Medical/Healthcare

  • MedASR – Optimized for medical terminology

For Telephony

  • Telespeech CTC – Designed for 8 kHz phone audio

For Low-End Devices

  • Tone CTC – Ultra-lightweight streaming
  • Dolphin – Very small offline model
  • WeNet CTC – Compact deployment

For LLM-Based Flexibility

  • FunASR Nano – Prompt engineering for ASR

For Modern Streaming

  • Moonshine – Latest streaming architecture
  • Tone CTC – Lightweight streaming
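
The guidance above can also be encoded as a small lookup, which is handy when the model is chosen at runtime from app settings. The following is a minimal sketch: `MODEL_PROFILES`, `RequirementTag`, and `suggestModelTypes` are hypothetical names, and the table only restates the recommendations and streaming support described on this page.

```typescript
// Hypothetical helper; not part of react-native-sherpa-onnx.
type RequirementTag = 'emotion' | 'medical' | 'telephony' | 'low-end' | 'llm-prompts';

interface ModelProfile {
  modelType: string;
  streaming: boolean;
  tags: RequirementTag[];
}

// Restates the recommendations in this section.
const MODEL_PROFILES: ModelProfile[] = [
  { modelType: 'sense_voice', streaming: false, tags: ['emotion'] },
  { modelType: 'medasr', streaming: false, tags: ['medical'] },
  { modelType: 'telespeech_ctc', streaming: false, tags: ['telephony'] },
  { modelType: 'funasr_nano', streaming: false, tags: ['llm-prompts'] },
  { modelType: 'moonshine', streaming: true, tags: [] },
  { modelType: 'tone_ctc', streaming: true, tags: ['low-end'] },
  { modelType: 'dolphin', streaming: false, tags: ['low-end'] },
  { modelType: 'wenet_ctc', streaming: false, tags: ['low-end'] },
];

// Returns every model type that satisfies all requested tags and,
// if requested, supports streaming. An empty array means no model
// covered on this page meets the combination.
function suggestModelTypes(needs: RequirementTag[], needsStreaming: boolean = false): string[] {
  return MODEL_PROFILES
    .filter((profile) => !needsStreaming || profile.streaming)
    .filter((profile) => needs.every((tag) => profile.tags.indexOf(tag) !== -1))
    .map((profile) => profile.modelType);
}
```

For instance, `suggestModelTypes(['low-end'], true)` yields only `'tone_ctc'`, while `suggestModelTypes(['emotion'], true)` returns an empty array because SenseVoice is offline only.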

Next Steps

  • STT Overview – Compare all STT model types
  • STT API – Detailed API documentation
  • Streaming STT – Real-time recognition guide
  • Model Setup – How to download and bundle models
