Moonshine Voice requires model files to perform transcription and intent recognition. This guide covers downloading, locating, and managing these models.

Quick Start

Download a model using the Python module:
python -m moonshine_voice.download --language en
The script will:
  1. Download the model files
  2. Cache them locally
  3. Display the model path and architecture number
Example output:
encoder_model.ort: 100%|███████████████████| 29.9M/29.9M [00:00<00:00, 34.5MB/s]
decoder_model_merged.ort: 100%|████████████| 104M/104M [00:02<00:00, 52.6MB/s]
tokenizer.bin: 100%|████████████████████████| 244k/244k [00:00<00:00, 1.44MB/s]
Model arch: 1
Downloaded model path: /Users/username/Library/Caches/moonshine_voice/download.moonshine.ai/model/base-en/quantized/base-en

Supported Languages

Moonshine Voice supports these languages:
| Code | Language | Model Sizes |
|------|------------|---------------------------------------------------------------|
| en | English | Tiny, Tiny-Streaming, Base, Small-Streaming, Medium-Streaming |
| es | Spanish | Base |
| ar | Arabic | Base |
| ja | Japanese | Tiny, Base |
| ko | Korean | Tiny |
| zh | Chinese | Base |
| vi | Vietnamese | Base |
| uk | Ukrainian | Base |
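If you need to validate a language code before invoking the downloader, the table above can be transcribed into a small lookup. This dict and helper are illustrative, not part of the moonshine_voice API (the library's own `supported_languages()` helper is covered later in this guide):

```python
# Language codes supported by Moonshine Voice, transcribed from the table above.
SUPPORTED_LANGUAGES = {
    "en": "English", "es": "Spanish", "ar": "Arabic", "ja": "Japanese",
    "ko": "Korean", "zh": "Chinese", "vi": "Vietnamese", "uk": "Ukrainian",
}

def validate_language(code: str) -> str:
    """Return the friendly name for a language code, or raise ValueError."""
    try:
        return SUPPORTED_LANGUAGES[code]
    except KeyError:
        raise ValueError(f"Unsupported language code: {code!r}") from None
```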

Download by Language Code

# English
python -m moonshine_voice.download --language en

# Spanish
python -m moonshine_voice.download --language es

# Japanese
python -m moonshine_voice.download --language ja

# Arabic
python -m moonshine_voice.download --language ar

Download by Language Name

python -m moonshine_voice.download --language English
python -m moonshine_voice.download --language Spanish
python -m moonshine_voice.download --language Japanese

Model Architectures

Available model architectures (from smallest to largest):

English Models

| Architecture | Code | Parameters | WER | Use Case |
|------------------|------|------------|--------|-----------------------------|
| TINY | 0 | 26M | 12.66% | Constrained devices |
| TINY_STREAMING | 2 | 34M | 12.00% | Real-time, low-end devices |
| BASE | 1 | 58M | 10.07% | Balanced offline processing |
| SMALL_STREAMING | 3 | 123M | 7.84% | Real-time, good accuracy |
| MEDIUM_STREAMING | 4 | 245M | 6.65% | Real-time, best accuracy |
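When choosing an architecture programmatically, the table can be turned into a simple size/accuracy trade-off helper. The stats below are transcribed from the table; the helper itself is an illustrative sketch, not part of the library:

```python
# Published English model stats, transcribed from the table above:
# (architecture name, arch code, parameters in millions, word error rate %).
ENGLISH_MODELS = [
    ("TINY", 0, 26, 12.66),
    ("TINY_STREAMING", 2, 34, 12.00),
    ("BASE", 1, 58, 10.07),
    ("SMALL_STREAMING", 3, 123, 7.84),
    ("MEDIUM_STREAMING", 4, 245, 6.65),
]

def smallest_model_under_wer(max_wer: float):
    """Return (name, arch code) of the smallest model whose WER is <= max_wer."""
    candidates = [m for m in ENGLISH_MODELS if m[3] <= max_wer]
    if not candidates:
        raise ValueError(f"No model meets WER <= {max_wer}%")
    name, code, _params, _wer = min(candidates, key=lambda m: m[2])
    return name, code
```

The resulting arch code can then be passed to `--model-arch` as shown in the next section.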

Download Specific Architecture

# Download English Medium Streaming (highest accuracy)
python -m moonshine_voice.download --language en --model-arch 4

# Download English Tiny (smallest)
python -m moonshine_voice.download --language en --model-arch 0
If you don’t specify --model-arch, the highest quality model for that language is downloaded by default.

Model Components

Streaming Model Components

Streaming models (Tiny-Streaming, Small-Streaming, Medium-Streaming) contain:
  • encoder.ort - Audio encoder
  • adapter.ort - Adapter layer
  • cross_kv.ort - Cross-attention key-value cache
  • decoder_kv.ort - Decoder key-value cache
  • frontend.ort - Audio preprocessing
  • streaming_config.json - Model configuration
  • tokenizer.bin - Text tokenizer

Non-Streaming Model Components

Non-streaming models (Tiny, Base) contain:
  • encoder_model.ort - Audio encoder
  • decoder_model_merged.ort - Merged decoder
  • tokenizer.bin - Text tokenizer
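A quick way to tell which family a downloaded directory belongs to is to check for the files listed above. This helper is an illustrative stdlib sketch, not part of the moonshine_voice API:

```python
from pathlib import Path

# File sets per model family, transcribed from the component lists above.
STREAMING_FILES = {
    "encoder.ort", "adapter.ort", "cross_kv.ort", "decoder_kv.ort",
    "frontend.ort", "streaming_config.json", "tokenizer.bin",
}
NON_STREAMING_FILES = {
    "encoder_model.ort", "decoder_model_merged.ort", "tokenizer.bin",
}

def classify_model_dir(model_dir: str) -> str:
    """Return 'streaming', 'non-streaming', or 'incomplete' based on files present."""
    present = {p.name for p in Path(model_dir).iterdir() if p.is_file()}
    if STREAMING_FILES <= present:
        return "streaming"
    if NON_STREAMING_FILES <= present:
        return "non-streaming"
    return "incomplete"
```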

Using Models in Code

Automatic Download and Load

from moonshine_voice import get_model_for_language, Transcriber

# Download (if needed) and get path
model_path, model_arch = get_model_for_language("en")

# Create transcriber
transcriber = Transcriber(
    model_path=model_path,
    model_arch=model_arch
)

Specify Architecture

from moonshine_voice import get_model_for_language, ModelArch

# Get Medium Streaming model
model_path, model_arch = get_model_for_language(
    wanted_language="en",
    wanted_model_arch=ModelArch.MEDIUM_STREAMING
)

# Get Tiny model
model_path, model_arch = get_model_for_language(
    wanted_language="en",
    wanted_model_arch=ModelArch.TINY
)

Manual Path Specification

If you already have models downloaded:
from moonshine_voice import Transcriber, ModelArch

transcriber = Transcriber(
    model_path="/path/to/model/directory",
    model_arch=ModelArch.BASE
)

Cache Location

Models are cached in:
  • macOS: ~/Library/Caches/moonshine_voice/
  • Linux: ~/.cache/moonshine_voice/
  • Windows: %LOCALAPPDATA%\moonshine_voice\Cache\
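For scripting, these defaults can be reproduced with a few lines of stdlib Python. This is an illustrative sketch that mirrors the list above (plus the MOONSHINE_VOICE_CACHE override described in the next section); the real library may resolve the path differently:

```python
import os
import sys
from pathlib import Path

def default_cache_dir() -> Path:
    """Best-effort guess at the Moonshine Voice cache directory for this platform."""
    override = os.environ.get("MOONSHINE_VOICE_CACHE")
    if override:
        return Path(override)
    if sys.platform == "darwin":
        return Path.home() / "Library" / "Caches" / "moonshine_voice"
    if sys.platform.startswith("win"):
        local = os.environ.get("LOCALAPPDATA", str(Path.home() / "AppData" / "Local"))
        return Path(local) / "moonshine_voice" / "Cache"
    return Path.home() / ".cache" / "moonshine_voice"
```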

Custom Cache Location

Set the MOONSHINE_VOICE_CACHE environment variable:
# Linux/macOS
export MOONSHINE_VOICE_CACHE=/my/custom/cache
python -m moonshine_voice.download --language en

# Windows
set MOONSHINE_VOICE_CACHE=C:\my\custom\cache
python -m moonshine_voice.download --language en

Embedding Models (for Intent Recognition)

Download embedding models for intent recognition:
from moonshine_voice import get_embedding_model

# Download with default variant (q4)
embedding_path, embedding_arch = get_embedding_model(
    "embeddinggemma-300m"
)

# Download with specific variant
embedding_path, embedding_arch = get_embedding_model(
    "embeddinggemma-300m",
    variant="q8"  # Options: q4, q8, fp16, fp32, q4f16
)

Embedding Model Variants

| Variant | Size | Quality | Use Case |
|---------|----------|---------|-----------------------------|
| q4 | Smallest | Good | Default, edge devices |
| q8 | Small | Better | Balanced |
| fp16 | Medium | High | GPU inference |
| fp32 | Large | Highest | Research, accuracy-critical |
| q4f16 | Medium | High | Mixed precision |
# Command line download
python -c "from moonshine_voice import get_embedding_model; get_embedding_model('embeddinggemma-300m', 'q4')"

Model Information

Get detailed model information:
from moonshine_voice.download import log_model_info

# Print detailed information
log_model_info(wanted_language="en")
Output:
Model download url: https://download.moonshine.ai/model/medium-streaming-en/quantized
Model components: ['adapter.ort', 'cross_kv.ort', 'decoder_kv.ort', 'encoder.ort', 'frontend.ort', 'streaming_config.json', 'tokenizer.bin']
Model arch: 4
Downloaded model path: /Users/username/Library/Caches/moonshine_voice/...

Checking Available Languages

from moonshine_voice.download import (
    supported_languages,
    supported_languages_friendly
)

# Get list of language codes
languages = supported_languages()
print(languages)  # ['ar', 'es', 'en', 'ja', 'ko', 'vi', 'uk', 'zh']

# Get friendly description
print(supported_languages_friendly())
# Output: ar (Arabic), es (Spanish), en (English), ja (Japanese), ...

Error Handling

try:
    model_path, model_arch = get_model_for_language("invalid_lang")
except ValueError as e:
    print(f"Error: {e}")
    # Handle invalid language

try:
    model_path, model_arch = get_model_for_language(
        "en",
        wanted_model_arch=999  # Invalid architecture
    )
except ValueError as e:
    print(f"Error: {e}")
    # Handle invalid architecture

Model Files Structure

moonshine_voice/
└── download.moonshine.ai/
    └── model/
        ├── base-en/
        │   └── quantized/
        │       └── base-en/
        │           ├── encoder_model.ort
        │           ├── decoder_model_merged.ort
        │           └── tokenizer.bin
        ├── medium-streaming-en/
        │   └── quantized/
        │       ├── encoder.ort
        │       ├── adapter.ort
        │       ├── cross_kv.ort
        │       ├── decoder_kv.ort
        │       ├── frontend.ort
        │       ├── streaming_config.json
        │       └── tokenizer.bin
        └── embeddinggemma-300m/
            ├── model.onnx
            ├── model.onnx_data
            ├── model_q4.onnx
            ├── model_q4.onnx_data
            └── tokenizer.bin
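Before copying the cache to another machine (see Offline Usage below), it can be handy to know how large it is. A minimal stdlib sketch, not a library function:

```python
from pathlib import Path

def cache_size_bytes(cache_root: str) -> int:
    """Sum the sizes of all files under the cache root, including subdirectories."""
    return sum(p.stat().st_size for p in Path(cache_root).rglob("*") if p.is_file())
```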

Offline Usage

Once models are downloaded, they work offline:
  1. Download models while online:
    python -m moonshine_voice.download --language en
    
  2. Copy cache directory to offline machine:
    # On source machine
    tar -czf moonshine_models.tar.gz ~/.cache/moonshine_voice
    
    # On target machine
    tar -xzf moonshine_models.tar.gz -C ~/
    
  3. Use normally (no internet needed):
    model_path, model_arch = get_model_for_language("en")
    

Platform-Specific Notes

Windows

For Windows C++ applications:
# Install Python package first
pip install moonshine-voice

# Download models
python -m moonshine_voice.download --language en

# Note the model path from output
# Use in your C++ application

Raspberry Pi

# Install with system packages flag
sudo pip install --break-system-packages moonshine-voice

# Download model
python -m moonshine_voice.download --language en
On Raspberry Pi, prefer smaller models (Tiny, Tiny-Streaming) for better performance.
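One way to act on this advice programmatically is to pick an architecture code from total RAM. This is a rough Linux-only sketch: the thresholds are illustrative guesses rather than benchmarked recommendations, and both helpers are hypothetical, not part of the library:

```python
def pick_arch_code(mem_total_kb: int) -> int:
    """Map total memory (kB) to an English architecture code from the table earlier.

    Thresholds are illustrative guesses, not benchmarked recommendations.
    """
    if mem_total_kb < 1_000_000:   # under ~1 GB: Tiny (arch 0)
        return 0
    if mem_total_kb < 2_000_000:   # under ~2 GB: Tiny-Streaming (arch 2)
        return 2
    return 1                       # otherwise Base (arch 1) is a reasonable default

def mem_total_kb() -> int:
    """Read MemTotal from /proc/meminfo (Linux, including Raspberry Pi OS)."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemTotal:"):
                return int(line.split()[1])
    raise RuntimeError("MemTotal not found in /proc/meminfo")
```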

Licensing

Non-English models are released under the Moonshine Community License (non-commercial use). English models are available under Apache 2.0. See https://www.moonshine.ai/license for details.
