Available Models
Choose from several high-quality models optimized for different performance and accuracy requirements:
Parakeet v3
nvidia/parakeet-tdt-0.6b-v3
Size: 478 MB
Latest NVIDIA Parakeet model with improved accuracy
Whisper Turbo
openai/whisper-large-v3-turbo
Size: 1.6 GB
Fastest large Whisper model with excellent accuracy
Moonshine Base
UsefulSensors/moonshine-base
Size: 58.0 MB
Smallest model, ideal for low-resource systems
SenseVoice
FunAudioLLM/SenseVoiceSmall
Size: 160 MB
Compact multilingual model with emotion detection
Complete Model List
| Model | Size | Description |
|---|---|---|
| nvidia/parakeet-tdt-0.6b-v3 | 478 MB | Parakeet v3 - Latest generation |
| nvidia/parakeet-tdt_ctc-110m | 473 MB | Parakeet v2 - Previous generation |
| nvidia/parakeet-tdt-0.6b-v2 | 473 MB | Parakeet v2 (alternate) |
| openai/whisper-large-v3-turbo | 1.6 GB | Whisper Turbo - Fastest large model |
| openai/whisper-large-v3 | 1.1 GB | Whisper Large - High accuracy |
| openai/whisper-medium | 492 MB | Whisper Medium - Balanced |
| openai/whisper-small | 487 MB | Whisper Small - Efficient |
| UsefulSensors/moonshine-base | 58.0 MB | Moonshine - Smallest footprint |
| FunAudioLLM/SenseVoiceSmall | 160 MB | SenseVoice - Multilingual |
Model Performance
Performance Tiers
High Performance
Whisper Turbo and Whisper Large models offer the highest accuracy but require more RAM and processing power.
Recommended for: High-end systems with 16GB+ RAM
Balanced Performance
Parakeet v3, Whisper Medium, and Whisper Small provide excellent accuracy with moderate resource usage.
Recommended for: Mid-range systems with 8-16GB RAM
SlasshyWispr will analyze your hardware and recommend the best model for your system. See Hardware Requirements for details.
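As a rough illustration of how a RAM-based recommendation could work, here is a minimal sketch in Rust. The function name `recommend_model` and the exact thresholds are assumptions drawn from the tiers above, not SlasshyWispr's actual advisor logic, which also considers other hardware factors.

```rust
/// Hypothetical sketch: map installed RAM to a model recommendation
/// using the performance tiers described above.
fn recommend_model(ram_gb: u32) -> &'static str {
    if ram_gb >= 16 {
        // High-performance tier: Whisper Turbo / Whisper Large
        "openai/whisper-large-v3-turbo"
    } else if ram_gb >= 8 {
        // Balanced tier: Parakeet v3, Whisper Medium/Small
        "nvidia/parakeet-tdt-0.6b-v3"
    } else {
        // Low-resource systems: smallest footprint
        "UsefulSensors/moonshine-base"
    }
}

fn main() {
    println!("recommended: {}", recommend_model(16));
}
```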
Download and Installation
Download the Model
Click Download Model to begin downloading from Hugging Face.
The download progress will be displayed with:
- Current file being downloaded
- Download percentage
- Files completed / total files
- Downloaded bytes / total bytes
Models are downloaded from Hugging Face and stored locally in your SlasshyWispr data directory. Once downloaded, they can be used completely offline.
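For illustration, the progress fields listed above could be combined into a single status line as in the sketch below. The function `progress_line` and its exact format are hypothetical, not SlasshyWispr's actual UI code.

```rust
/// Illustrative sketch: format the download-progress fields
/// (current file, percentage, file count, byte count) into one line.
fn progress_line(file: &str, done_files: u32, total_files: u32,
                 done_bytes: u64, total_bytes: u64) -> String {
    // Guard against division by zero before the total size is known.
    let pct = if total_bytes == 0 { 0 } else { done_bytes * 100 / total_bytes };
    format!(
        "{} | {}% | files {}/{} | {}/{} bytes",
        file, pct, done_files, total_files, done_bytes, total_bytes
    )
}

fn main() {
    println!("{}", progress_line("encoder.onnx", 2, 5, 239_000_000, 478_000_000));
}
```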
Model Warmup Process
Before first use, models need to be “warmed up” to load into memory and optimize performance.
What is Warmup?
Warmup involves:
- Loading the model into memory
- Initializing the inference engine
- Running a test transcription to optimize caching
- Preparing the model for real-time use
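The steps above can be sketched as a simple sequence. The `Engine` struct here is a stand-in for the real transcribe-rs inference engine; the method names are illustrative assumptions, not the library's actual API.

```rust
/// Stub engine standing in for the real transcribe-rs inference engine.
struct Engine { loaded: bool, initialized: bool, warmed: bool }

impl Engine {
    fn new() -> Self { Engine { loaded: false, initialized: false, warmed: false } }
    fn load_model(&mut self) { self.loaded = true; }          // load weights into memory
    fn init_inference(&mut self) { self.initialized = true; } // set up the inference engine
    fn test_transcription(&mut self) { self.warmed = true; }  // dummy run primes caches
    fn is_ready(&self) -> bool { self.loaded && self.initialized && self.warmed }
}

/// Run the warmup sequence described above.
fn warmup(engine: &mut Engine) {
    engine.load_model();
    engine.init_inference();
    engine.test_transcription();
    // model is now prepared for real-time use
}

fn main() {
    let mut engine = Engine::new();
    warmup(&mut engine);
    println!("ready: {}", engine.is_ready());
}
```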
When Does Warmup Happen?
Warmup occurs automatically when you:
- Select a model for the first time
- Switch to a different model
- Restart SlasshyWispr with a local model enabled
Warmup Duration
Warmup time varies by model size:
- Small models (< 100 MB): 5-10 seconds
- Medium models (400-500 MB): 10-20 seconds
- Large models (> 1 GB): 20-40 seconds
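The ranges above can be expressed as a small lookup, sketched below. The tier boundaries for sizes the list does not cover (100-400 MB and 500 MB-1 GB) are my assumption; actual warmup time also depends heavily on hardware.

```rust
/// Rough warmup estimate in seconds (min, max) from the tiers above.
/// Illustrative only; real times vary with hardware.
fn warmup_estimate_secs(model_mb: u32) -> (u32, u32) {
    if model_mb < 100 {
        (5, 10)   // small models, e.g. Moonshine (58 MB)
    } else if model_mb <= 1024 {
        (10, 20)  // medium models, e.g. Parakeet v3 (478 MB)
    } else {
        (20, 40)  // large models, e.g. Whisper Turbo (1.6 GB)
    }
}

fn main() {
    let (lo, hi) = warmup_estimate_secs(478);
    println!("expected warmup: {}-{} seconds", lo, hi);
}
```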
Native Parakeet Runtime
SlasshyWispr uses transcribe-rs with native Parakeet support for high-performance local transcription.
Key Features
- Zero Python Dependencies: All models run natively in Rust via ONNX Runtime
- Low Latency: Optimized for real-time transcription with minimal delay
- Cross-Platform: Works on Windows, macOS, and Linux
- GPU Acceleration: Automatic NVIDIA GPU detection and utilization when available
- Memory Efficient: Smart memory management for concurrent model loading
Runtime Architecture
The local STT runtime:
- Uses the ort crate v2.0 (ort = "2.0.0-rc.10") for ONNX Runtime model inference
- Leverages transcribe-rs with Parakeet features enabled
- Manages model lifecycle (download, warmup, deactivate)
- Handles audio preprocessing and post-processing
- Provides daemon mode for keeping models hot in memory
The runtime automatically manages multiple model instances and can keep models “hot” in memory for instant transcription when you start dictating.
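A hot-model cache of this kind can be sketched with a plain map, as below. `ModelCache` and its methods are hypothetical names for illustration; the real runtime's lifecycle management is more involved (download state, daemon mode, concurrency).

```rust
use std::collections::HashMap;

struct Model; // placeholder for the real runtime model handle

/// Sketch of a "hot" model cache: once warmed, a model stays resident
/// so transcription can start instantly when you begin dictating.
#[derive(Default)]
struct ModelCache {
    hot: HashMap<String, Model>,
}

impl ModelCache {
    fn warm(&mut self, id: &str) {
        // Load and warm the model once; later calls are no-ops.
        self.hot.entry(id.to_string()).or_insert(Model);
    }
    fn is_hot(&self, id: &str) -> bool {
        self.hot.contains_key(id)
    }
    fn deactivate(&mut self, id: &str) {
        self.hot.remove(id); // drop the instance to free memory
    }
}

fn main() {
    let mut cache = ModelCache::default();
    cache.warm("nvidia/parakeet-tdt-0.6b-v3");
    println!("hot: {}", cache.is_hot("nvidia/parakeet-tdt-0.6b-v3"));
}
```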
Model Management
Checking Model Status
You can check if a model is:
- Downloaded and available locally
- Currently loaded in memory (warmed up)
- Active and ready for transcription
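The states above suggest a simple status progression, sketched below. The enum name, the extra `NotDownloaded` starting state, and the `advance` helper are assumptions for illustration, not SlasshyWispr's actual internals.

```rust
/// Hypothetical status enum for the model states listed above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ModelStatus {
    NotDownloaded, // no local files yet
    Downloaded,    // files on disk, not in memory
    WarmedUp,      // loaded in memory
    Active,        // ready for transcription
}

/// Natural progression between the states.
fn advance(status: ModelStatus) -> ModelStatus {
    use ModelStatus::*;
    match status {
        NotDownloaded => Downloaded, // after download completes
        Downloaded => WarmedUp,      // after warmup loads it into memory
        WarmedUp => Active,          // ready for transcription
        Active => Active,
    }
}

fn main() {
    let s = advance(advance(ModelStatus::NotDownloaded));
    println!("{:?}", s); // status after download + warmup
}
```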
Deleting Models
To free up disk space, you can delete downloaded models through Settings > Offline. This removes the model files from your local storage.
Opening Model Directory
You can open the local model storage directory to inspect or manually manage model files.
Best Practices
Start with Recommended Model
Use SlasshyWispr’s hardware advisor to select the optimal model for your system
Test Multiple Models
Try different models to find the best balance of speed and accuracy for your use case
Troubleshooting
Model Won’t Download
- Check your internet connection
- Ensure you have sufficient disk space (check model size above)
- Verify Hugging Face is accessible from your network
Slow Transcription
- Try a smaller model (Moonshine, SenseVoice, or Whisper Small)
- Check if your system meets the hardware requirements
- Close other resource-intensive applications
Model Warmup Fails
- Ensure sufficient RAM is available
- Try restarting SlasshyWispr
- Check the logs for specific error messages
For GPU-accelerated transcription, see Hardware Requirements for NVIDIA GPU setup.