## Runtime Mode Selection
### Speech-to-text runtime mode
Choose between:
- `online`: use a cloud-based STT service via API
- `local`: use local STT models (Parakeet)
### AI inference runtime mode
Choose between:
- `online`: use a cloud-based AI service via API
- `local`: use local AI models via Ollama
### Legacy runtime mode
Applies to both STT and AI. This setting is maintained for backward compatibility; use `sttRuntimeMode` and `aiRuntimeMode` for independent control.

## Online Model Settings
Configure cloud-based model providers for online mode.

### API base URL
The API base URL for the online model provider. Examples:
- OpenAI: `https://api.openai.com/v1`
- A custom provider endpoint
### API key
The API authentication key for online services. Keep your API key secure. Enable `rememberApiKey` to persist it across sessions.

### Remember API key
Persists the API key in settings storage. When enabled, your API key is saved locally and restored on app restart.
### Online STT model
The model name for online speech-to-text. Examples:
- `whisper-1`
- A provider-specific model identifier
### Online AI model
The model name for online AI inference. Examples:
- `gpt-4`
- `gpt-3.5-turbo`
- `claude-3-opus-20240229`
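Put together, a fully online configuration might look like the sketch below. Only `sttRuntimeMode`, `aiRuntimeMode`, and `rememberApiKey` are key names confirmed by this guide; the other keys (`apiBaseUrl`, `apiKey`, `onlineSttModel`, `onlineAiModel`) are placeholders for illustration, not SlasshyWispr's actual settings schema:

```json
{
  "sttRuntimeMode": "online",
  "aiRuntimeMode": "online",
  "apiBaseUrl": "https://api.openai.com/v1",
  "apiKey": "<your-api-key>",
  "rememberApiKey": true,
  "onlineSttModel": "whisper-1",
  "onlineAiModel": "gpt-4"
}
```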
## Local Model Settings
Configure offline models for local inference.

### Local STT Models
Local speech-to-text model selection. Available models:

**Parakeet Models (Recommended)**
- Parakeet v3 (`nvidia/parakeet-tdt-0.6b-v3`): 478 MB, latest version
- Parakeet v2 (`nvidia/parakeet-tdt_ctc-110m`): 473 MB, stable version
- Parakeet v2 Legacy (`nvidia/parakeet-tdt-0.6b-v2`): 473 MB
**Whisper Models**
- Whisper Turbo (`openai/whisper-large-v3-turbo`): 1.6 GB, fastest large model
- Whisper Large v3 (`openai/whisper-large-v3`): 1.1 GB, most accurate
- Whisper Medium (`openai/whisper-medium`): 492 MB, balanced
- Whisper Small (`openai/whisper-small`): 487 MB, lightweight
**Other Models**
- SenseVoice (`FunAudioLLM/SenseVoiceSmall`): 160 MB, compact option
- Moonshine Base (`UsefulSensors/moonshine-base`): 58 MB, ultra-lightweight
Models must be downloaded before use. SlasshyWispr will guide you through the download process in Settings > Models.
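As an illustration, selecting a local STT model might look like this in a settings file; the `localSttModel` key name is an assumption (this guide only confirms `sttRuntimeMode`), while the model identifier is one of the IDs listed above:

```json
{
  "sttRuntimeMode": "local",
  "localSttModel": "nvidia/parakeet-tdt-0.6b-v3"
}
```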
### Local AI Models (Ollama)

**Ollama base URL**: The Ollama service base URL. The default points to a local Ollama instance; change it if Ollama runs on a different host or port.

**Ollama model**: The Ollama model name for local AI inference. Examples:
- `llama2`
- `mistral`
- `codellama`
- `phi3`
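A local AI setup could be sketched as follows; `ollamaBaseUrl` and `ollamaModel` are hypothetical key names for illustration, and `http://localhost:11434` is the default address an Ollama instance listens on:

```json
{
  "aiRuntimeMode": "local",
  "ollamaBaseUrl": "http://localhost:11434",
  "ollamaModel": "llama2"
}
```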
Ollama must be installed and running separately. Pull models using `ollama pull <model-name>`.

## Setup Guide
Guides are available for Online Mode, Offline Mode, and Hybrid Mode; the Online Mode steps are below.

### Online Mode
1. Open Settings > Models
2. Set STT Runtime to `Online`
3. Set AI Runtime to `Online`
4. Enter your API Base URL
5. Enter your API Key
6. Enable "Remember API Key" (optional)
7. Specify STT and AI model names
8. Test with a quick dictation
## Best Practices
- Privacy-first: Use local mode to keep all data on-device
- Speed-first: Use online mode for fastest inference
- Balanced: Use Parakeet v3 locally with online AI for fast STT and powerful responses
- Hardware advisor: SlasshyWispr analyzes your CPU/GPU and suggests optimal local models
- Incognito mode: Enable in Advanced Settings to prevent history logging regardless of runtime mode
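The balanced setup described above (local Parakeet STT paired with an online AI model) could be sketched as a hybrid configuration; as before, only `sttRuntimeMode` and `aiRuntimeMode` are key names confirmed by this guide, and the remaining keys are illustrative placeholders:

```json
{
  "sttRuntimeMode": "local",
  "localSttModel": "nvidia/parakeet-tdt-0.6b-v3",
  "aiRuntimeMode": "online",
  "onlineAiModel": "gpt-4"
}
```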