Overview
Off Grid includes on-device speech recognition powered by Whisper via whisper.rn native bindings. Speak your messages instead of typing them. All transcription happens locally on your device with no network required.
How It Works
Whisper is OpenAI’s speech recognition model, compiled for mobile via whisper.cpp. Voice input works in four steps:
- Hold to record — Press and hold the microphone button in the chat input
- Speak your message — Whisper transcribes in real time (you’ll see partial results)
- Release to finish — Transcription completes and inserts into the input field
- Review and send — Edit if needed, then send
Available Models
Off Grid supports multiple Whisper model sizes, balancing speed vs. accuracy:
- Tiny (75MB)
- Base (142MB)
- Small (466MB)
Whisper Tiny
- Size: 75MB
- Speed: Fastest, real-time transcription
- Accuracy: Good for clear speech
- Best for: Quick messages, casual conversations
- English-only (tiny.en) — Optimized for English
- Multilingual (tiny) — Supports multiple languages
Multilingual models support 99+ languages including Spanish, French, German, Chinese, Japanese, Arabic, and more. If you only speak English, use the .en variants for slightly better performance.
How to Use
1. Download a Whisper Model
- Go to Settings → Voice Settings
- Select a model (Base Multilingual recommended for first-time users)
- Tap Download and wait for it to complete
- The model is automatically set as active
Whisper models download on first use if not already installed. You’ll see a download progress indicator in the voice settings screen.
2. Grant Microphone Permission
The first time you use voice input:
- Android: You’ll see a permission dialog requesting microphone access
- iOS: Audio session is configured automatically and triggers the permission prompt
3. Record Your Message
- Open any conversation
- Tap and hold the microphone button in the chat input
- Speak your message clearly
- Release when done
- Review the transcription in the input field
- Edit if needed, then send
Slide to Cancel
Changed your mind while recording?
- Slide your finger left while holding the mic button
- Release to cancel the recording
- No transcription is performed
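The hold, slide, and release behavior can be modeled as a small state machine. The sketch below is illustrative only: the state names, the event shape, and the 80-pixel cancel threshold are assumptions, not taken from Off Grid's source.

```typescript
// Hypothetical model of the hold-to-record mic button.
type MicState = "idle" | "recording" | "cancelArmed";
type MicOutcome = "none" | "transcribe" | "cancel";

type MicEvent =
  | { kind: "pressIn" }
  | { kind: "move"; dx: number } // horizontal offset from the press point, in px
  | { kind: "release" };

interface MicStep {
  state: MicState;
  outcome: MicOutcome;
}

// Assumed threshold: how far left the finger must slide to arm cancellation.
const CANCEL_THRESHOLD_PX = -80;

function micReducer(state: MicState, event: MicEvent): MicStep {
  if (event.kind === "pressIn") {
    // Only an idle button can start a recording.
    return { state: state === "idle" ? "recording" : state, outcome: "none" };
  }
  if (event.kind === "move") {
    if (state === "idle") return { state, outcome: "none" };
    // Sliding left past the threshold arms cancel; sliding back disarms it.
    const armed = event.dx <= CANCEL_THRESHOLD_PX;
    return { state: armed ? "cancelArmed" : "recording", outcome: "none" };
  }
  // Remaining case: release.
  if (state === "recording") return { state: "idle", outcome: "transcribe" };
  if (state === "cancelArmed") return { state: "idle", outcome: "cancel" };
  return { state: "idle", outcome: "none" };
}

// Replays a gesture and reports what the app should do at the end.
function runGesture(events: MicEvent[]): MicOutcome {
  let state: MicState = "idle";
  let outcome: MicOutcome = "none";
  for (const event of events) {
    const next = micReducer(state, event);
    state = next.state;
    if (next.outcome !== "none") outcome = next.outcome;
  }
  return outcome;
}
```

Keeping the gesture logic in a pure reducer like this makes the cancel path easy to test without touching the microphone.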
Language Support
Whisper multilingual models support 99 languages, including:
- European: English, Spanish, French, German, Italian, Portuguese, Russian, Polish, Dutch, Swedish, Norwegian, Danish, Finnish, Greek, Turkish
- Asian: Chinese (Mandarin & Cantonese), Japanese, Korean, Hindi, Thai, Vietnamese, Indonesian, Malay, Tagalog
- Middle Eastern: Arabic, Hebrew, Persian, Urdu
- And many more…
Setting Language
By default, Whisper auto-detects the spoken language. You can manually specify a language in Settings → Voice Settings → Language if auto-detection isn’t accurate. The language setting uses IANA language codes (e.g., en for English, es for Spanish, zh for Chinese). Check the Whisper documentation for the full list of supported languages.
Performance
With the Tiny model, transcription runs in real time on all devices; larger models need faster hardware:

| Model | Device Class | Speed | Use Case |
|---|---|---|---|
| Tiny | All devices | Real-time | Quick messages |
| Base | Mid-range+ | Near real-time | General use |
| Small | Flagship | Slight delay | High accuracy |
Factors Affecting Speed
- Model size — Larger models are slower but more accurate
- Device CPU — Faster processors = faster transcription
- Audio length — Longer recordings take more time to process
- Background noise — More noise = more processing time
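One way to compare models on your own device is the real-time factor: processing time divided by audio duration. The sketch below is a rough illustration. The function names and the cut-off values that map a factor to the labels in the table above are assumptions, not measurements from Off Grid.

```typescript
// Real-time factor (RTF): values below 1 mean transcription keeps up with speech.
function realTimeFactor(processTimeMs: number, audioDurationMs: number): number {
  if (audioDurationMs <= 0) {
    throw new RangeError("audioDurationMs must be positive");
  }
  return processTimeMs / audioDurationMs;
}

type SpeedLabel = "Real-time" | "Near real-time" | "Slight delay";

// Assumed cut-offs for the labels used in the performance table; tune per device.
function speedLabel(rtf: number): SpeedLabel {
  if (rtf <= 0.5) return "Real-time";
  if (rtf <= 1.0) return "Near real-time";
  return "Slight delay";
}
```

For example, a 3-second slice that takes 1.5 seconds to process has a factor of 0.5.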
Technical Details
How Whisper Works
- Audio capture — whisper.rn records audio via native audio APIs
- Preprocessing — Audio is converted to the format Whisper expects (16kHz mono)
- Inference — whisper.cpp processes the audio and generates transcription
- Streaming results — Partial transcriptions are sent to React Native via callbacks
- Final output — Complete transcription is inserted into the chat input
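The preprocessing step can be illustrated with a minimal sketch that downmixes interleaved stereo to mono and resamples to 16 kHz. This is not whisper.rn's implementation, which handles audio natively. The function names are hypothetical, and linear interpolation is used for brevity instead of a properly filtered resampler.

```typescript
const WHISPER_SAMPLE_RATE = 16000; // Hz, the rate named in the preprocessing step

// Average interleaved stereo samples [L0, R0, L1, R1, ...] into mono.
function stereoToMono(interleaved: Float32Array): Float32Array {
  const frames = Math.floor(interleaved.length / 2);
  const mono = new Float32Array(frames);
  for (let i = 0; i < frames; i++) {
    mono[i] = (interleaved[2 * i] + interleaved[2 * i + 1]) / 2;
  }
  return mono;
}

// Resample by linear interpolation. There is no anti-aliasing filter here,
// so real code would normally low-pass the signal before downsampling.
function resampleLinear(
  input: Float32Array,
  fromRate: number,
  toRate: number = WHISPER_SAMPLE_RATE,
): Float32Array {
  if (fromRate <= 0 || toRate <= 0) {
    throw new RangeError("sample rates must be positive");
  }
  if (input.length === 0) return new Float32Array(0);
  const outLength = Math.max(1, Math.round((input.length * toRate) / fromRate));
  const out = new Float32Array(outLength);
  const step = fromRate / toRate;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const lo = Math.min(Math.floor(pos), input.length - 1);
    const hi = Math.min(lo + 1, input.length - 1);
    const frac = pos - lo;
    out[i] = input[lo] * (1 - frac) + input[hi] * frac;
  }
  return out;
}

// Convenience wrapper: interleaved stereo at any rate to 16 kHz mono.
function toWhisperInput(interleavedStereo: Float32Array, sampleRate: number): Float32Array {
  return resampleLinear(stereoToMono(interleavedStereo), sampleRate);
}
```

One second of 48 kHz stereo (96,000 interleaved samples) comes out as 16,000 mono samples.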
Real-time Transcription API
Off Grid uses whisper.rn’s transcribeRealtime API:
- 30-second chunks — Audio is processed in 30-second segments
- 3-second slices — Intermediate results every 3 seconds for responsive UI
- Streaming events — subscribe() receives events with isCapturing, text, processTime, etc.
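The chunk, slice, and event behavior described above can be sketched without the native library. The event interface below is an assumption built only from the field names listed in this section. The real whisper.rn event type may be shaped differently, so check the library's typings before relying on it. InputState, applyRealtimeEvent, and sliceBoundaries are hypothetical names.

```typescript
// Event shape assumed from the fields named above; not copied from whisper.rn.
interface RealtimeEvent {
  isCapturing: boolean;
  text?: string;
  processTime: number; // milliseconds
}

interface InputState {
  draft: string; // partial text shown while recording
  committed: string | null; // final text to insert into the chat input
  recording: boolean;
}

// Fold one streaming event into the chat input state.
function applyRealtimeEvent(state: InputState, event: RealtimeEvent): InputState {
  // Keep the previous draft when an event carries no text.
  const text = (event.text ?? state.draft).trim();
  if (event.isCapturing) {
    return { draft: text, committed: null, recording: true };
  }
  // Capture has stopped: the latest text becomes the final transcription.
  return { draft: "", committed: text, recording: false };
}

// Times (in seconds) at which intermediate results are expected within one chunk.
function sliceBoundaries(chunkSec: number = 30, sliceSec: number = 3): number[] {
  if (chunkSec <= 0 || sliceSec <= 0) {
    throw new RangeError("durations must be positive");
  }
  const marks: number[] = [];
  for (let t = sliceSec; t < chunkSec; t += sliceSec) {
    marks.push(t);
  }
  marks.push(chunkSec); // the chunk always ends with a final result
  return marks;
}
```

With the defaults, a full 30-second chunk yields ten update points, one every 3 seconds.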
Storage Location
Whisper models are stored in:
Audio Session (iOS)
On iOS, whisper.rn configures the audio session:
- Category: PlayAndRecord (allows recording + playback)
- Options: AllowBluetooth, MixWithOthers (Bluetooth headset support, mix with other audio)
- Mode: Default
- Restore on stop — Audio session is restored to previous state after recording
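Written out as data, the settings above might look like the following sketch. The types are local stand-ins so the snippet compiles without whisper.rn installed. The property names audioSessionOnStartIos and audioSessionOnStopIos are assumptions modeled on whisper.rn's realtime options and should be checked against the version you have installed.

```typescript
// Local stand-in types; the real enums live in whisper.rn.
type IosCategory = "PlayAndRecord" | "Playback" | "Record";
type IosCategoryOption = "AllowBluetooth" | "MixWithOthers" | "DefaultToSpeaker";
type IosMode = "Default" | "Measurement" | "VoiceChat";

interface IosAudioSessionSetting {
  category: IosCategory;
  options: IosCategoryOption[];
  mode: IosMode;
}

// Assumed option names for the realtime transcription call.
interface RealtimeAudioSessionOptions {
  audioSessionOnStartIos: IosAudioSessionSetting;
  audioSessionOnStopIos: "restore" | IosAudioSessionSetting;
}

const voiceInputAudioSession: RealtimeAudioSessionOptions = {
  audioSessionOnStartIos: {
    category: "PlayAndRecord", // recording plus playback
    options: ["AllowBluetooth", "MixWithOthers"], // headsets, and coexist with other audio
    mode: "Default",
  },
  // Put the session back the way it was once recording stops.
  audioSessionOnStopIos: "restore",
};
```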
Permissions
Android:
- Requires RECORD_AUDIO permission
- Requested on first use via PermissionsAndroid.request()
iOS:
- Microphone permission triggered when audio session is activated
- Configured automatically by whisper.rn
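On Android, PermissionsAndroid.request() resolves to one of three result strings. A minimal sketch of how an app might react to each is shown below. The MicAction names and the mapping are hypothetical, not Off Grid's actual handling. The result type is declared locally so the snippet runs without React Native.

```typescript
// String values match React Native's PermissionsAndroid.RESULTS.
type AndroidPermissionResult = "granted" | "denied" | "never_ask_again";

// Hypothetical follow-up actions for the voice input UI.
type MicAction = "startRecording" | "askAgainLater" | "openAppSettings";

function actionForPermission(result: AndroidPermissionResult): MicAction {
  if (result === "granted") return "startRecording";
  // A plain denial can be retried the next time the user taps the mic.
  if (result === "denied") return "askAgainLater";
  // "never_ask_again": the system dialog will not reappear, so the user
  // has to enable the microphone from the device settings.
  return "openAppSettings";
}
```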
Tips
Getting the Best Transcription Quality
- Speak clearly — Enunciate words, avoid mumbling
- Minimize background noise — Find a quiet environment
- Use a good microphone — Built-in mic works, but Bluetooth headsets are better
- Short sentences — Pause between thoughts for better accuracy
- Use the right model — Base for general use, Small for noisy environments
Choosing the Right Model
- Speed priority: Tiny (English or Multilingual)
- Balanced: Base Multilingual (recommended)
- Accuracy priority: Small (English or Multilingual)
- English-only users: Use .en variants for slightly better performance
- Multilingual users: Use multilingual variants and let Whisper auto-detect language
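The guidance above can be condensed into a small lookup. This is a sketch under assumptions: chooseModel and its types are hypothetical, the sizes come from the Available Models list, and the identifiers for Base and Small are assumed to follow the same pattern as tiny and tiny.en.

```typescript
type ModelSize = "tiny" | "base" | "small";
type Priority = "speed" | "balanced" | "accuracy";

interface ModelChoice {
  id: string; // e.g. "tiny.en" or "base"
  sizeMB: number; // approximate download size
}

// Download sizes as listed under Available Models.
const MODEL_SIZE_MB: Record<ModelSize, number> = { tiny: 75, base: 142, small: 466 };

const SIZE_FOR_PRIORITY: Record<Priority, ModelSize> = {
  speed: "tiny",
  balanced: "base",
  accuracy: "small",
};

function chooseModel(priority: Priority, englishOnly: boolean): ModelChoice {
  const size = SIZE_FOR_PRIORITY[priority];
  // English-only users get the .en variant; everyone else gets the multilingual one.
  const id = englishOnly ? size + ".en" : size;
  return { id, sizeMB: MODEL_SIZE_MB[size] };
}
```

For example, chooseModel("balanced", false) selects the recommended Base Multilingual model.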
Troubleshooting
Transcription is slow:
- Try a smaller model (Base instead of Small)
- Ensure no other apps are using the microphone
- Check CPU usage in Settings → Device Info
Microphone permission denied:
- Go to device Settings → Apps → Off Grid → Permissions → Microphone → Allow
- Restart the app after granting permission
Transcription is inaccurate:
- Try a larger model (Small instead of Tiny)
- Speak more clearly and reduce background noise
- Manually set the language in Voice Settings if auto-detection is wrong
Recording won’t stop:
- Release the mic button fully
- If stuck, force-stop the app and restart
- Check logs for errors
No transcription appears:
- Check if a Whisper model is downloaded (Settings → Voice Settings)
- Ensure microphone permission is granted
- Try recording again (sometimes first attempt fails)
Privacy
All voice transcription happens 100% on-device:
- Your voice never leaves your device
- No cloud API calls
- No audio uploaded to servers
- Works completely offline (after model download)