Commands Overview
Interactive Model Browser
- LLM Models — 9 models (350M to 4B parameters)
- STT Models — 2 offline models (Whisper, Parakeet)
- TTS Voices — 6 voice models (1 to 103 speakers)
Navigation
- Up/Down — Navigate list
- Enter — Select or download model
- ESC — Close panel
Model States
- Active — Currently loaded (green checkmark)
- Installed — Available locally
- Not installed — Available for download (grayed out)
- Default — Included in `rcli setup`
- Recommended — Best for most users
Switching Models
Press Enter on a model to:
LLM Hot-Swap (Runtime)
- Unloads current model
- Loads new model to Metal GPU
- Re-detects model profile (Qwen3/LFM2/etc.)
- Re-caches system prompt with correct tool format
- Persists selection to `~/.rcli/config/model_selection.json`
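To check what a hot-swap persisted, the selection file can be inspected directly. The following is a minimal Python sketch, assuming only that the file is valid JSON. Its keys are not documented here, so the helper returns whatever it finds. `read_model_selection` is a hypothetical name for illustration, not part of RCLI.

```python
import json
from pathlib import Path

# Path taken from the hot-swap steps above.
SELECTION_PATH = Path.home() / ".rcli" / "config" / "model_selection.json"


def read_model_selection(path=SELECTION_PATH):
    """Return the parsed selection file, or None if it is missing or not valid JSON.

    No schema is assumed: the parsed JSON is returned unchanged.
    """
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        return None
```

Calling `read_model_selection()` with no arguments reads the default path. A `None` result suggests that no selection has been persisted yet or that the file is damaged.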
STT/TTS Selection (Next Launch)
Downloading Models
Press Enter on an uninstalled model to download it. Downloads are performed with curl.
LLM Models
| Model | Size | Speed | License | Features |
|---|---|---|---|---|
| LFM2 1.2B Tool | 731 MB | ~180 t/s | LFM Open | Tool calling, default |
| LFM2 350M | 219 MB | ~350 t/s | LFM Open | Fastest, 128K ctx |
| LFM2.5 1.2B Instruct | 731 MB | ~180 t/s | LFM Open | 128K ctx |
| LFM2 2.6B | 1.5 GB | ~120 t/s | LFM Open | Better conversational |
| Qwen3 0.6B | 456 MB | ~250 t/s | Apache 2.0 | Ultra-fast |
| Qwen3.5 0.8B | 600 MB | ~220 t/s | Apache 2.0 | Qwen3.5 generation |
| Qwen3.5 2B | 1.2 GB | ~150 t/s | Apache 2.0 | Recommended |
| Qwen3 4B | 2.5 GB | ~80 t/s | Apache 2.0 | Smart reasoning |
| Qwen3.5 4B | 2.7 GB | ~75 t/s | Apache 2.0 | Best small model, 262K ctx |
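The table trades size against speed, so one way to narrow the choice is to filter by a disk budget and a minimum generation speed. The sketch below copies the sizes and approximate speeds from the table. It assumes 1 GB = 1024 MB. `models_within` is a hypothetical helper, not an RCLI command.

```python
# (name, size in MB, approximate tokens/s) copied from the table above.
# Sizes listed in GB are converted assuming 1 GB = 1024 MB.
LLM_MODELS = [
    ("LFM2 1.2B Tool", 731, 180),
    ("LFM2 350M", 219, 350),
    ("LFM2.5 1.2B Instruct", 731, 180),
    ("LFM2 2.6B", 1.5 * 1024, 120),
    ("Qwen3 0.6B", 456, 250),
    ("Qwen3.5 0.8B", 600, 220),
    ("Qwen3.5 2B", 1.2 * 1024, 150),
    ("Qwen3 4B", 2.5 * 1024, 80),
    ("Qwen3.5 4B", 2.7 * 1024, 75),
]


def models_within(max_size_mb, min_tokens_per_s):
    """Return names of models that fit both limits, largest first."""
    fitting = [
        (name, size)
        for name, size, speed in LLM_MODELS
        if size <= max_size_mb and speed >= min_tokens_per_s
    ]
    # Sort by size, descending, so the biggest model that fits comes first.
    fitting.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in fitting]


print(models_within(max_size_mb=800, min_tokens_per_s=200))
```

Listing the largest fitting model first is only a rough heuristic for quality. The Features column remains the better guide for needs such as tool calling or long context.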
Upgrade LLM
STT Models
RCLI uses two STT models in parallel:
Zipformer (Streaming)
- Purpose — Real-time transcription during speech
- Accuracy — Good (~5% WER)
- Speed — ~50ms latency
- Size — ~50 MB
- Included in — `rcli setup` (always active)
Offline STT Models
| Model | Size | Accuracy | License | Features |
|---|---|---|---|---|
| Whisper base.en | 140 MB | ~5% WER | MIT | English, default |
| Parakeet TDT 0.6B v3 | 640 MB | ~1.9% WER | CC-BY-4.0 | 25 languages, auto-punctuation |
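To put the WER figures in concrete terms: word error rate is the number of word-level errors divided by the number of words in the reference transcript, so the expected error count scales linearly with transcript length. The sketch below applies that arithmetic to the approximate figures quoted in this section. The function name is illustrative only.

```python
def expected_word_errors(wer_percent, num_words):
    """Approximate number of word errors for a transcript of num_words words."""
    return wer_percent * num_words / 100


# Approximate WER values quoted in this section.
for model, wer in [("Whisper base.en", 5.0), ("Parakeet TDT 0.6B v3", 1.9)]:
    errors = expected_word_errors(wer, 1000)
    print(f"{model}: about {errors:.0f} errors per 1,000 words")
```

At these rates, Parakeet would make roughly 19 errors per 1,000 words against roughly 50 for Whisper base.en, at the cost of a 640 MB download instead of 140 MB.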
Upgrade STT
TTS Voices
| Voice | Size | Speakers | License | Features |
|---|---|---|---|---|
| Piper Lessac | 60 MB | 1 | MIT | Clear English, default |
| Piper Amy | 60 MB | 1 | MIT | Warm female voice |
| KittenTTS Nano | 90 MB | 8 | Apache 2.0 | 4M/4F voices |
| Matcha LJSpeech | 100 MB | 1 | MIT | HiFi-GAN vocoder |
| Kokoro English v0.19 | 310 MB | 11 | Apache 2.0 | Best English quality |
| Kokoro Multi-lang v1.1 | 500 MB | 103 | Apache 2.0 | Chinese + English |
Multi-Speaker Voices
KittenTTS and Kokoro support multiple speakers. Configure via `~/.rcli/config/tts.json`.
Cleanup Unused Models
Press Enter to delete the selected model:
- Active models — Cannot be deleted (switch first)
- Inactive models — Deleted immediately
Engine Info
Model Storage Locations
Model Selection Persistence
User preferences are saved to:
Benchmarking Models
Compare All LLMs
Compare All TTS
Troubleshooting
Model Download Fails
Out of Disk Space
Run `rcli cleanup` to free space.
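Before starting a large download, it can help to confirm that the disk has room for it. The following is a minimal sketch using Python's standard `shutil.disk_usage`. It checks the filesystem that holds the home directory, on the assumption that models are stored somewhere under it. The `has_room_for` helper and its 100 MB headroom default are illustrative choices, not RCLI behavior.

```python
import shutil
from pathlib import Path


def has_room_for(size_mb, directory=None, headroom_mb=100):
    """Return True if the filesystem holding `directory` has at least
    size_mb + headroom_mb megabytes free.

    `directory` defaults to the home directory (an assumption about where
    models are stored).
    """
    target = Path(directory) if directory is not None else Path.home()
    free_mb = shutil.disk_usage(target).free / (1024 * 1024)
    return free_mb >= size_mb + headroom_mb


# Example: Qwen3.5 4B is listed at 2.7 GB (about 2765 MB).
print(has_room_for(2765))
```

If the check returns `False`, removing unused models first is the natural next step.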
Model Not Found After Download
Run `rcli models` and download the model again.