Quick Start
All model changes persist across sessions. Your selections are saved in ~/Library/RCLI/config.
Hot-Swapping LLMs
In the TUI
While RCLI is running:
- Press M to open the Models panel
- Navigate to the LLM section
- Select a model with arrow keys
- Press Enter to activate
Context preserved: Conversation history remains intact. System prompt KV cache is regenerated for the new model.
Via CLI
Guided Upgrade
- All available LLMs
- Size and speed comparisons
- Download progress (if not installed)
- Automatic activation after download
Manual Model Download
Models are stored in ~/Library/RCLI/models/. RCLI auto-detects any GGUF files in this directory.
To manually add a model:
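As a minimal sketch, assuming that placing a GGUF file in the models directory is all auto-detection needs (the text above suggests this but does not list the exact steps), the copy can be scripted. `add_model` is a hypothetical helper for illustration, not part of RCLI.

```python
import shutil
from pathlib import Path

# Models directory named in this guide.
MODELS_DIR = Path.home() / "Library" / "RCLI" / "models"


def add_model(gguf_path, models_dir=MODELS_DIR):
    """Copy a downloaded GGUF file into the models directory (hypothetical helper)."""
    src = Path(gguf_path)
    # Only GGUF files are described as auto-detected in this directory.
    if src.suffix.lower() != ".gguf":
        raise ValueError(f"expected a .gguf file, got {src.name}")
    dest_dir = Path(models_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    shutil.copy2(src, dest)  # copy2 also preserves the file's timestamps
    return dest
```

Calling `add_model("~/Downloads/some-model.gguf")` would need the path expanded first (for example with `Path(...).expanduser()`); the file name here is only a placeholder.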
Direct Config Edit
Config edits take effect on the next rcli launch, or can be applied via the M panel in the TUI.
Switching STT Models
Two STT Categories
- Streaming STT (Zipformer) — Always active, cannot be changed
- Offline STT (Whisper/Parakeet) — User-switchable
Upgrade to Parakeet
Switch Between Offline Models
Via TUI
- Press M → Navigate to STT section
- Select whisper-base or parakeet-tdt
- Press Enter to activate
Via Config
Check Active STT
Switching TTS Voices
Interactive Voice Browser
- All available voices
- Quality ratings
- Speaker counts
- Download size
- Installation status
Via TUI
- Press M → Navigate to TTS section
- Select a voice (e.g., kokoro-en)
- Press Enter to activate
Multi-Speaker Voice Selection
For voices with multiple speakers (KittenTTS, Kokoro):
Check Active Voice
Model Selection Priority
RCLI uses this order to determine the active model:
1. User Preference (Config)
If a model is specified in ~/Library/RCLI/config, it takes priority:
2. Auto-Detect (Highest-Ranked Installed Model)
If no config exists, RCLI selects the highest-priority installed model:

| Model Type | Priority Ranking (highest first) |
|---|---|
| LLM | Qwen3.5 4B > Qwen3 4B > LFM2 2.6B > Qwen3.5 2B > LFM2.5 1.2B > … |
| STT (Offline) | Parakeet TDT > Whisper base.en |
| TTS | Kokoro Multi > Kokoro EN > KittenTTS > Matcha > Piper Amy > Piper Lessac |
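The resolution order can be sketched as a small function. This is an illustration of the documented order only, not RCLI's actual implementation: `select_model` and its arguments are hypothetical names, and the LLM ranking stops where the table's "…" truncates it.

```python
# LLM ranking copied from the table above; the table itself is truncated ("…"),
# so this list is incomplete by design.
LLM_PRIORITY = ["Qwen3.5 4B", "Qwen3 4B", "LFM2 2.6B", "Qwen3.5 2B", "LFM2.5 1.2B"]


def select_model(configured, installed, ranking):
    """Return the model to activate, or None if nothing can be chosen.

    configured: model named in the config, or None if no preference is set
    installed:  set of installed model names
    ranking:    model names ordered from highest to lowest priority
    """
    # 1. A user preference from the config takes priority.
    #    (What happens when that model is not installed is not covered here.)
    if configured is not None:
        return configured
    # 2. Otherwise pick the highest-ranked model that is installed.
    for name in ranking:
        if name in installed:
            return name
    # 3. Nothing installed: the caller would fall back to prompting the user.
    return None


print(select_model(None, {"Qwen3.5 2B", "Qwen3 4B"}, LLM_PRIORITY))  # prints: Qwen3 4B
```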
3. Fallback to Default
If no models are installed, RCLI prompts:
Persistence Across Sessions
Config File Format
Stored in ~/Library/RCLI/config:
Clearing Selection
To reset to auto-detect:
Model Download Management
Check Installed Models
Download New Models
Via Interactive CLI
Via Direct Download
Manually download GGUF/ONNX files to ~/Library/RCLI/models/:
Remove Unused Models
- All installed models
- Disk space used
- Active vs. inactive models
- Safe deletion with confirmation
Model Auto-Detection
RCLI automatically detects model families from filenames:
LLM Detection

| Filename Pattern | Detected Family | Tool Format |
|---|---|---|
| `lfm2-*.gguf` | Liquid LFM2 | `<\|tool_call_start\|>[func(arg="val")]` |
| `qwen3-*.gguf` | Qwen3 | `<tool_call>{"name": "...", "arguments": {...}}</tool_call>` |
| `qwen3.5-*.gguf` | Qwen3 | Same as Qwen3 |
| `llama-3*.gguf` | Llama3 | `<\|python_tag\|>` |
| `gemma*.gguf` | Gemma | `` ```tool_call `` fenced block |
| `mistral*.gguf` | Mistral | `[TOOL_CALLS]` |
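
The filename patterns in the table are shell-style globs, so the lookup can be sketched with Python's standard `fnmatch` module. This is a sketch rather than RCLI's detection code: `detect_family` is a hypothetical name, and lowercasing the filename before matching is an assumption, since the guide does not say whether detection is case-sensitive.

```python
from fnmatch import fnmatchcase

# (pattern, family) pairs taken from the LLM detection table.
FAMILY_PATTERNS = [
    ("lfm2-*.gguf", "Liquid LFM2"),
    ("qwen3-*.gguf", "Qwen3"),
    ("qwen3.5-*.gguf", "Qwen3"),
    ("llama-3*.gguf", "Llama3"),
    ("gemma*.gguf", "Gemma"),
    ("mistral*.gguf", "Mistral"),
]


def detect_family(filename):
    """Return the detected family for a GGUF filename, or None if no pattern matches."""
    name = filename.lower()  # assumption: matching ignores case
    for pattern, family in FAMILY_PATTERNS:
        if fnmatchcase(name, pattern):
            return family
    return None


print(detect_family("llama-3.2-3b.gguf"))  # prints: Llama3
```

Note that `qwen3-*.gguf` does not match a `qwen3.5-...` filename, because the glob requires a hyphen directly after `qwen3`; that is why the table lists the two patterns separately even though both map to the Qwen3 family.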
STT/TTS Detection
Models are detected by directory structure in ~/Library/RCLI/models/:
Performance Impact
Model Swap Latency
| Operation | Latency | Impact |
|---|---|---|
| LLM swap | ~500-1500ms | Loads new model, regenerates system KV cache |
| STT swap | ~200-400ms | Reloads ONNX model |
| TTS swap | ~100-200ms | Reloads ONNX model |
No restart required. All swaps happen in the background while the TUI remains interactive.
Memory Usage
Only one model per type is loaded in memory at a time:

| Model Type | Active Memory | Example |
|---|---|---|
| LLM | ~1-4 GB | Qwen3.5 4B: ~3.2 GB |
| STT (Streaming) | ~120 MB | Zipformer |
| STT (Offline) | ~300 MB | Parakeet TDT |
| TTS | ~180 MB | Kokoro English |
| VAD | ~8 MB | Silero VAD |
| Embeddings | ~50 MB | Snowflake Arctic |
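
To estimate the combined footprint of one full set of active models, the table's figures can simply be added up. The sketch below uses the example models from the table and assumes 1 GB = 1000 MB for the LLM figure; the numbers are the approximate values quoted here, not measurements.

```python
# Approximate active memory per component, in MB, taken from the table above.
ACTIVE_MEMORY_MB = {
    "LLM (Qwen3.5 4B)": 3200,  # ~3.2 GB, assuming 1 GB = 1000 MB
    "STT streaming (Zipformer)": 120,
    "STT offline (Parakeet TDT)": 300,
    "TTS (Kokoro English)": 180,
    "VAD (Silero VAD)": 8,
    "Embeddings (Snowflake Arctic)": 50,
}


def total_memory_mb(footprints):
    """Sum the per-component footprints (all values in MB)."""
    return sum(footprints.values())


total = total_memory_mb(ACTIVE_MEMORY_MB)
print(f"~{total} MB (~{total / 1000:.1f} GB)")  # prints: ~3858 MB (~3.9 GB)
```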
CLI Reference
Model Management
Upgrade Commands
Model Info
Config Location
Best Practices
1. Start with Defaults
2. Upgrade Incrementally
Upgrade one model type at a time:
3. Benchmark After Swaps
4. Monitor Disk Space
5. Keep Config Clean
Periodically review ~/Library/RCLI/config and remove stale entries.
Troubleshooting
Model Not Switching
Symptom: Selected model doesn't become active.
Solution:
- Check model is installed: `ls ~/Library/RCLI/models/`
- Verify config: `cat ~/Library/RCLI/config`
- Check file permissions: `ls -la ~/Library/RCLI/models/`
- Re-download model: `rcli models`
Model Download Fails
Symptom: Download stalls or errors.
Solution:
- Check internet connection
- Verify HuggingFace/GitHub access
- Manually download and extract to `~/Library/RCLI/models/`
- Check available disk space: `df -h`
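The disk space check can also be scripted before starting a large download. This is a minimal sketch using the standard library's `shutil.disk_usage`; the helper names and the 1 GB headroom default are assumptions for illustration, not RCLI behavior.

```python
import shutil
from pathlib import Path


def free_space_gb(path):
    """Free space on the volume containing `path`, in GB (1 GB = 1e9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_room(free_gb, model_gb, headroom_gb=1.0):
    """True if the model fits while leaving `headroom_gb` spare (assumed margin)."""
    return free_gb >= model_gb + headroom_gb


# Example: check the home volume against a ~3.2 GB model.
free = free_space_gb(Path.home())
print(f"{free:.1f} GB free, enough for a 3.2 GB model: {has_room(free, 3.2)}")
```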
Wrong Model Detected
Symptom: RCLI uses a different model than expected.
Solution:
- Check config: `cat ~/Library/RCLI/config`
- Verify model ID matches filename
- Use the exact model ID in config (e.g., `model=qwen3.5-4b`)
- Restart RCLI: `rcli`
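The "model ID matches filename" check can be sketched as follows. It rests on several assumptions that the guide does not confirm: that the config holds `key=value` lines (as the `model=qwen3.5-4b` example suggests), that lines starting with `#` are comments, and that a match means the filename starts with the model ID. Both helper names are hypothetical.

```python
def parse_config(text):
    """Parse key=value lines into a dict (assumed config format)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, assumed '#' comments, and lines without a key=value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def matching_files(model_id, filenames):
    """Return the filenames that start with the model ID (assumed matching rule)."""
    wanted = model_id.lower()
    return [name for name in filenames if name.lower().startswith(wanted)]


config = parse_config("model=qwen3.5-4b\n")
print(matching_files(config["model"], ["qwen3.5-4b-q4.gguf", "lfm2-1.2b.gguf"]))
# prints: ['qwen3.5-4b-q4.gguf']
```

An empty result would point to a mismatch between the configured ID and the files actually present in the models directory.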
Memory Issues After Swap
Symptom: High RAM usage or slowness after model swap.
Solution:
- Restart RCLI to clear all caches
- Use a smaller model (e.g., Qwen3.5 2B instead of 4B)
- Check Activity Monitor for memory leaks
- Reduce context size: `--ctx-size 2048`
Next Steps
LLM Models
Compare all 9 LLM models with specs
STT Models
Explore Zipformer, Whisper, and Parakeet
TTS Models
Browse 6 TTS voices with quality ratings
Benchmarks
Measure performance on your Mac