RCLI supports hot-swapping models at runtime without restarting the pipeline. Switch between LLMs, STT models, and TTS voices in roughly 0.1 to 1.5 seconds (see Model Swap Latency) while preserving conversation context.

Quick Start

rcli models          # Interactive model browser (all types)
rcli upgrade-llm     # Guided LLM upgrade
rcli upgrade-stt     # Upgrade to Parakeet TDT
rcli voices          # TTS voice browser
All model changes persist across sessions. Your selections are saved in ~/Library/RCLI/config.

Hot-Swapping LLMs

In the TUI

While RCLI is running:
  1. Press M to open the Models panel
  2. Navigate to the LLM section
  3. Select a model with arrow keys
  4. Press Enter to activate
The model switches immediately without restarting the TUI.
Context preserved: Conversation history remains intact. System prompt KV cache is regenerated for the new model.

Via CLI

Guided Upgrade

rcli upgrade-llm
Interactive menu showing:
  • All available LLMs
  • Size and speed comparisons
  • Download progress (if not installed)
  • Automatic activation after download

Manual Model Download

Models are stored in ~/Library/RCLI/models/. RCLI auto-detects any GGUF files in this directory. To manually add a model:
cd ~/Library/RCLI/models/
wget https://huggingface.co/Qwen/Qwen3.5-4B-GGUF/resolve/main/Qwen3.5-4B-Q4_K_M.gguf
RCLI will auto-detect the model on next launch.
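A failed or redirected download can leave an HTML error page saved under a .gguf name. As a quick sanity check before relaunching, the sketch below tests for the four-byte `GGUF` magic that GGUF files begin with. The function name `is_gguf` is illustrative and is not an RCLI command.

```shell
# Hypothetical helper (not part of RCLI): succeed only if the file
# starts with the 4-byte GGUF magic "GGUF".
is_gguf() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]
}

# Example use (path taken from the download step above):
#   is_gguf ~/Library/RCLI/models/Qwen3.5-4B-Q4_K_M.gguf && echo "looks like GGUF"
```

This only inspects the header; it does not verify that the download is complete.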

Direct Config Edit

# ~/Library/RCLI/config
model=qwen3.5-4b
Changes take effect on next rcli launch or via M panel in TUI.
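If you prefer to script the edit instead of opening the file, a minimal sketch follows. It assumes the config is a flat list of key=value lines, as shown above; `set_config_key` is a hypothetical helper, not an RCLI command.

```shell
# Hypothetical helper: set KEY=VALUE in a key=value config file,
# replacing any existing assignment of the same key.
set_config_key() {
  file="$1"; key="$2"; value="$3"
  tmp="$(mktemp)"
  # Copy every line except an existing assignment of this key.
  if [ -f "$file" ]; then
    grep -v "^${key}=" "$file" > "$tmp" || true
  fi
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the config file keeps its permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example use:
#   set_config_key ~/Library/RCLI/config model qwen3.5-4b
```

The updated key is appended at the end of the file, so line order may change; this sketch assumes order does not matter.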

Switching STT Models

Two STT Categories

  1. Streaming STT (Zipformer) — Always active, cannot be changed
  2. Offline STT (Whisper/Parakeet) — User-switchable

Upgrade to Parakeet

rcli upgrade-stt  # Downloads Parakeet TDT 0.6B (~640 MB)
After download, Parakeet becomes active immediately (highest priority).

Switch Between Offline Models

Via TUI

  1. Press M → Navigate to STT section
  2. Select whisper-base or parakeet-tdt
  3. Press Enter to activate

Via Config

# ~/Library/RCLI/config
stt_model=whisper-base    # or parakeet-tdt

Check Active STT

rcli info  # Shows active streaming + offline STT
Example output:
STT (Streaming): Zipformer
STT (Offline): Parakeet TDT 0.6B v3

Switching TTS Voices

Interactive Voice Browser

rcli voices
Menu shows:
  • All available voices
  • Quality ratings
  • Speaker counts
  • Download size
  • Installation status
Select a voice and press Enter to download/activate.

Via TUI

  1. Press M → Navigate to TTS section
  2. Select a voice (e.g., kokoro-en)
  3. Press Enter to activate
Voice changes take effect on the next TTS synthesis.

Multi-Speaker Voice Selection

For voices with multiple speakers (KittenTTS, Kokoro):
# ~/.Library/RCLI/config
tts_model=kokoro-en
tts_speaker=5        # Speaker IDs: 0-10 for Kokoro English
Speaker changes apply to next synthesis.

Check Active Voice

rcli info  # Shows active TTS voice and speaker ID
Example output:
TTS: Kokoro English v0.19 (speaker 5)

Model Selection Priority

RCLI uses this order to determine the active model:

1. User Preference (Config)

If a model is specified in ~/Library/RCLI/config, it takes priority:
model=qwen3.5-4b
tts_model=kokoro-en
stt_model=parakeet-tdt

2. Auto-Detect (Highest-Ranked Installed Model)

If the config does not specify a model for a given type, RCLI selects the highest-ranked installed model of that type:

| Model Type | Priority Ranking (highest first) |
| --- | --- |
| LLM | Qwen3.5 4B > Qwen3 4B > LFM2 2.6B > Qwen3.5 2B > LFM2.5 1.2B > … |
| STT (Offline) | Parakeet TDT > Whisper base.en |
| TTS | Kokoro Multi > Kokoro EN > KittenTTS > Matcha > Piper Amy > Piper Lessac |

3. Fallback to Default

If no models are installed, RCLI prompts:
rcli setup  # Download default models (~1 GB)
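The three-step order can be sketched as a small shell function. This is an illustration of the documented order, not RCLI's actual implementation: `resolve_model` is a hypothetical name, and the sketch does not check whether a configured model is actually installed.

```shell
# Hypothetical sketch of the resolution order described above.
#   $1 = model ID from config (empty if unset)
#   $2 = space-separated ranking, highest first
#   $3 = space-separated list of installed model IDs
# Relies on POSIX sh word splitting of $2 and $3.
resolve_model() {
  preferred="$1"; ranking="$2"; installed="$3"
  # 1. A configured preference wins.
  if [ -n "$preferred" ]; then
    echo "$preferred"
    return 0
  fi
  # 2. Otherwise pick the highest-ranked model that is installed.
  for candidate in $ranking; do
    for have in $installed; do
      if [ "$candidate" = "$have" ]; then
        echo "$candidate"
        return 0
      fi
    done
  done
  # 3. Nothing installed: the caller would prompt for "rcli setup".
  echo "none"
  return 1
}

# Example use (IDs other than qwen3.5-4b are illustrative):
#   resolve_model "" "qwen3.5-4b qwen3-4b lfm2-2.6b" "lfm2-2.6b qwen3-4b"
```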

Persistence Across Sessions

Config File Format

Stored in ~/Library/RCLI/config:
model=qwen3.5-4b
tts_model=kokoro-en
tts_speaker=5
stt_model=parakeet-tdt
Each key-value pair persists across launches.

Clearing Selection

To reset to auto-detect:
# In ~/Library/RCLI/config, delete the key or leave its value empty
model=           # Empty value = auto-detect
Or delete the entire config file:
rm ~/Library/RCLI/config
RCLI will regenerate defaults on next launch.
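To reset a single selection without deleting the whole file, the key can be removed with a short script. This is a sketch that assumes flat key=value lines; `unset_config_key` is a hypothetical helper, not an RCLI command.

```shell
# Hypothetical helper: remove every assignment of KEY from a
# key=value config file, leaving all other lines untouched.
unset_config_key() {
  file="$1"; key="$2"
  [ -f "$file" ] || return 0
  tmp="$(mktemp)"
  grep -v "^${key}=" "$file" > "$tmp" || true
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example use:
#   unset_config_key ~/Library/RCLI/config model
```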

Model Download Management

Check Installed Models

rcli info  # Lists all installed models with sizes
Example output:
Installed LLMs:
  - Liquid LFM2 1.2B Tool (731 MB) [ACTIVE]
  - Qwen3.5 4B (2.7 GB)
  
Installed TTS:
  - Piper Lessac (60 MB)
  - Kokoro English v0.19 (310 MB) [ACTIVE]
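For scripting, the active entries can be pulled out of this listing with a small filter. The sketch assumes the output keeps the exact line shape of the example above ("  - Name (size) [ACTIVE]"); `active_models` is a hypothetical name, and the real output format may differ between versions.

```shell
# Hypothetical filter: read an "rcli info"-style listing on stdin and
# print only the names of entries marked [ACTIVE].
active_models() {
  sed -n 's/^ *- \(.*\) (.*) \[ACTIVE\]$/\1/p'
}

# Example use:
#   rcli info | active_models
```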

Download New Models

Via Interactive CLI

rcli models       # Browse all models, download with Enter
rcli upgrade-llm  # LLM-specific guided upgrade
rcli upgrade-stt  # Download Parakeet TDT
rcli voices       # TTS voice browser

Via Direct Download

Manually download GGUF/ONNX files to ~/Library/RCLI/models/:
cd ~/Library/RCLI/models/

# Download LLM (GGUF format)
wget https://huggingface.co/.../model.gguf

# Download TTS (tar.bz2 archive)
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2
tar -xf kokoro-en-v0_19.tar.bz2
RCLI auto-detects models on next launch.

Remove Unused Models

rcli cleanup  # Interactive cleanup panel
Shows:
  • All installed models
  • Disk space used
  • Active vs. inactive models
  • Safe deletion with confirmation
Or manually delete:
rm ~/Library/RCLI/models/qwen3-4b-q4_k_m.gguf
rm -rf ~/Library/RCLI/models/kokoro-multi-lang-v1_1/

Model Auto-Detection

RCLI automatically detects model families from filenames:

LLM Detection

| Filename Pattern | Detected Family | Tool Format |
| --- | --- | --- |
| `lfm2-*.gguf` | Liquid LFM2 | `<tool_call_start>[func(arg="val")]` |
| `qwen3-*.gguf` | Qwen3 | `<tool_call>{"name": "...", "arguments": {...}}</tool_call>` |
| `qwen3.5-*.gguf` | Qwen3 | Same as Qwen3 |
| `llama-3*.gguf` | Llama3 | `<python_tag>` |
| `gemma*.gguf` | Gemma | Fenced code block labeled `tool_call` |
| `mistral*.gguf` | Mistral | `[TOOL_CALLS]` |
Fallback: ChatML format if family is unknown.
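The filename matching above can be approximated with a shell `case` statement. This is a sketch of the documented patterns, not RCLI's code: `detect_llm_family` is a hypothetical name, and lowercasing the filename first is an assumption made so that names such as Qwen3.5-4B-Q4_K_M.gguf match the lowercase patterns.

```shell
# Hypothetical sketch: map a GGUF filename to the family listed above.
detect_llm_family() {
  # Assumption: matching is case-insensitive, so normalize to lowercase.
  name="$(basename "$1" | tr '[:upper:]' '[:lower:]')"
  case "$name" in
    lfm2-*.gguf)                  echo "Liquid LFM2" ;;
    qwen3-*.gguf|qwen3.5-*.gguf)  echo "Qwen3" ;;
    llama-3*.gguf)                echo "Llama3" ;;
    gemma*.gguf)                  echo "Gemma" ;;
    mistral*.gguf)                echo "Mistral" ;;
    *)                            echo "ChatML (fallback)" ;;
  esac
}

# Example use:
#   detect_llm_family ~/Library/RCLI/models/Qwen3.5-4B-Q4_K_M.gguf
```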

STT/TTS Detection

Models are detected by directory structure in ~/Library/RCLI/models/:
models/
├── zipformer/               # Auto-detected as Zipformer STT
├── whisper-base.en/         # Auto-detected as Whisper STT
├── parakeet-tdt/            # Auto-detected as Parakeet STT
├── piper-voice/             # Auto-detected as Piper TTS
└── kokoro-en-v0_19/         # Auto-detected as Kokoro TTS
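As an illustration of directory-based detection, the sketch below classifies a directory by the prefix of its name. Prefix matching is an assumption drawn from the example tree; RCLI may inspect directory contents instead. `classify_model_dir` is a hypothetical name.

```shell
# Hypothetical sketch: classify a model directory by its name prefix.
classify_model_dir() {
  name="$(basename "$1")"
  case "$name" in
    zipformer*) echo "Zipformer STT" ;;
    whisper*)   echo "Whisper STT" ;;
    parakeet*)  echo "Parakeet STT" ;;
    piper*)     echo "Piper TTS" ;;
    kokoro*)    echo "Kokoro TTS" ;;
    *)          echo "unknown" ;;
  esac
}

# Example use:
#   for d in ~/Library/RCLI/models/*/; do
#     printf '%s: %s\n' "$d" "$(classify_model_dir "$d")"
#   done
```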

Performance Impact

Model Swap Latency

| Operation | Latency | Impact |
| --- | --- | --- |
| LLM swap | ~500-1500 ms | Loads new model, regenerates system KV cache |
| STT swap | ~200-400 ms | Reloads ONNX model |
| TTS swap | ~100-200 ms | Reloads ONNX model |
No restart required. All swaps happen in the background while the TUI remains interactive.

Memory Usage

Only one model per type is loaded in memory at a time:
| Model Type | Active Memory | Example |
| --- | --- | --- |
| LLM | ~1-4 GB | Qwen3.5 4B: ~3.2 GB |
| STT (Streaming) | ~120 MB | Zipformer |
| STT (Offline) | ~300 MB | Parakeet TDT |
| TTS | ~180 MB | Kokoro English |
| VAD | ~8 MB | Silero VAD |
| Embeddings | ~50 MB | Snowflake Arctic |
Total RAM usage: ~2-5 GB depending on LLM size.
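The total follows from adding the non-LLM components (120 + 300 + 180 + 8 + 50 = 658 MB) to the LLM's footprint. A small sketch of that arithmetic, using the approximate figures from the table; `total_ram_mb` is a hypothetical helper:

```shell
# Hypothetical helper: estimate total RAM in MB for a given LLM size in MB,
# using the approximate per-component figures from the table above.
total_ram_mb() {
  llm_mb="$1"
  # Streaming STT + offline STT + TTS + VAD + embeddings
  fixed_mb=$((120 + 300 + 180 + 8 + 50))
  echo $((llm_mb + fixed_mb))
}

# Example use:
#   total_ram_mb 1024   # a 1 GB LLM
#   total_ram_mb 4096   # a 4 GB LLM
```

With these figures, a 1 GB LLM gives about 1.6 GB in total and a 4 GB LLM about 4.6 GB, which matches the rounded ~2-5 GB range.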

CLI Reference

Model Management

rcli models              # Interactive model browser (all types)
rcli models llm          # Jump to LLM section
rcli models stt          # Jump to STT section
rcli models tts          # Jump to TTS section

Upgrade Commands

rcli upgrade-llm         # Guided LLM upgrade with size/speed comparison
rcli upgrade-stt         # Upgrade to Parakeet TDT (~640 MB download)
rcli voices              # TTS voice browser and switcher

Model Info

rcli info                # Show active models and engine info
rcli cleanup             # Remove unused models to free disk space

Config Location

cat ~/Library/RCLI/config            # View current config
open ~/Library/RCLI/                 # Open the RCLI directory (config and models) in Finder

Best Practices

1. Start with Defaults

rcli setup  # Download default model set (~1 GB)
Test the pipeline before upgrading.

2. Upgrade Incrementally

Upgrade one model type at a time:
rcli upgrade-llm    # Try Qwen3.5 4B first
rcli upgrade-stt    # Then Parakeet TDT
rcli voices         # Finally Kokoro English

3. Benchmark After Swaps

rcli bench --suite llm      # Benchmark new LLM
rcli bench --suite tools    # Test tool-calling accuracy
rcli bench --suite e2e      # Measure end-to-end latency

4. Monitor Disk Space

du -sh ~/Library/RCLI/models  # Check total model size
rcli cleanup                  # Remove unused models
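To see which models take the most space before cleaning up, the entries can be sorted by size. This sketch uses only standard `du` and `sort`; `largest_models` is a hypothetical helper, and the default path is the models directory named in this guide.

```shell
# Hypothetical helper: list entries in a models directory,
# largest first, with sizes in kilobytes.
largest_models() {
  dir="${1:-$HOME/Library/RCLI/models}"
  du -sk "$dir"/* 2>/dev/null | sort -rn
}

# Example use:
#   largest_models | head -n 5
```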

5. Keep Config Clean

Periodically review ~/Library/RCLI/config and remove stale entries.

Troubleshooting

Model Not Switching

Symptom: Selected model doesn’t become active.
Solution:
  1. Check model is installed: ls ~/Library/RCLI/models/
  2. Verify config: cat ~/Library/RCLI/config
  3. Check file permissions: ls -la ~/Library/RCLI/models/
  4. Re-download model: rcli models
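The first and third checks above can be combined into one small function. This is a sketch using standard shell file tests; `check_model_file` is a hypothetical helper, not an RCLI command.

```shell
# Hypothetical helper: report whether a model file exists and is readable.
check_model_file() {
  path="$1"
  if [ ! -e "$path" ]; then
    echo "missing"
  elif [ ! -r "$path" ]; then
    echo "unreadable"
  else
    echo "ok"
  fi
}

# Example use:
#   check_model_file ~/Library/RCLI/models/qwen3-4b-q4_k_m.gguf
```

A result of "missing" points to re-downloading the model; "unreadable" points to a file-permission problem.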

Model Download Fails

Symptom: Download stalls or errors.
Solution:
  1. Check internet connection
  2. Verify HuggingFace/GitHub access
  3. Manually download and extract to ~/Library/RCLI/models/
  4. Check available disk space: df -h

Wrong Model Detected

Symptom: RCLI uses a different model than expected.
Solution:
  1. Check config: cat ~/Library/RCLI/config
  2. Verify model ID matches filename
  3. Use exact model ID in config (e.g., model=qwen3.5-4b)
  4. Restart RCLI: rcli

Memory Issues After Swap

Symptom: High RAM usage or slowness after model swap.
Solution:
  1. Restart RCLI to clear all caches
  2. Use smaller model (e.g., Qwen3.5 2B instead of 4B)
  3. Check Activity Monitor for memory leaks
  4. Reduce context size: --ctx-size 2048

Next Steps

LLM Models

Compare all 9 LLM models with specs

STT Models

Explore Zipformer, Whisper, and Parakeet

TTS Models

Browse 6 TTS voices with quality ratings

Benchmarks

Measure performance on your Mac
