Switching Models

RCLI supports hot-swapping models at runtime without restarting the pipeline. Switch between LLMs, STT models, and TTS voices in roughly 0.1 to 1.5 seconds (see Model Swap Latency) while preserving conversation context.

Quick Start

rcli models          # Interactive model browser (all types)
rcli upgrade-llm     # Guided LLM upgrade
rcli upgrade-stt     # Upgrade to Parakeet TDT
rcli voices          # TTS voice browser

All model changes persist across sessions. Your selections are saved in ~/Library/RCLI/config.

Hot-Swapping LLMs

In the TUI

While RCLI is running:

Press M to open the Models panel
Navigate to the LLM section
Select a model with arrow keys
Press Enter to activate

The model switches immediately without restarting the TUI.

Context preserved: Conversation history remains intact. System prompt KV cache is regenerated for the new model.

Via CLI

Guided Upgrade

rcli upgrade-llm

Interactive menu showing:

All available LLMs
Size and speed comparisons
Download progress (if not installed)
Automatic activation after download

Manual Model Download

Models are stored in ~/Library/RCLI/models/. RCLI auto-detects any GGUF files in this directory. To manually add a model:

cd ~/Library/RCLI/models/
wget https://huggingface.co/Qwen/Qwen3.5-4B-GGUF/resolve/main/Qwen3.5-4B-Q4_K_M.gguf

RCLI will auto-detect the model on next launch.
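A failed or redirected download can leave an HTML error page saved under a `.gguf` name. The sketch below checks for the 4-byte `GGUF` magic that GGUF files start with. `is_gguf` is a hypothetical helper for illustration, not an RCLI command, and it checks only the header, not the integrity of the whole file.

```shell
# Hypothetical helper (not part of RCLI): succeed only if FILE exists and
# begins with the ASCII bytes "GGUF", the magic number of the GGUF format.
is_gguf() {
  local file="$1"
  [ -f "$file" ] || return 1
  [ "$(head -c 4 "$file")" = "GGUF" ]
}

# Example usage:
#   is_gguf ~/Library/RCLI/models/Qwen3.5-4B-Q4_K_M.gguf && echo "header looks like GGUF"
```

A file that fails this check is worth re-downloading before launching RCLI.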

Direct Config Edit

# ~/Library/RCLI/config
model=qwen3.5-4b

Changes take effect on the next rcli launch. To switch without restarting, use the M panel in the TUI.
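To script a config edit rather than open an editor, something like the following could work. It is a minimal sketch that assumes the config is a flat list of `key=value` lines, as in the examples on this page. `set_config_key` is a hypothetical helper, not an RCLI command.

```shell
# Hypothetical helper (not part of RCLI): set KEY=VALUE in a flat key=value
# file. An existing line for KEY is replaced in place; otherwise the pair is
# appended. Comment lines and other keys are copied through unchanged.
set_config_key() {
  local file="$1" key="$2" value="$3" tmp
  [ -f "$file" ] || : > "$file"          # create an empty config if missing
  tmp="$(mktemp)" || return 1
  awk -F= -v k="$key" -v v="$value" '
    $1 == k { print k "=" v; found = 1; next }
    { print }
    END { if (!found) print k "=" v }
  ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example usage:
#   set_config_key ~/Library/RCLI/config model qwen3.5-4b
```

Writing to a temporary file and moving it into place avoids leaving a half-written config if the command is interrupted.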

Switching STT Models

Two STT Categories

Streaming STT (Zipformer) — Always active, cannot be changed
Offline STT (Whisper/Parakeet) — User-switchable

Upgrade to Parakeet

rcli upgrade-stt  # Downloads Parakeet TDT 0.6B (~640 MB)

After download, Parakeet becomes active immediately (highest priority).

Switch Between Offline Models

Via TUI

Press M → Navigate to STT section
Select whisper-base or parakeet-tdt
Press Enter to activate

Via Config

# ~/Library/RCLI/config
stt_model=whisper-base    # or parakeet-tdt

Check Active STT

rcli info  # Shows active streaming + offline STT

Example output:

STT (Streaming): Zipformer
STT (Offline): Parakeet TDT 0.6B v3
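If a script needs the active model name, the `Label: value` lines can be filtered out of the `rcli info` text. The sketch below assumes the output matches the example above, with one unindented `Label: value` pair per line, which may not hold in every version. `info_field` is a hypothetical helper.

```shell
# Hypothetical helper: read `rcli info`-style text on stdin and print the
# value of the line that starts with "LABEL:". Assumes one "Label: value"
# pair per line, as in the example output above.
info_field() {
  awk -v label="$1" '
    index($0, label ":") == 1 { sub(/^[^:]*:[ \t]*/, ""); print }
  '
}

# Example usage:
#   rcli info | info_field "STT (Offline)"
#   rcli info | info_field "TTS"
```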

Switching TTS Voices

Interactive Voice Browser

rcli voices

Menu shows:

All available voices
Quality ratings
Speaker counts
Download size
Installation status

Select a voice and press Enter to download/activate.

Via TUI

Press M → Navigate to TTS section
Select a voice (e.g., kokoro-en)
Press Enter to activate

Voice changes take effect on the next TTS synthesis.

Multi-Speaker Voice Selection

For voices with multiple speakers (KittenTTS, Kokoro):

# ~/Library/RCLI/config
tts_model=kokoro-en
tts_speaker=5        # Speaker IDs: 0-10 for Kokoro English

Speaker changes apply to next synthesis.

Check Active Voice

rcli info  # Shows active TTS voice and speaker ID

Example output:

TTS: Kokoro English v0.19 (speaker 5)

Model Selection Priority

RCLI uses this order to determine the active model:

1. User Preference (Config)

If a model is specified in ~/Library/RCLI/config, it takes priority:

model=qwen3.5-4b
tts_model=kokoro-en
stt_model=parakeet-tdt

2. Auto-Detect (Highest Priority)

If the config file is missing, or the key for a model type is unset or empty, RCLI selects the highest-ranked installed model of that type:

Model Type	Priority Ranking (highest first)
LLM	Qwen3.5 4B > Qwen3 4B > LFM2 2.6B > Qwen3.5 2B > LFM2.5 1.2B > …
STT (Offline)	Parakeet TDT > Whisper base.en
TTS	Kokoro Multi > Kokoro EN > KittenTTS > Matcha > Piper Amy > Piper Lessac

3. Fallback to Default

If no models are installed, RCLI prompts you to run:

rcli setup  # Download default models (~1 GB)
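The three steps above can be sketched as a small function. This is an illustration of the described order, not RCLI's actual code. The model IDs are placeholders, and whether RCLI verifies that a configured model is installed before using it is not stated here, so the sketch does not check.

```shell
# Sketch of the selection order for one model type.
#   $1  configured ID from the config file (may be empty)
#   $2  space-separated ranking, highest priority first
#   $3  space-separated list of installed IDs
# Prints the chosen ID, or returns 1 when nothing is installed.
select_model() {
  local configured="$1" ranking="$2" installed="$3" id
  # 1. A non-empty user preference wins.
  if [ -n "$configured" ]; then
    printf '%s\n' "$configured"
    return 0
  fi
  # 2. Otherwise take the first ranked model that is installed.
  for id in $ranking; do
    case " $installed " in
      *" $id "*) printf '%s\n' "$id"; return 0 ;;
    esac
  done
  # 3. Nothing usable: the caller would suggest `rcli setup`.
  return 1
}

# Example usage:
#   select_model "" "parakeet-tdt whisper-base" "whisper-base"   # prints whisper-base
```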

Persistence Across Sessions

Config File Format

Stored in ~/Library/RCLI/config:

model=qwen3.5-4b
tts_model=kokoro-en
tts_speaker=5
stt_model=parakeet-tdt

Each key-value pair persists across launches.
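For scripts that need to read a saved selection, a minimal sketch follows. It assumes the flat `key=value` format shown above. Stripping trailing `#` comments and letting the last occurrence of a key win are assumptions of this sketch, not documented RCLI behavior. `get_config_key` is a hypothetical helper.

```shell
# Hypothetical helper: print the value stored for KEY in a flat key=value
# file. Trailing "# ..." comments and trailing whitespace are removed.
# Prints nothing when the key is missing or its value is empty.
get_config_key() {
  local file="$1" key="$2"
  [ -f "$file" ] || return 0
  sed -n "s/^${key}=//p" "$file" \
    | sed 's/[[:space:]]*#.*$//; s/[[:space:]]*$//' \
    | tail -n 1
}

# Example usage:
#   get_config_key ~/Library/RCLI/config tts_model
```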

Clearing Selection

To reset to auto-detect:

# In ~/Library/RCLI/config, delete the line for a key or leave its value empty
model=           # Empty value = auto-detect

Or delete the entire config file:

rm ~/Library/RCLI/config

RCLI will regenerate defaults on next launch.

Model Download Management

Check Installed Models

rcli info  # Lists all installed models with sizes

Example output:

Installed LLMs:
  - Liquid LFM2 1.2B Tool (731 MB) [ACTIVE]
  - Qwen3.5 4B (2.7 GB)
  
Installed TTS:
  - Piper Lessac (60 MB)
  - Kokoro English v0.19 (310 MB) [ACTIVE]

Download New Models

Via Interactive CLI

rcli models       # Browse all models, download with Enter
rcli upgrade-llm  # LLM-specific guided upgrade
rcli upgrade-stt  # Download Parakeet TDT
rcli voices       # TTS voice browser

Via Direct Download

Manually download GGUF/ONNX files to ~/Library/RCLI/models/:

cd ~/Library/RCLI/models/

# Download LLM (GGUF format)
wget https://huggingface.co/.../model.gguf

# Download TTS (tar.bz2 archive)
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2
tar -xf kokoro-en-v0_19.tar.bz2

RCLI auto-detects models on next launch.

Remove Unused Models

rcli cleanup  # Interactive cleanup panel

Shows:

All installed models
Disk space used
Active vs. inactive models
Safe deletion with confirmation

Or manually delete:

rm ~/Library/RCLI/models/qwen3-4b-q4_k_m.gguf
rm -rf ~/Library/RCLI/models/kokoro-multi-lang-v1_1/

Model Auto-Detection

RCLI automatically detects model families from filenames:

LLM Detection

Filename Pattern	Detected Family	Tool Format
`lfm2-*.gguf`	Liquid LFM2	`<|tool_call_start|>[func(arg="val")]`
`qwen3-*.gguf`	Qwen3	`<tool_call>{"name": "...", "arguments": {...}}</tool_call>`
`qwen3.5-*.gguf`	Qwen3	Same as Qwen3
`llama-3*.gguf`	Llama3	`<|python_tag|>`
`gemma*.gguf`	Gemma	```tool_call fenced block
`mistral*.gguf`	Mistral	`[TOOL_CALLS]`

Fallback: ChatML format if family is unknown.
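The filename patterns in the table map naturally onto a shell `case` statement. The sketch below illustrates the matching idea and is not RCLI's implementation. Lowercasing the filename first is an assumption, made so that a download named `Qwen3.5-4B-Q4_K_M.gguf` still matches `qwen3.5-*.gguf`.

```shell
# Illustrative only: map a GGUF filename to the family names in the table.
# Prints "Unknown" when no pattern matches (the ChatML fallback case).
detect_llm_family() {
  local name
  # Assumption: matching is case-insensitive, so lowercase the basename.
  name="$(basename "$1" | tr '[:upper:]' '[:lower:]')"
  case "$name" in
    lfm2-*.gguf)                   echo "Liquid LFM2" ;;
    qwen3-*.gguf|qwen3.5-*.gguf)   echo "Qwen3" ;;
    llama-3*.gguf)                 echo "Llama3" ;;
    gemma*.gguf)                   echo "Gemma" ;;
    mistral*.gguf)                 echo "Mistral" ;;
    *)                             echo "Unknown" ;;
  esac
}

# Example usage:
#   detect_llm_family ~/Library/RCLI/models/qwen3-4b-q4_k_m.gguf   # prints Qwen3
```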

STT/TTS Detection

Models are detected by directory structure in ~/Library/RCLI/models/:

models/
├── zipformer/               # Auto-detected as Zipformer STT
├── whisper-base.en/         # Auto-detected as Whisper STT
├── parakeet-tdt/            # Auto-detected as Parakeet STT
├── piper-voice/             # Auto-detected as Piper TTS
└── kokoro-en-v0_19/         # Auto-detected as Kokoro TTS
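To preview how a models directory might be classified, a scan along these lines could help. The prefix rules are inferred from the directory names in the tree above and are assumptions. RCLI may inspect directory contents rather than names. `scan_model_dirs` is a hypothetical helper.

```shell
# Hypothetical helper: print "<directory><TAB><kind>" for each subdirectory
# of ROOT. Kinds are guessed from name prefixes taken from the tree above.
scan_model_dirs() {
  local root="$1" dir name kind
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue            # no subdirectories at all
    name="$(basename "$dir")"
    case "$name" in
      zipformer*)          kind="streaming-stt" ;;
      whisper*|parakeet*)  kind="offline-stt" ;;
      piper*|kokoro*)      kind="tts" ;;
      *)                   kind="unknown" ;;
    esac
    printf '%s\t%s\n' "$name" "$kind"
  done
}

# Example usage:
#   scan_model_dirs ~/Library/RCLI/models
```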

Performance Impact

Model Swap Latency

Operation	Latency	Impact
LLM swap	~500-1500ms	Loads new model, regenerates system KV cache
STT swap	~200-400ms	Reloads ONNX model
TTS swap	~100-200ms	Reloads ONNX model

No restart required. All swaps happen in the background while the TUI remains interactive.

Memory Usage

Only one model per type is loaded in memory at a time:

Model Type	Active Memory	Example
LLM	~1-4 GB	Qwen3.5 4B: ~3.2 GB
STT (Streaming)	~120 MB	Zipformer
STT (Offline)	~300 MB	Parakeet TDT
TTS	~180 MB	Kokoro English
VAD	~8 MB	Silero VAD
Embeddings	~50 MB	Snowflake Arctic

Total RAM usage: ~2-5 GB depending on LLM size.
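The total follows from adding the fixed components in the table (120 + 300 + 180 + 8 + 50 = 658 MB) to the LLM's footprint. As a worked example, the sketch below does that sum, taking 1 GB as 1000 MB for simplicity. The figures are the approximate ones from the table, not measurements.

```shell
# Sum the approximate per-component figures from the table above, in MB.
# The LLM figure is an argument because it varies by model.
total_memory_mb() {
  local llm_mb="$1"
  local stt_streaming=120 stt_offline=300 tts=180 vad=8 embeddings=50
  echo $(( llm_mb + stt_streaming + stt_offline + tts + vad + embeddings ))
}

# Example usage:
#   total_memory_mb 3200   # Qwen3.5 4B at ~3.2 GB: prints 3858 (about 3.9 GB)
#   total_memory_mb 1000   # a ~1 GB LLM: prints 1658 (about 1.7 GB)
```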

CLI Reference

Model Management

rcli models              # Interactive model browser (all types)
rcli models llm          # Jump to LLM section
rcli models stt          # Jump to STT section
rcli models tts          # Jump to TTS section

Upgrade Commands

rcli upgrade-llm         # Guided LLM upgrade with size/speed comparison
rcli upgrade-stt         # Upgrade to Parakeet TDT (~640 MB download)
rcli voices              # TTS voice browser and switcher

Model Info

rcli info                # Show active models and engine info
rcli cleanup             # Remove unused models to free disk space

Config Location

cat ~/Library/RCLI/config            # View current config
open ~/Library/RCLI/                 # Open models directory in Finder

Best Practices

1. Start with Defaults

rcli setup  # Download default model set (~1 GB)

Test the pipeline before upgrading.

2. Upgrade Incrementally

Upgrade one model type at a time:

rcli upgrade-llm    # Try Qwen3.5 4B first
rcli upgrade-stt    # Then Parakeet TDT
rcli voices         # Finally Kokoro English

3. Benchmark After Swaps

rcli bench --suite llm      # Benchmark new LLM
rcli bench --suite tools    # Test tool-calling accuracy
rcli bench --suite e2e      # Measure end-to-end latency

4. Monitor Disk Space

du -sh ~/Library/RCLI/models  # Check total model size
rcli cleanup                  # Remove unused models

5. Keep Config Clean

Periodically review ~/Library/RCLI/config and remove stale entries.

Troubleshooting

Model Not Switching

Symptom: Selected model doesn’t become active.

Solution:

Check model is installed: ls ~/Library/RCLI/models/
Verify config: cat ~/Library/RCLI/config
Check file permissions: ls -la ~/Library/RCLI/models/
Re-download model: rcli models

Model Download Fails

Symptom: Download stalls or errors.

Solution:

Check internet connection
Verify HuggingFace/GitHub access
Manually download and extract to ~/Library/RCLI/models/
Check available disk space: df -h

Wrong Model Detected

Symptom: RCLI uses a different model than expected.

Solution:

Check config: cat ~/Library/RCLI/config
Verify model ID matches filename
Use exact model ID in config (e.g., model=qwen3.5-4b)
Restart RCLI: rcli
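One way to check that a model ID matches a filename is to look for GGUF files whose lowercased name begins with the ID. That prefix rule is an assumption made for illustration. This page does not specify how RCLI maps IDs to files. `find_model_files` is a hypothetical helper.

```shell
# Hypothetical check: print every *.gguf file in DIR whose lowercased name
# starts with the model ID. Assumes IDs are lowercase filename prefixes.
find_model_files() {
  local dir="$1" id="$2" f lower
  for f in "$dir"/*.gguf; do
    [ -f "$f" ] || continue              # the glob matched nothing
    lower="$(basename "$f" | tr '[:upper:]' '[:lower:]')"
    case "$lower" in
      "$id"*) printf '%s\n' "$f" ;;
    esac
  done
}

# Example usage:
#   find_model_files ~/Library/RCLI/models qwen3.5-4b
```

No output means no file matched under this rule, which suggests either a typo in the ID or a missing download.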

Memory Issues After Swap

Symptom: High RAM usage or slowness after model swap.

Solution:

Restart RCLI to clear all caches
Use smaller model (e.g., Qwen3.5 2B instead of 4B)
Check Activity Monitor for memory leaks
Reduce context size: --ctx-size 2048

Next Steps

LLM Models

Compare all 9 LLM models with specs

STT Models

Explore Zipformer, Whisper, and Parakeet

TTS Models

Browse 6 TTS voices with quality ratings

Benchmarks

Measure performance on your Mac
