Troubleshooting

Common Issues

Microphone Permissions

Symptom: No audio is captured; STT returns empty transcripts.

Cause: macOS requires explicit microphone permission for terminal apps.

Fix:

Open System Settings → Privacy & Security → Microphone
Enable microphone access for:
- Terminal.app (if running from Terminal)
- iTerm.app (if using iTerm2)
- Your terminal emulator

Restart terminal and try again:

rcli mic-test  # Test microphone capture

If you installed via Homebrew on Apple Silicon, the rcli binary is in /opt/homebrew/bin/. macOS prompts for permission on first microphone access.

Out of Memory (OOM)

Symptom: rcli crashes with “Killed: 9” or “malloc failed”.

Cause: The model is too large for available RAM, or too many layers are offloaded to the GPU.

Fix:

Reduce GPU layers:

rcli --gpu-layers 20  # Use hybrid CPU/GPU

Use smaller model:

rcli models llm       # Switch to Qwen3 0.6B or LFM2 350M

Reduce context size:

rcli --ctx-size 2048  # Smaller context = less KV cache memory

Disable mlock:
```
export RCLI_LLM_USE_MLOCK=0
rcli
```

Slow Inference

Symptom: The LLM generates fewer than 50 tok/s, with high latency.

Cause: CPU-only inference, or too few GPU layers.

Fix:

Check GPU layers:
```
rcli info  # Look for "GPU layers: 99"
```
Enable Metal GPU:
```
export RCLI_LLM_GPU_LAYERS=99
rcli
```

Reduce thread count (if GPU-enabled):

export RCLI_LLM_THREADS=1  # GPU-bound: 1 thread is optimal

Enable Flash Attention:
```
export RCLI_LLM_FLASH_ATTN=1
```

STT Transcription Errors

Symptom: STT returns gibberish or empty text.

Possible Causes:

Background noise: VAD filters out speech
```
rcli mic-test  # Check audio levels
```

Silence threshold too low:

export RCLI_STT_SILENCE_MS=1200  # Increase from 800ms

Wrong sample rate:

export RCLI_STT_SAMPLE_RATE=16000  # Must be 16kHz

Model mismatch: Try switching STT models:

rcli upgrade-stt  # Install Parakeet TDT (higher accuracy)

TTS Audio Glitches

Symptom: Choppy playback, crackling, or silence.

Possible Causes:

Ring buffer underrun: Playback faster than synthesis

export RCLI_AUDIO_PLAYBACK_BUFFER=88064  # Double buffer size

CPU throttling: Check Activity Monitor
- Quit other apps to free CPU/GPU

Sample rate mismatch:

export RCLI_TTS_SAMPLE_RATE=22050  # Piper default

Voice model corruption:

rcli cleanup  # Remove corrupted models
rcli voices   # Re-download voice
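
The buffer sizes in this section are sample counts, so the amount of audio a buffer holds depends on the sample rate. The following is a rough sketch of that arithmetic, assuming mono audio and that the playback buffer runs at the TTS sample rate (neither is confirmed by this page). `buffer_ms` is a hypothetical helper, not an rcli command.

```shell
# Hypothetical helper: milliseconds of audio held by a buffer of N samples.
# Assumes a single channel; integer arithmetic, so the result rounds down.
buffer_ms() {
  samples="$1"
  rate_hz="$2"
  echo $(( samples * 1000 / rate_hz ))
}

buffer_ms 44032 22050   # half of 88064, at the Piper rate: prints 1996
buffer_ms 88064 22050   # the doubled buffer: prints 3993
```

Under those assumptions, doubling the buffer gives synthesis roughly two extra seconds of headroom before playback runs dry.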

Tool Calling Failures

Symptom: The LLM doesn’t execute actions, or calls the wrong tool.

Possible Causes:

Model doesn’t support tool calling: Use LFM2 1.2B Tool or Qwen3.5 2B+
```
rcli models llm  # Switch to tool-capable model
```

Tool definitions not loaded:

rcli -v  # Check logs for "Tool definitions: 43 actions"

Action disabled:
```
rcli actions enable open_app
```

Parse errors: Enable tool trace to debug:

# In TUI: Press T to toggle tool trace
# Shows: Tool call → Arguments → Execution result

RAG Retrieval Errors

Symptom: rcli ask --rag returns “Index not found”.

Fix:

Build index first:
```
rcli rag ingest ~/Documents/notes
```

Check index path:

ls ~/Library/RCLI/index/
# Should contain: chunks.bin, metadata.json, usearch.index
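
The listing check above can be scripted so that a missing file is reported by name. This is a minimal sketch, assuming the three file names in the comment above are the complete set. `check_rag_index` is a hypothetical helper, not an rcli command.

```shell
# Hypothetical helper (not part of rcli): report which expected index files
# are missing from a directory. File names come from the listing above.
check_rag_index() {
  dir="${1:-$HOME/Library/RCLI/index}"
  missing=0
  for f in chunks.bin metadata.json usearch.index; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  # Exit status 0 when all files are present, 1 otherwise.
  return "$missing"
}
```

Called with no argument, it checks the default index directory; a non-zero exit status suggests the index should be rebuilt with rcli rag ingest.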

Specify index explicitly:

rcli ask --rag ~/Library/RCLI/index "What is the project deadline?"

Model Download Failures

Symptom: rcli setup or rcli models fails to download models.

Possible Causes:

Network timeout:

export RCLI_DOWNLOAD_TIMEOUT=600  # 10 min timeout
rcli setup

Disk space:

df -h ~/Library/RCLI/models/  # Check free space
rcli cleanup                   # Remove unused models

Hugging Face rate limit: Wait 5-10 min and retry

Corrupted download:

rm -f ~/Library/RCLI/models/llm/qwen3-0.6b-q4_k_m.gguf  # Delete the partial file
rcli models llm  # Re-download
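
The disk-space check above can be turned into a guard that runs before a download. This is a sketch only: `free_mb` and `require_free_mb` are hypothetical helpers, and the 2048 MB default is an illustrative assumption, not a documented rcli requirement.

```shell
# Hypothetical helper: print free space, in MB, on the volume holding a path.
# df -P forces single-line POSIX output; -k reports 1024-byte blocks.
free_mb() {
  df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 }'
}

# Fail with a message when less than the wanted amount is free.
# The 2048 MB default is an assumption chosen for illustration.
require_free_mb() {
  path="$1"
  need="${2:-2048}"
  have="$(free_mb "$path")"
  if [ "$have" -lt "$need" ]; then
    echo "only ${have} MB free at $path (want ${need} MB)" >&2
    return 1
  fi
}
```

For example, `require_free_mb ~/Library/RCLI/models/ && rcli setup` would skip the download when space is short. The path must already exist, otherwise df reports an error.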

Debugging

Enable Verbose Logging

rcli -v
# or
rcli --verbose

Outputs:

[HW] platform=macos cpu=14(p=10 e=4) ram=36864MB gpu_layers=99
[Pool] Allocated 128MB memory pool
[STT] "open Safari" (final, 1.2s audio, 43.7ms)
[LLM] first_token=22.5ms total=156.3ms 159.6 tok/s
[TTS] Synthesizing: "Opening Safari now."
[Pipeline] E2E latency: 131ms
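
When comparing runs, it can help to pull just the throughput figure out of the verbose output. The sketch below assumes the `[LLM]` line layout shown above stays stable. `llm_tps` is a hypothetical helper, not an rcli command.

```shell
# Hypothetical helper: read verbose log lines on stdin and print the tok/s
# figure from each [LLM] line. Assumes the line layout shown above.
llm_tps() {
  awk '/^\[LLM\]/ {
    for (i = 2; i <= NF; i++)
      if ($i == "tok/s") print $(i - 1)
  }'
}

# Example using the sample line from above; prints 159.6
printf '%s\n' '[LLM] first_token=22.5ms total=156.3ms 159.6 tok/s' | llm_tps
```

Values below the 50 tok/s mark mentioned under Slow Inference are a hint to check GPU layers.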

Debug Logging

export RCLI_LOG_LEVEL=DEBUG
rcli

Outputs detailed logs:

Memory pool allocations
Ring buffer read/write operations
KV cache hits/misses
Tool call parsing steps

Redirect Logs to File

rcli 2> rcli.log  # Send stderr (the log stream) to a file
tail -f rcli.log  # Watch logs live (run in a second terminal)

Tool Call Trace

In TUI, press T to toggle tool call trace. Shows:

> open Safari
  ~ [TRACE] Tool call: open_app({"app_name": "Safari"})
  ~ [TRACE] open_app -> OK: {"success": true, "output": "Opened Safari"}
  RCLI: Done! Safari is now open.

Or via CLI:

rcli bench --suite tools  # Test tool calling accuracy

Inspect Memory Pool

rcli info

Outputs:

Memory pool: 128 MB allocated
Used: 58.2 MB (45.5%)
High-water mark: 59.1 MB
Ring buffers:
  - Capture:  16384 samples (65 KB)
  - Playback: 44032 samples (172 KB)
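
The usage percentage can be recomputed from the two size lines, which is handy when tracking pool usage over time. A minimal sketch, assuming the "Memory pool:" and "Used:" lines keep the layout shown above; `pool_usage_pct` is a hypothetical helper, not an rcli command.

```shell
# Hypothetical helper: compute pool usage (%) from `rcli info`-style text on
# stdin. Assumes the "Memory pool: N MB" and "Used: N MB" lines shown above.
pool_usage_pct() {
  awk '
    /^Memory pool:/ { total = $3 }
    /^Used:/        { used = $2 }
    END { if (total > 0) printf "%.1f\n", used / total * 100 }
  '
}

# Example using the sample output from above; prints 45.5
printf '%s\n' 'Memory pool: 128 MB allocated' 'Used: 58.2 MB (45.5%)' | pool_usage_pct
```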

Profile Performance

rcli bench --suite all --output profile.json
jq '.llm' profile.json  # Show only the LLM section

Outputs:

{
  "model": "qwen3-0.6b",
  "first_token_ms": 22.5,
  "throughput_tps": 159.6,
  "kv_cache_hit_rate": 0.98,
  "memory_mb": 584
}

Error Messages

MemoryPool: out of memory

Cause: The pre-allocated memory pool is exhausted.

Fix:

export RCLI_POOL_SIZE_MB=256  # Increase pool size
rcli

Or reduce audio buffer sizes:

export RCLI_AUDIO_CAPTURE_BUFFER=8192
export RCLI_AUDIO_PLAYBACK_BUFFER=22016

Failed to init LLM model

Possible Causes:

Corrupted model file:

rm ~/Library/RCLI/models/llm/*.gguf
rcli models llm  # Re-download

Incompatible GGUF version:

# Update llama.cpp (rebuild from source)
cd deps/llama.cpp && git pull && cd ../..
rm -rf build && mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
cmake --build . -j$(sysctl -n hw.ncpu)

GPU layers too high:

rcli --gpu-layers 0  # CPU-only fallback

Audio init failed

Cause: CoreAudio device not found, or permission denied.

Fix:

Check microphone permissions (see above)
Test microphone:
```
rcli mic-test
```

Check audio device:

# macOS System Settings → Sound → Input
# Ensure correct device selected

Restart CoreAudio:
```
sudo killall coreaudiod
```

VAD init failed (will process all audio)

Impact: Non-fatal warning. The pipeline continues without VAD.

Effect: All audio (including silence) is sent to STT, which may produce phantom transcripts.

Fix:

# Re-download VAD model
rm ~/Library/RCLI/models/vad/silero_vad.onnx
rcli setup

Offline STT init failed (will use streaming STT)

Impact: Non-fatal warning. Falls back to Zipformer.

Effect: Lower STT accuracy (Zipformer ~8% WER vs Whisper ~5% WER).

Fix:

# Install Whisper or Parakeet
rcli models stt  # Select whisper-base.en or parakeet-tdt

KV cache full, clearing...

Cause: Context window exceeded; the conversation history is too long.

Effect: The KV cache is cleared, so the next response is slower (no cache reuse).

Fix:

# Increase context size
rcli --ctx-size 8192

# Or reduce conversation history turns
export RCLI_CONVERSATION_HISTORY_TURNS=5

Platform-Specific Issues

macOS Ventura (13.0+)

Issue: “Operation not permitted” errors.

Fix: Grant Full Disk Access to your terminal app:

System Settings → Privacy & Security → Full Disk Access
Add Terminal.app or iTerm.app
Restart terminal

macOS Sonoma (14.0+)

Issue: Metal shader compilation warnings.

Fix: Update to the latest macOS patch release, which rebuilds the Metal shader cache.

Apple Silicon Rosetta

Issue: An x86_64 build is running under Rosetta on an ARM64 machine.

Fix: Install the native ARM64 build:

arch -arm64 brew install rcli

Verify:

file $(which rcli)  # Should show "arm64"

Performance Debugging

High CPU Usage

Symptom: CPU at 100%, fan spinning.

Possible Causes:

CPU-only inference: Enable GPU layers
```
rcli --gpu-layers 99
```

Too many threads:

export RCLI_LLM_THREADS=1
export RCLI_LLM_THREADS_BATCH=4

Large batch size:

export RCLI_LLM_BATCH=1024  # Reduce from 2048

High Memory Usage

Symptom: Memory pressure, swap usage.

Possible Causes:

Large KV cache:
```
rcli --ctx-size 2048  # Reduce context
```

mlock enabled:

export RCLI_LLM_USE_MLOCK=0  # Disable weight pinning

Multiple models loaded:
```
rcli cleanup  # Remove unused models
```

Large memory pool:

export RCLI_POOL_SIZE_MB=64  # Reduce from 128

GPU Not Utilized

Symptom: GPU idle, slow inference.

Check:

rcli info  # Look for "GPU layers: 0" (bad) vs "GPU layers: 99" (good)

Fix:

export RCLI_LLM_GPU_LAYERS=99
rcli

Ring Buffer Overruns

Symptom: Choppy audio, dropped samples.

Check logs:

[WARN] Ring buffer overrun: 512 samples dropped

Fix:

# Increase buffer sizes
export RCLI_AUDIO_CAPTURE_BUFFER=32768
export RCLI_AUDIO_PLAYBACK_BUFFER=88064
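
To judge whether larger buffers actually helped, the overrun warnings in a saved log can be totalled before and after the change. This sketch assumes the warning format shown above. `dropped_samples` is a hypothetical helper, not an rcli command.

```shell
# Hypothetical helper: sum the "N samples dropped" counts from ring buffer
# overrun warnings read on stdin. Assumes the warning format shown above.
dropped_samples() {
  awk '/Ring buffer overrun:/ {
    for (i = 1; i <= NF; i++)
      if ($i == "overrun:") total += $(i + 1)
  } END { print total + 0 }'
}

# Example with two warnings; prints 768
printf '%s\n' \
  '[WARN] Ring buffer overrun: 512 samples dropped' \
  '[WARN] Ring buffer overrun: 256 samples dropped' | dropped_samples
```

With logs redirected to a file as described under Redirect Logs to File, `dropped_samples < rcli.log` gives the total for a session.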

Crash Reports

Generate Crash Report

If rcli crashes, macOS generates a crash report:

open ~/Library/Logs/DiagnosticReports/
# Look for rcli_*.crash or rcli_*.ips

Useful Info for Bug Reports

rcli info > system_info.txt
rcli bench --suite all --output benchmark.json

# Attach system_info.txt and benchmark.json to GitHub issue

Include:

macOS version: sw_vers
RCLI version: rcli --version
Hardware: sysctl hw.model
Crash logs (if applicable)
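
The items above can be gathered into a single file in one step. The following is a sketch under the assumption that each command prints to the terminal and exits; `collect_bug_report` and `report_section` are hypothetical helpers, not rcli commands, and the output file name is arbitrary.

```shell
# Hypothetical helper: run one command under a heading, or note its absence.
report_section() {
  echo "== $* =="
  if command -v "$1" >/dev/null 2>&1; then
    "$@" 2>&1 || echo "(exit status $?)"
  else
    echo "($1 not found)"
  fi
  echo
}

# Hypothetical helper: collect the items listed above into one text file.
collect_bug_report() {
  out="${1:-rcli_bug_report.txt}"
  {
    report_section sw_vers
    report_section rcli --version
    report_section sysctl hw.model
    report_section rcli info
  } > "$out"
  echo "wrote $out"
}
```

Running `collect_bug_report` with no argument writes rcli_bug_report.txt in the current directory; crash logs still need to be attached separately.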

Reset to Defaults

Clear All Configuration

rm -rf ~/Library/RCLI/config/
rcli setup  # Re-initialize

Remove All Models

rcli cleanup  # Interactive cleanup
# or
rm -rf ~/Library/RCLI/models/
rcli setup  # Re-download defaults

Clear RAG Index

rm -rf ~/Library/RCLI/index/

Full Reset

rm -rf ~/Library/RCLI/
rcli setup  # Start fresh

Getting Help

Built-in Help

rcli --help              # CLI reference
rcli actions --help      # Action system help
rcli models --help       # Model management help
rcli bench --help        # Benchmark help

Community Support

GitHub Issues

Report bugs or request features

Contributing Guide

Build from source, architecture docs

Diagnostic Commands

# System info
rcli info

# Microphone test
rcli mic-test

# Full benchmark
rcli bench --suite all --output diagnostic.json

# Tool calling test
rcli bench --suite tools

# RAG test
rcli rag status

Next Steps

Architecture

Performance
