Common Issues

Microphone Permissions

Symptom: No audio captured; STT returns empty transcripts.
Cause: macOS requires explicit microphone permission for terminal apps.
Fix:
  1. Open System Settings → Privacy & Security → Microphone
  2. Enable microphone access for:
    • Terminal.app (if running from Terminal)
    • iTerm.app (if using iTerm2)
    • Any other terminal emulator you run rcli from
  3. Restart terminal and try again:
    rcli mic-test  # Test microphone capture
    
If you installed via Homebrew, the rcli binary is in /opt/homebrew/bin/. macOS prompts for permissions on first microphone access.

Out of Memory (OOM)

Symptom: rcli crashes with “Killed: 9” or “malloc failed”.
Cause: The model is too large for available RAM, or too many layers are offloaded to the GPU.
Fix:
  1. Reduce GPU layers:
    rcli --gpu-layers 20  # Use hybrid CPU/GPU
    
  2. Use smaller model:
    rcli models llm       # Switch to Qwen3 0.6B or LFM2 350M
    
  3. Reduce context size:
    rcli --ctx-size 2048  # Smaller context = less KV cache memory
    
  4. Disable mlock:
    export RCLI_LLM_USE_MLOCK=0
    rcli
    

Slow Inference

Symptom: The LLM generates fewer than 50 tok/s, with high latency.
Cause: CPU-only inference, or too few layers offloaded to the GPU.
Fix:
  1. Check GPU layers:
    rcli info  # Look for "GPU layers: 99"
    
  2. Enable Metal GPU:
    export RCLI_LLM_GPU_LAYERS=99
    rcli
    
  3. Reduce thread count (if GPU-enabled):
    export RCLI_LLM_THREADS=1  # GPU-bound: 1 thread is optimal
    
  4. Enable Flash Attention:
    export RCLI_LLM_FLASH_ATTN=1
    

STT Transcription Errors

Symptom: STT returns gibberish or empty text.
Possible Causes:
  1. Background noise: VAD filters out speech
    rcli mic-test  # Check audio levels
    
  2. Silence threshold too low:
    export RCLI_STT_SILENCE_MS=1200  # Increase from 800ms
    
  3. Wrong sample rate:
    export RCLI_STT_SAMPLE_RATE=16000  # Must be 16kHz
    
  4. Model mismatch: Try switching STT models:
    rcli upgrade-stt  # Install Parakeet TDT (higher accuracy)
    

TTS Audio Glitches

Symptom: Choppy playback, crackling, or silence.
Possible Causes:
  1. Ring buffer underrun: Playback consumes samples faster than synthesis produces them
    export RCLI_AUDIO_PLAYBACK_BUFFER=88064  # Double buffer size
    
  2. CPU throttling: Check Activity Monitor
    • Quit other apps to free CPU/GPU
  3. Sample rate mismatch:
    export RCLI_TTS_SAMPLE_RATE=22050  # Piper default
    
  4. Voice model corruption:
    rcli cleanup  # Remove corrupted models
    rcli voices   # Re-download voice
    

Tool Calling Failures

Symptom: The LLM doesn’t execute actions, or calls the wrong tool.
Possible Causes:
  1. Model doesn’t support tool calling: Use LFM2 1.2B Tool or Qwen3.5 2B+
    rcli models llm  # Switch to tool-capable model
    
  2. Tool definitions not loaded:
    rcli -v  # Check logs for "Tool definitions: 43 actions"
    
  3. Action disabled:
    rcli actions enable open_app
    
  4. Parse errors: Enable tool trace to debug:
    # In TUI: Press T to toggle tool trace
    # Shows: Tool call → Arguments → Execution result
    

RAG Retrieval Errors

Symptom: rcli ask --rag returns “Index not found”.
Fix:
  1. Build index first:
    rcli rag ingest ~/Documents/notes
    
  2. Check index path:
    ls ~/Library/RCLI/index/
    # Should contain: chunks.bin, metadata.json, usearch.index
    
  3. Specify index explicitly:
    rcli ask --rag ~/Library/RCLI/index "What is the project deadline?"
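
The directory check in step 2 can be scripted. This is a minimal sketch, assuming the index consists of exactly the three files listed above and lives at the path this guide uses; check_rag_index is a hypothetical helper name, not an rcli command.

```shell
# Hypothetical helper (not part of rcli): report missing RAG index files.
# $1 is the index directory; it defaults to the path used in this guide.
check_rag_index() {
    rag_dir="${1:-$HOME/Library/RCLI/index}"
    rag_missing=0
    for rag_file in chunks.bin metadata.json usearch.index; do
        # -s is true only for files that exist and are non-empty
        if [ ! -s "$rag_dir/$rag_file" ]; then
            echo "missing or empty: $rag_dir/$rag_file"
            rag_missing=1
        fi
    done
    return "$rag_missing"
}

# Usage: rebuild the index only when the check fails.
#   check_rag_index || rcli rag ingest ~/Documents/notes
```

The function prints nothing and returns 0 when all three files are present, so it can gate a rebuild in a script.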
    

Model Download Failures

Symptom: rcli setup or rcli models fails to download models.
Possible Causes:
  1. Network timeout:
    export RCLI_DOWNLOAD_TIMEOUT=600  # 10 min timeout
    rcli setup
    
  2. Disk space:
    df -h ~/Library/RCLI/models/  # Check free space
    rcli cleanup                   # Remove unused models
    
  3. Hugging Face rate limit: Wait 5-10 min and retry
  4. Corrupted download:
    rm -f ~/Library/RCLI/models/llm/qwen3-0.6b-q4_k_m.gguf
    rcli models llm  # Re-download
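
The disk-space check in step 2 can be turned into a guard that runs before a download. The sketch below uses only POSIX df and awk; free_mib and has_space are hypothetical helper names, and the 2048 MiB threshold in the usage comment is an arbitrary example, not a documented rcli requirement.

```shell
# Hypothetical helpers (not part of rcli) for a pre-download disk check.

# Print free space in MiB on the filesystem that holds directory $1.
# POSIX `df -Pk` reports available 1024-byte blocks in the fourth column.
free_mib() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Succeed only if at least $2 MiB are free under $1.
has_space() {
    [ "$(free_mib "$1")" -ge "$2" ]
}

# Usage (the threshold is an example value; pick one that fits your models):
#   has_space "$HOME" 2048 || rcli cleanup
```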
    

Debugging

Enable Verbose Logging

rcli -v
# or
rcli --verbose
Outputs:
[HW] platform=macos cpu=14(p=10 e=4) ram=36864MB gpu_layers=99
[Pool] Allocated 128MB memory pool
[STT] "open Safari" (final, 1.2s audio, 43.7ms)
[LLM] first_token=22.5ms total=156.3ms 159.6 tok/s
[TTS] Synthesizing: "Opening Safari now."
[Pipeline] E2E latency: 131ms

Debug Logging

export RCLI_LOG_LEVEL=DEBUG
rcli
Outputs detailed logs:
  • Memory pool allocations
  • Ring buffer read/write operations
  • KV cache hits/misses
  • Tool call parsing steps

Redirect Logs to File

rcli 2> rcli.log  # Stderr to file
tail -f rcli.log  # Watch logs live
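
Once logs are in a file, the per-turn throughput can be pulled out with awk. This sketch assumes the file was captured while running with -v and contains [LLM] lines in the format shown under Enable Verbose Logging; llm_throughput and slow_turns are hypothetical helper names.

```shell
# Hypothetical helpers (not part of rcli) for reading a captured log.

# Print the throughput figure from each "[LLM] ... N tok/s" line of file $1.
llm_throughput() {
    awk '$1 == "[LLM]" {
             for (i = 2; i <= NF; i++)
                 if ($i == "tok/s") print $(i - 1)
         }' "$1"
}

# Print only readings below a threshold ($2, default 50 tok/s, the figure
# this guide gives as the symptom of slow inference).
slow_turns() {
    llm_throughput "$1" | awk -v min="${2:-50}" '$1 + 0 < min + 0'
}

# Usage:
#   slow_turns rcli.log
```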

Tool Call Trace

In the TUI, press T to toggle the tool call trace. It shows:
> open Safari
  ~ [TRACE] Tool call: open_app({"app_name": "Safari"})
  ~ [TRACE] open_app -> OK: {"success": true, "output": "Opened Safari"}
  RCLI: Done! Safari is now open.
Or via CLI:
rcli bench --suite tools  # Test tool calling accuracy

Inspect Memory Pool

rcli info
Outputs:
Memory pool: 128 MB allocated
Used: 58.2 MB (45.5%)
High-water mark: 59.1 MB
Ring buffers:
  - Capture:  16384 samples (65 KB)
  - Playback: 44032 samples (172 KB)
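
To watch for a pool that is close to exhaustion, the usage percentage can be extracted from this output. The sketch assumes the Used line has exactly the form shown above; pool_used_percent is a hypothetical helper name, and the 90% threshold in the usage comment is an arbitrary example.

```shell
# Hypothetical helper (not part of rcli): read `rcli info` output on stdin and
# print the pool usage percentage from a line such as "Used: 58.2 MB (45.5%)".
pool_used_percent() {
    sed -n 's/^[[:space:]]*Used:.*(\([0-9.][0-9.]*\)%).*/\1/p' | head -n 1
}

# Usage: warn when usage passes 90% (an example threshold).
#   pct=$(rcli info | pool_used_percent)
#   awk -v p="${pct:-0}" 'BEGIN { exit !(p > 90) }' && echo "pool nearly full"
```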

Profile Performance

rcli bench --suite all --output profile.json
jq '.llm' profile.json
Outputs:
{
  "model": "qwen3-0.6b",
  "first_token_ms": 22.5,
  "throughput_tps": 159.6,
  "kv_cache_hit_rate": 0.98,
  "memory_mb": 584
}

Error Messages

Cause: Pre-allocated memory pool exhausted.
Fix:
export RCLI_POOL_SIZE_MB=256  # Increase pool size
rcli
Or reduce audio buffer sizes:
export RCLI_AUDIO_CAPTURE_BUFFER=8192
export RCLI_AUDIO_PLAYBACK_BUFFER=22016

Possible Causes:
  1. Corrupted model file:
    rm ~/Library/RCLI/models/llm/*.gguf
    rcli models llm  # Re-download
    
  2. Incompatible GGUF version:
    # Update llama.cpp (rebuild from source)
    cd deps/llama.cpp && git pull && cd ../..
    rm -rf build && mkdir build && cd build
    cmake .. -DCMAKE_BUILD_TYPE=Release
    cmake --build . -j$(sysctl -n hw.ncpu)
    
  3. GPU layers too high:
    rcli --gpu-layers 0  # CPU-only fallback
    

Cause: CoreAudio device not found or permission denied.
Fix:
  1. Check microphone permissions (see Microphone Permissions above)
  2. Test microphone:
    rcli mic-test
    
  3. Check audio device:
    # macOS System Settings → Sound → Input
    # Ensure correct device selected
    
  4. Restart CoreAudio:
    sudo killall coreaudiod
    

Impact: Non-fatal warning. The pipeline continues without VAD.
Effect: All audio (including silence) is sent to STT, which may produce phantom transcripts.
Fix:
# Re-download VAD model
rm ~/Library/RCLI/models/vad/silero_vad.onnx
rcli setup

Impact: Non-fatal warning. STT falls back to Zipformer.
Effect: Lower STT accuracy (Zipformer ~8% WER vs. Whisper ~5% WER).
Fix:
# Install Whisper or Parakeet
rcli models stt  # Select whisper-base.en or parakeet-tdt

Cause: Context window exceeded; the conversation history is too long.
Effect: The KV cache is cleared, so the next response is slower (no cache reuse).
Fix:
# Increase context size
rcli --ctx-size 8192

# Or reduce conversation history turns
export RCLI_CONVERSATION_HISTORY_TURNS=5

Platform-Specific Issues

macOS Ventura (13.0+)

Issue: “Operation not permitted” errors.
Fix: Grant Full Disk Access to your terminal app:
  1. System Settings → Privacy & Security → Full Disk Access
  2. Add Terminal.app or iTerm.app
  3. Restart terminal

macOS Sonoma (14.0+)

Issue: Metal shader compilation warnings.
Fix: Update to the latest macOS patch release (the update triggers a Metal shader cache rebuild).

Apple Silicon Rosetta

Issue: An x86_64 build of rcli is running under Rosetta on an ARM64 Mac.
Fix: Install the native ARM64 build:
arch -arm64 brew install rcli
Verify:
file $(which rcli)  # Should show "arm64"
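
The verification can be scripted by classifying the text that file prints. This is a sketch that only pattern-matches the architecture names in that text; binary_arch is a hypothetical helper name, not an rcli command.

```shell
# Hypothetical helper (not part of rcli): classify the output of
# `file <binary>` as arm64, x86_64, universal, or unknown.
binary_arch() {
    case "$1" in
        *arm64*x86_64*|*x86_64*arm64*) echo universal ;;
        *arm64*)                       echo arm64 ;;
        *x86_64*)                      echo x86_64 ;;
        *)                             echo unknown ;;
    esac
}

# Usage on macOS:
#   binary_arch "$(file "$(command -v rcli)")"
```

A result of x86_64 on an Apple Silicon Mac indicates the binary runs under Rosetta and should be reinstalled as described above.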

Performance Debugging

High CPU Usage

Symptom: CPU at 100%, fan spinning.
Possible Causes:
  1. CPU-only inference: Enable GPU layers
    rcli --gpu-layers 99
    
  2. Too many threads:
    export RCLI_LLM_THREADS=1
    export RCLI_LLM_THREADS_BATCH=4
    
  3. Large batch size:
    export RCLI_LLM_BATCH=1024  # Reduce from 2048
    

High Memory Usage

Symptom: Memory pressure, swap usage.
Possible Causes:
  1. Large KV cache:
    rcli --ctx-size 2048  # Reduce context
    
  2. mlock enabled:
    export RCLI_LLM_USE_MLOCK=0  # Disable weight pinning
    
  3. Multiple models loaded:
    rcli cleanup  # Remove unused models
    
  4. Large memory pool:
    export RCLI_POOL_SIZE_MB=64  # Reduce from 128
    

GPU Not Utilized

Symptom: GPU idle, slow inference.
Check:
rcli info  # Look for "GPU layers: 0" (bad) vs "GPU layers: 99" (good)
Fix:
export RCLI_LLM_GPU_LAYERS=99
rcli
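
The check above can be automated by reading the layer count from the rcli info output. The sketch assumes the line has the form GPU layers: N, as quoted in this guide; gpu_layers is a hypothetical helper name.

```shell
# Hypothetical helper (not part of rcli): read `rcli info` output on stdin
# and print the number that follows "GPU layers:".
gpu_layers() {
    sed -n 's/.*GPU layers:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage (requires rcli on PATH):
#   layers=$(rcli info | gpu_layers)
#   [ "${layers:-0}" -gt 0 ] || echo "GPU offload appears to be disabled"
```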

Ring Buffer Overruns

Symptom: Choppy audio, dropped samples.
Check logs:
[WARN] Ring buffer overrun: 512 samples dropped
Fix:
# Increase buffer sizes
export RCLI_AUDIO_CAPTURE_BUFFER=32768
export RCLI_AUDIO_PLAYBACK_BUFFER=88064
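
To judge whether larger buffers helped, the dropped samples can be totalled from a captured log before and after the change. The sketch assumes warning lines in exactly the format shown above; dropped_samples is a hypothetical helper name.

```shell
# Hypothetical helper (not part of rcli): sum the sample counts reported in
# "[WARN] Ring buffer overrun: N samples dropped" lines of log file $1.
# Prints 0 when the file contains no such line.
dropped_samples() {
    awk '/Ring buffer overrun:/ {
             for (i = 1; i <= NF; i++)
                 if ($i == "overrun:") total += $(i + 1)
         }
         END { print total + 0 }' "$1"
}

# Usage:
#   rcli 2> rcli.log
#   dropped_samples rcli.log
```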

Crash Reports

Generate Crash Report

If rcli crashes, macOS generates a crash report:
open ~/Library/Logs/DiagnosticReports/
# Look for rcli_*.crash or rcli_*.ips

Useful Info for Bug Reports

rcli info > system_info.txt
rcli bench --suite all --output benchmark.json

# Attach system_info.txt and benchmark.json to GitHub issue
Include:
  • macOS version: sw_vers
  • RCLI version: rcli --version
  • Hardware: sysctl hw.model
  • Crash logs (if applicable)
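
The items in this list can be gathered in one pass. The sketch below runs only the commands named above and writes each result to its own file; capture and collect_report are hypothetical helper names, and the output file names are arbitrary. The benchmark is left out because it may take a while to run.

```shell
# Hypothetical helpers (not part of rcli) that gather the items listed above.

# capture <output-file> <command> [args...]
# Writes the command's output to the file, or a marker if it is not installed.
capture() {
    cap_dest="$1"; shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" > "$cap_dest" 2>&1 || true   # keep going if one command fails
    else
        echo "unavailable: $1" > "$cap_dest"
    fi
}

# collect_report [directory]: create the directory and fill it; prints its path.
collect_report() {
    rep_dir="${1:-rcli-report}"
    mkdir -p "$rep_dir" || return 1
    capture "$rep_dir/macos_version.txt" sw_vers
    capture "$rep_dir/rcli_version.txt"  rcli --version
    capture "$rep_dir/hardware.txt"      sysctl hw.model
    capture "$rep_dir/system_info.txt"   rcli info
    echo "$rep_dir"
}

# Usage:
#   collect_report ./rcli-report
```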

Reset to Defaults

Clear All Configuration

rm -rf ~/Library/RCLI/config/
rcli setup  # Re-initialize

Remove All Models

rcli cleanup  # Interactive cleanup
# or
rm -rf ~/Library/RCLI/models/
rcli setup  # Re-download defaults

Clear RAG Index

rm -rf ~/Library/RCLI/index/

Full Reset

rm -rf ~/Library/RCLI/
rcli setup  # Start fresh
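
A more cautious variant of the full reset moves the directory aside instead of deleting it, so configuration and models can be restored if the reset does not help. This is a sketch, not an rcli feature; backup_and_reset is a hypothetical helper name and the .bak.TIMESTAMP suffix is an arbitrary choice.

```shell
# Hypothetical helper (not part of rcli): rename a data directory to a
# timestamped backup and print the backup path.
# Does nothing (and prints nothing) if the directory does not exist.
backup_and_reset() {
    data_dir="${1:-$HOME/Library/RCLI}"
    [ -d "$data_dir" ] || return 0
    backup="${data_dir%/}.bak.$(date +%Y%m%d%H%M%S)"
    mv "$data_dir" "$backup" && echo "$backup"
}

# Usage:
#   backup_and_reset && rcli setup
```

Delete the backup directory once the fresh setup works, since model files can be large.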

Getting Help

Built-in Help

rcli --help              # CLI reference
rcli actions --help      # Action system help
rcli models --help       # Model management help
rcli bench --help        # Benchmark help

Community Support

  • GitHub Issues: Report bugs or request features
  • Contributing Guide: Build from source, architecture docs

Diagnostic Commands

# System info
rcli info

# Microphone test
rcli mic-test

# Full benchmark
rcli bench --suite all --output diagnostic.json

# Tool calling test
rcli bench --suite tools

# RAG test
rcli rag status

Next Steps

  • Architecture: Deep dive into pipeline design and threading
  • Performance: Benchmark results and optimization tips
