## Getting Started

Before making changes, ensure you can build and test RCLI locally.

## Architecture Overview

Understanding the pipeline helps you identify where to make changes.

## Threading Model

Three threads run concurrently in live mode: STT, LLM, and TTS.

### STT Thread
- Captures microphone audio via CoreAudio
- Runs Silero VAD to filter silence
- Detects speech endpoints
- Feeds audio to Zipformer (streaming) or Whisper (batch)
- Synchronizes via `std::condition_variable`
### LLM Thread
- Waits for transcribed text from STT
- Generates tokens via llama.cpp with Metal GPU
- Dispatches tool calls when detected
- Feeds sentences to TTS via `SentenceDetector`
- Uses system prompt KV caching to avoid reprocessing
### TTS Thread
- Queues sentences from LLM
- Double-buffered playback (synthesizes next while playing current)
- Uses sherpa-onnx (Piper/Kokoro/KittenTTS)
- Outputs to CoreAudio speaker
## Source Layout

### Key Files

| File | Purpose |
|---|---|
| `src/api/rcli_api.h` | Public C API; all engine functionality is exposed here |
| `src/pipeline/orchestrator.h` | Central class that owns all engines and coordinates data flow |
| `src/actions/action_registry.h` | Action registration and dispatch |
| `src/models/model_registry.h` | LLM model definitions (id, URL, size, speed, flags) |
| `src/models/tts_model_registry.h` | TTS voice definitions |
| `src/models/stt_model_registry.h` | STT model definitions |
| `src/tools/tool_engine.h` | Tool call parsing and execution |
## Testing

### Test Executable

RCLI includes a test harness for verifying the pipeline. The `--actions-only` suite runs without any model downloads and is a good smoke test for quick iteration.

### Manual Testing

Test your changes interactively by running the built binary.

### Benchmarks

Run the performance benchmarks. Available suites: `stt`, `llm`, `tts`, `e2e`, `tools`, `rag`, `memory`, and `all`.
## Code Style

Follow these conventions when contributing.

### Language and Standards

- C++17 with Apple Clang
- No external package manager; all dependencies are vendored or fetched via CMake
- Header-only where practical for CLI modules (reduces build complexity)
### Output and Formatting

- Avoid emojis in output strings; use plain text markers (`[ok]`, `[PASS]`, `>`, `*`, `Tip:`)
- Use `fprintf(stderr, ...)` for user-facing output
- Reserve stdout for machine-parseable output (JSON, etc.)
### Memory Management

- Pre-allocated memory pool (64 MB arena); avoid runtime `malloc` during inference
- Lock-free ring buffers for zero-copy audio transfer
- RAII for resource management
### Design Patterns

- Orchestrator pattern: a central class owns all engines
- Atomic pipeline state: `std::atomic<PipelineState>`
- Sentence-level TTS scheduling: flush complete sentences only
- System prompt KV caching: reuse llama.cpp state across queries
## Good First Issues

Looking for a place to start? Try these:

### Add a macOS action

Implement a new action like `send_email` or `create_calendar_event`. See Adding Actions.

### Add an LLM model

Register a new GGUF model in `src/models/model_registry.h`.

### Improve error messages

Make error messages more actionable and user-friendly.

### Add a benchmark

Extend the benchmark suite in `src/bench/benchmark.cpp`.

## Pull Request Process
1. **Make your changes**
   - Write code following the style guide
   - Add tests if applicable
   - Ensure the build succeeds
2. **Commit with clear messages**
   - `Add:` for new features
   - `Fix:` for bug fixes
   - `Update:` for enhancements
   - `Refactor:` for code restructuring
## Extension Points

Common contribution areas:

### Adding Actions

See the dedicated Adding Actions guide.

### Adding Models

#### LLM Models

Edit `src/models/model_registry.h` and add an entry to `all_models()`.
#### TTS Voices

Edit `src/models/tts_model_registry.h` and add an entry to `all_tts_models()`. Key fields:

- `architecture`: sherpa-onnx backend (`vits`, `kokoro`, `matcha`, `kitten`)
- `dir_name`: subdirectory under `~/Library/RCLI/models/`
- `download_url`: URL to a `.tar.bz2` archive
#### STT Models

Edit `src/models/stt_model_registry.h`. Models are either:

- `streaming`: for the live microphone (e.g., Zipformer)
- `offline`: for batch transcription (e.g., Whisper, Parakeet)
## Resources

- GitHub Issues: report bugs and request features
- Project Structure: understand the codebase layout
- Building from Source: build and install RCLI locally