
Introduction

The RCLI C API provides a unified interface for integrating on-device voice AI into your applications. It combines speech-to-text, large language models, text-to-speech, and tool calling into a single pipeline optimized for Apple Silicon.
The C API is designed as a bridging layer for Swift (macOS/iOS), Python, and other languages that can call C functions.

Architecture

Mic → VAD → STT (Zipformer) → LLM (Qwen3) → TTS (Piper) → Speaker
                               ↓
                            Tool Calling → macOS Actions

Key Features

  • Live voice pipeline: Continuous mic → STT → LLM → TTS → speaker loop
  • Push-to-talk: Capture audio, then transcribe with offline Whisper
  • Text commands: Direct LLM processing without STT
  • Tool calling: 43+ macOS actions (AppleScript, system control)
  • RAG: Document ingestion and hybrid retrieval (vector + BM25)
  • File mode: Process WAV files (iOS, testing)
  • Benchmarks: Comprehensive performance testing

Handle Management

All API functions operate on an opaque RCLIHandle:
typedef void* RCLIHandle;
Create a handle, initialize it, use it, then destroy it:
// Create
RCLIHandle handle = rcli_create(NULL);
if (!handle) {
    fprintf(stderr, "Failed to create RCLI engine\n");
    return -1;
}

// Initialize
if (rcli_init(handle, "/path/to/models", 99) != 0) {
    fprintf(stderr, "Initialization failed\n");
    rcli_destroy(handle);
    return -1;
}

// Use the API...
rcli_process_command(handle, "open Safari");

// Clean up
rcli_destroy(handle);
Always call rcli_destroy() to free resources, even if initialization fails.

Thread Safety

The RCLI engine uses internal mutexes for state management:
  • API functions are safe to call from multiple threads
  • Callbacks fire on worker threads; marshal to the UI thread before touching UI state
  • rcli_stop_processing() is specifically designed for cross-thread cancellation
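One way to hand a result from a worker-thread callback to the UI thread is a small mutex-protected mailbox that the UI thread polls. The sketch below is not part of RCLI: Mailbox, mailbox_post(), mailbox_take(), and on_transcript_sketch() are hypothetical names, and the callback shape (text, user_data) is an assumption, so check the real callback typedef in the RCLI header before adapting it.

```c
#include <pthread.h>
#include <string.h>

#define MAILBOX_TEXT_MAX 256

/* Single-slot mailbox shared between a worker thread and the UI thread. */
typedef struct {
    pthread_mutex_t lock;
    int has_message;
    char text[MAILBOX_TEXT_MAX];
} Mailbox;

void mailbox_init(Mailbox *box)
{
    pthread_mutex_init(&box->lock, NULL);
    box->has_message = 0;
    box->text[0] = '\0';
}

/* Worker side: copy the text into the slot (the newest message wins).
 * The text is copied rather than stored by pointer, in case the string
 * passed to a callback does not outlive it (an assumption). */
void mailbox_post(Mailbox *box, const char *text)
{
    pthread_mutex_lock(&box->lock);
    strncpy(box->text, text, MAILBOX_TEXT_MAX - 1);
    box->text[MAILBOX_TEXT_MAX - 1] = '\0';
    box->has_message = 1;
    pthread_mutex_unlock(&box->lock);
}

/* UI side: non-blocking poll. Returns 1 and fills out if a message
 * was waiting, 0 otherwise. */
int mailbox_take(Mailbox *box, char *out, size_t out_size)
{
    int got = 0;
    pthread_mutex_lock(&box->lock);
    if (box->has_message && out_size > 0) {
        strncpy(out, box->text, out_size - 1);
        out[out_size - 1] = '\0';
        box->has_message = 0;
        got = 1;
    }
    pthread_mutex_unlock(&box->lock);
    return got;
}

/* Hypothetical callback shape: the engine would call this on a worker
 * thread, with user_data pointing at the shared Mailbox. */
void on_transcript_sketch(const char *text, void *user_data)
{
    mailbox_post((Mailbox *)user_data, text);
}
```

The UI thread can call mailbox_take() from a timer or run-loop tick. A single slot keeps the sketch short; a queue would be needed if no message may be dropped.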

Memory Management

String ownership rules:
  • Strings returned by the API (e.g., rcli_process_command(), rcli_get_transcript()) are owned by the engine
  • Do NOT call free() on returned strings
  • Strings remain valid until the next call to the same function or until rcli_destroy()
  • Exception: rcli_get_timings() returns a malloc’d string that you must free()
// ✅ Correct
const char* response = rcli_process_command(handle, "hello");
printf("Response: %s\n", response);
// String is still owned by engine, no free() needed

// ✅ Correct (exception)
char* timings = rcli_get_timings(handle);
printf("%s\n", timings);
free(timings);  // Must free this one

// ❌ Wrong
const char* transcript = rcli_get_transcript(handle);
free(transcript);  // Don't do this!
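
Because an engine-owned string is only valid until the next call to the same function, code that needs to keep a result longer has to copy it first. The following is a minimal sketch of that pattern, not RCLI code: fake_engine_reply() is a stand-in that imitates an engine reusing one internal buffer, and copy_engine_string() is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for an engine call (NOT an RCLI function): like the API
 * described above, it returns a pointer into a buffer that the next
 * call overwrites. */
const char *fake_engine_reply(const char *input)
{
    static char reply[64];
    snprintf(reply, sizeof reply, "echo: %s", input);
    return reply;
}

/* Hypothetical helper: duplicate an engine-owned string so it outlives
 * the next API call. The caller owns the copy and must free() it. */
char *copy_engine_string(const char *engine_owned)
{
    if (engine_owned == NULL) {
        return NULL;
    }
    size_t len = strlen(engine_owned) + 1;  /* include the terminator */
    char *copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, engine_owned, len);
    }
    return copy;
}

/* Shows that the saved copy keeps the first reply even after a second
 * call has reused the engine's buffer. Returns 1 on success. */
int demo_copy_survives_next_call(void)
{
    char *first = copy_engine_string(fake_engine_reply("hello"));
    if (first == NULL) {
        return 0;
    }
    const char *second = fake_engine_reply("goodbye");  /* overwrites buffer */
    assert(strcmp(second, "echo: goodbye") == 0);
    int ok = strcmp(first, "echo: hello") == 0;
    free(first);  /* our copy, so we free it */
    return ok;
}
```

The copy follows the opposite ownership rule from the engine's strings: it was allocated by the caller, so the caller frees it.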

Configuration

Pass optional JSON config to rcli_create():
const char* config = 
    "{"
    "  \"system_prompt\": \"You are a helpful assistant.\","
    "  \"gpu_layers\": 99,"
    "  \"ctx_size\": 4096"
    "}";

RCLIHandle handle = rcli_create(config);
  • system_prompt (string): Custom system prompt for the LLM. Defaults to the built-in RCLI prompt.
  • gpu_layers (integer): Number of LLM layers to offload to the GPU. Use 99 for all layers, 0 for CPU-only.
  • ctx_size (integer): LLM context window size in tokens. Default: 4096.
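
Hand-escaped JSON literals are easy to get wrong, so the config string can instead be assembled at runtime. This is a sketch under stated assumptions: build_rcli_config() is a hypothetical helper, not an RCLI function, it emits only the three keys documented above, and it does no JSON escaping, so it rejects prompts containing quotes, backslashes, or control characters.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of RCLI): formats the three documented
 * config keys into buf. Returns 0 on success, -1 if the prompt contains
 * characters that would need JSON escaping or if buf is too small. */
int build_rcli_config(char *buf, size_t buf_size,
                      const char *system_prompt,
                      int gpu_layers, int ctx_size)
{
    /* This sketch does no escaping, so refuse prompts that need it. */
    for (const char *p = system_prompt; *p != '\0'; ++p) {
        if ((unsigned char)*p < 0x20 || strchr("\"\\", *p) != NULL) {
            return -1;
        }
    }

    int written = snprintf(buf, buf_size,
        "{\"system_prompt\": \"%s\", \"gpu_layers\": %d, \"ctx_size\": %d}",
        system_prompt, gpu_layers, ctx_size);

    /* snprintf returns the length it needed; detect truncation. */
    if (written < 0 || (size_t)written >= buf_size) {
        return -1;
    }
    return 0;
}
```

On success, the buffer can be passed straight to rcli_create(). For prompts that may contain arbitrary text, a real JSON library is the safer choice.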

Pipeline States

The pipeline transitions through these states:
State        Value  Description
IDLE         0      Not processing
LISTENING    1      Capturing microphone input
PROCESSING   2      Running LLM inference
SPEAKING     3      Playing TTS audio
INTERRUPTED  4      Processing was cancelled
Query current state with rcli_get_state() or register a callback with rcli_set_state_callback().
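
For logging or UI labels, the numeric states can be mapped to names. The sketch below uses the values from the table above, but the identifiers (DEMO_STATE_*, demo_state_name(), demo_state_is_busy()) are illustrative assumptions rather than the constants declared in the RCLI header, and it assumes the state is delivered as a plain int.

```c
#include <stddef.h>

/* Illustrative names: the numeric values come from the table above,
 * but these identifiers are assumptions, not RCLI's own constants. */
typedef enum {
    DEMO_STATE_IDLE = 0,
    DEMO_STATE_LISTENING = 1,
    DEMO_STATE_PROCESSING = 2,
    DEMO_STATE_SPEAKING = 3,
    DEMO_STATE_INTERRUPTED = 4
} DemoPipelineState;

/* Map a numeric state to a printable name; unknown values are reported
 * rather than treated as an error. */
const char *demo_state_name(int state)
{
    switch (state) {
    case DEMO_STATE_IDLE:        return "IDLE";
    case DEMO_STATE_LISTENING:   return "LISTENING";
    case DEMO_STATE_PROCESSING:  return "PROCESSING";
    case DEMO_STATE_SPEAKING:    return "SPEAKING";
    case DEMO_STATE_INTERRUPTED: return "INTERRUPTED";
    default:                     return "UNKNOWN";
    }
}

/* "Busy" here means the pipeline is listening, processing, or speaking,
 * for example to decide whether to show an activity indicator. */
int demo_state_is_busy(int state)
{
    return state == DEMO_STATE_LISTENING ||
           state == DEMO_STATE_PROCESSING ||
           state == DEMO_STATE_SPEAKING;
}
```

A state callback could call demo_state_name() on the value it receives and forward the label to the UI thread.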

Use Cases

macOS/iOS Swift Integration

import Foundation

let handle = rcli_create(nil)
rcli_init(handle, "/path/to/models", 99)
rcli_start_listening(handle)

Python Bindings

import ctypes

lib = ctypes.CDLL('./librcli.dylib')
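lib.rcli_create.restype = ctypes.c_void_p
# Wrap the handle in c_void_p: without argtypes, a bare Python int is passed as a C int,
# which would truncate a 64-bit pointer
handle = ctypes.c_void_p(lib.rcli_create(None))
lib.rcli_init(handle, b"/path/to/models", 99)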

Voice Assistant

rcli_set_transcript_callback(handle, on_transcript, NULL);
rcli_set_state_callback(handle, on_state_change, NULL);
rcli_start_listening(handle);

// The voice pipeline keeps running until you stop it:
rcli_stop_listening(handle);

File Processing (iOS)

rcli_process_wav(handle, "input.wav", "output.wav", on_event, NULL);

Next Steps

  • Lifecycle: Initialize and manage the engine
  • Voice Pipeline: Live voice, push-to-talk, and TTS
  • Callbacks: Real-time events and state changes
  • Actions: Execute macOS actions and tools
