Overview

RCLI provides functions to query:
  • Pipeline state: Current processing stage (idle, listening, processing, speaking, interrupted)
  • Engine info: Model names, device info, configuration
  • Audio levels: Real-time microphone RMS for waveform displays
  • Performance metrics: LLM, TTS, and STT performance from the last operation
  • Context usage: Token counts and context window utilization

Pipeline State

rcli_get_state

Get the current pipeline state.
int rcli_get_state(RCLIHandle handle);
handle
RCLIHandle
required
Engine handle
return
int
Pipeline state:
  • 0: IDLE - Not processing
  • 1: LISTENING - Capturing microphone input
  • 2: PROCESSING - Running LLM inference
  • 3: SPEAKING - Playing TTS audio
  • 4: INTERRUPTED - Processing was cancelled

Example: State Polling

const char* state_names[] = {"IDLE", "LISTENING", "PROCESSING", "SPEAKING", "INTERRUPTED"};

int state = rcli_get_state(handle);
printf("Current state: %s\n", state_names[state]);

// Wait for idle
while (rcli_get_state(handle) != 0) {
    usleep(100000);  // Poll every 100ms
}
printf("Pipeline is idle\n");
Use rcli_set_state_callback() for event-driven state updates instead of polling.
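A minimal sketch of the event-driven approach. The callback signature below is an assumption (verify the actual typedef in rcli_api.h); the state-name mapping mirrors the table above.

```c
#include <stdio.h>
#include <string.h>

/* Map a pipeline state code to its name (codes from the table above). */
static const char* state_name(int state) {
    static const char* names[] = {
        "IDLE", "LISTENING", "PROCESSING", "SPEAKING", "INTERRUPTED"
    };
    return (state >= 0 && state <= 4) ? names[state] : "UNKNOWN";
}

/* Assumed callback signature -- check rcli_api.h for the real one. */
static void on_state_change(int new_state, void* user_data) {
    (void)user_data;
    printf("State -> %s\n", state_name(new_state));
}

/* Registration (assumed form):
 *   rcli_set_state_callback(handle, on_state_change, NULL);
 */
```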

Engine Information

rcli_get_info

Get engine info as JSON (model names, device info, etc.).
const char* rcli_get_info(RCLIHandle handle);
handle
RCLIHandle
required
Engine handle (must be initialized)
return
const char*
JSON string with engine information. Do not free; the string is owned by the engine.
Example:
{
  "llm_model": "Qwen3 0.6B",
  "stt_model": "Whisper base.en",
  "tts_model": "Piper Lessac",
  "gpu_layers": 99,
  "context_size": 4096,
  "device": "Apple M1 Max",
  "actions_count": 43,
  "actions_enabled": 38
}

Example

const char* info = rcli_get_info(handle);
printf("Engine Info:\n%s\n", info);
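If you only need one numeric field and don't want a JSON dependency, a strstr/sscanf scan is enough for a flat object like the one above. This helper is a sketch (json_get_int is not part of the RCLI API); use a real JSON parser for anything nested.

```c
#include <stdio.h>
#include <string.h>

/* Extract an integer field from a flat JSON object (no nesting, no escapes). */
static int json_get_int(const char* json, const char* key, int* out) {
    char pattern[64];
    snprintf(pattern, sizeof pattern, "\"%s\":", key);
    const char* p = strstr(json, pattern);
    if (!p) return 0;
    return sscanf(p + strlen(pattern), " %d", out) == 1;
}
```

For example, `json_get_int(rcli_get_info(handle), "context_size", &ctx)` pulls out the context window size.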

Model Queries

rcli_get_llm_model

Get the name of the active LLM model.
const char* rcli_get_llm_model(RCLIHandle handle);
handle
RCLIHandle
required
Engine handle
return
const char*
Model name (e.g., "Qwen3 0.6B", "Liquid LFM2 1.2B"). Do not free.

rcli_get_tts_model

Get the name of the active TTS voice.
const char* rcli_get_tts_model(RCLIHandle handle);
handle
RCLIHandle
required
Engine handle
return
const char*
Voice name (e.g., "Piper Lessac", "Kokoro American Female"). Do not free.

rcli_get_stt_model

Get the name of the active offline STT model.
const char* rcli_get_stt_model(RCLIHandle handle);
handle
RCLIHandle
required
Engine handle
return
const char*
Model name (e.g., "Whisper base.en", "Parakeet TDT 1.1B"). Do not free.

Example: Print Active Models

printf("Active Models:\n");
printf("  LLM: %s\n", rcli_get_llm_model(handle));
printf("  STT: %s\n", rcli_get_stt_model(handle));
printf("  TTS: %s\n", rcli_get_tts_model(handle));

// Output:
// Active Models:
//   LLM: Qwen3 0.6B
//   STT: Whisper base.en
//   TTS: Piper Lessac

rcli_is_using_parakeet

Check whether the engine is using Parakeet TDT (high-accuracy STT) instead of Whisper.
int rcli_is_using_parakeet(RCLIHandle handle);
handle
RCLIHandle
required
Engine handle
return
int
  • 1: Using Parakeet TDT (high-accuracy)
  • 0: Using Whisper (default)

Audio Level

rcli_get_audio_level

Get current microphone audio level (RMS) for waveform display.
float rcli_get_audio_level(RCLIHandle handle);
handle
RCLIHandle
required
Engine handle
return
float
Audio level (0.0 - 1.0 RMS). 0.0 = silence, 1.0 = maximum.

Example: Waveform Visualizer

rcli_start_listening(handle);

for (int i = 0; i < 100; i++) {
    float level = rcli_get_audio_level(handle);
    int bars = (int)(level * 20);
    
    printf("\r[");
    for (int j = 0; j < 20; j++) {
        printf("%s", j < bars ? "█" : "·");
    }
    printf("] %.2f", level);
    fflush(stdout);
    
    usleep(50000);  // 50ms refresh
}

// Output:
// [███████·············] 0.35

Performance Metrics

rcli_get_last_llm_perf

Get LLM performance from the last rcli_process_command() or rcli_rag_query() call.
void rcli_get_last_llm_perf(
    RCLIHandle handle,
    int* out_tokens,
    double* out_tok_per_sec,
    double* out_ttft_ms,
    double* out_total_ms
);
handle
RCLIHandle
required
Engine handle
out_tokens
int*
Output: Number of tokens generated. Pass NULL to skip.
out_tok_per_sec
double*
Output: Token generation speed (tokens/second). Pass NULL to skip.
out_ttft_ms
double*
Output: Time to first token in milliseconds. Pass NULL to skip.
out_total_ms
double*
Output: Total generation time in milliseconds. Pass NULL to skip.

Example: Display LLM Stats

const char* response = rcli_process_command(handle, "Tell me a joke");

int tokens;
double tok_per_sec, ttft_ms, total_ms;

rcli_get_last_llm_perf(handle, &tokens, &tok_per_sec, &ttft_ms, &total_ms);

printf("LLM Performance:\n");
printf("  Tokens:     %d\n", tokens);
printf("  Speed:      %.1f tok/s\n", tok_per_sec);
printf("  TTFT:       %.1f ms\n", ttft_ms);
printf("  Total:      %.1f ms\n", total_ms);

// Output:
// LLM Performance:
//   Tokens:     42
//   Speed:      38.5 tok/s
//   TTFT:       89.2 ms
//   Total:      1091.3 ms

rcli_get_context_info

Get context window usage from the last LLM call.
void rcli_get_context_info(
    RCLIHandle handle,
    int* out_prompt_tokens,
    int* out_ctx_size
);
handle
RCLIHandle
required
Engine handle
out_prompt_tokens
int*
Output: Total tokens in the last prompt (system + history + user). Pass NULL to skip.
out_ctx_size
int*
Output: Model’s configured context window size. Pass NULL to skip.

Example: Context Usage

rcli_process_command(handle, "Explain quantum computing");

int prompt_tokens, ctx_size;
rcli_get_context_info(handle, &prompt_tokens, &ctx_size);

float usage = (float)prompt_tokens / ctx_size * 100;

printf("Context Usage: %d / %d tokens (%.1f%%)\n",
       prompt_tokens, ctx_size, usage);

// Output:
// Context Usage: 512 / 4096 tokens (12.5%)
If context usage exceeds ~80%, consider clearing conversation history with rcli_clear_history().
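The threshold check can be wrapped in a small helper (a sketch; the rcli_clear_history() call shown in the comment assumes it takes only the handle, so verify against rcli_api.h):

```c
/* Returns 1 when the last prompt used more than ~80% of the context window. */
static int context_nearly_full(int prompt_tokens, int ctx_size) {
    return ctx_size > 0 && prompt_tokens * 10 > ctx_size * 8;
}

/* After each command:
 *
 *   int prompt_tokens = 0, ctx_size = 0;
 *   rcli_get_context_info(handle, &prompt_tokens, &ctx_size);
 *   if (context_nearly_full(prompt_tokens, ctx_size))
 *       rcli_clear_history(handle);   // assumed signature
 */
```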

rcli_get_last_tts_perf

Get TTS performance from the last rcli_speak() call.
void rcli_get_last_tts_perf(
    RCLIHandle handle,
    int* out_samples,
    double* out_synthesis_ms,
    double* out_rtf
);
handle
RCLIHandle
required
Engine handle
out_samples
int*
Output: Number of audio samples generated. Pass NULL to skip.
out_synthesis_ms
double*
Output: Synthesis time in milliseconds. Pass NULL to skip.
out_rtf
double*
Output: Real-time factor (RTF < 1.0 means faster than real-time). Pass NULL to skip.

Example: TTS Stats

rcli_speak(handle, "Hello, world!");

int samples;
double synthesis_ms, rtf;

rcli_get_last_tts_perf(handle, &samples, &synthesis_ms, &rtf);

printf("TTS Performance:\n");
printf("  Samples:    %d\n", samples);
printf("  Synthesis:  %.1f ms\n", synthesis_ms);
printf("  RTF:        %.2f\n", rtf);

// Output:
// TTS Performance:
//   Samples:    22050
//   Synthesis:  123.4 ms
//   RTF:        0.45

rcli_get_last_stt_perf

Get STT performance from the last rcli_stop_capture_and_transcribe() call.
void rcli_get_last_stt_perf(
    RCLIHandle handle,
    double* out_audio_ms,
    double* out_transcribe_ms
);
handle
RCLIHandle
required
Engine handle
out_audio_ms
double*
Output: Audio duration in milliseconds. Pass NULL to skip.
out_transcribe_ms
double*
Output: Transcription time in milliseconds. Pass NULL to skip.

Example: STT Stats

rcli_start_capture(handle);
// ... user speaks ...
const char* transcript = rcli_stop_capture_and_transcribe(handle);

double audio_ms, transcribe_ms;
rcli_get_last_stt_perf(handle, &audio_ms, &transcribe_ms);

float rtf = transcribe_ms / audio_ms;

printf("STT Performance:\n");
printf("  Audio:      %.1f ms\n", audio_ms);
printf("  Transcribe: %.1f ms\n", transcribe_ms);
printf("  RTF:        %.2f\n", rtf);

// Output:
// STT Performance:
//   Audio:      3500.0 ms
//   Transcribe: 234.5 ms
//   RTF:        0.07

Complete Example: Performance Dashboard

#include "api/rcli_api.h"
#include <stdio.h>

void print_performance_stats(RCLIHandle handle) {
    // LLM stats
    int llm_tokens;
    double llm_tok_per_sec, llm_ttft_ms, llm_total_ms;
    rcli_get_last_llm_perf(handle, &llm_tokens, &llm_tok_per_sec, &llm_ttft_ms, &llm_total_ms);

    // Context usage
    int prompt_tokens, ctx_size;
    rcli_get_context_info(handle, &prompt_tokens, &ctx_size);

    // TTS stats
    int tts_samples;
    double tts_synthesis_ms, tts_rtf;
    rcli_get_last_tts_perf(handle, &tts_samples, &tts_synthesis_ms, &tts_rtf);

    char ctx_buf[40];
    snprintf(ctx_buf, sizeof ctx_buf, "%d/%d (%.1f%%)",
             prompt_tokens, ctx_size, (float)prompt_tokens / ctx_size * 100);

    printf("\n┌────────────────────────────────┐\n");
    printf("│ Performance Metrics            │\n");
    printf("├────────────────────────────────┤\n");
    printf("│ LLM                            │\n");
    printf("│   Tokens:     %-16d │\n", llm_tokens);
    printf("│   Speed:      %-10.1f tok/s │\n", llm_tok_per_sec);
    printf("│   TTFT:       %-13.1f ms │\n", llm_ttft_ms);
    printf("│   Total:      %-13.1f ms │\n", llm_total_ms);
    printf("│   Context:    %-16s │\n", ctx_buf);
    printf("├────────────────────────────────┤\n");
    printf("│ TTS                            │\n");
    printf("│   Samples:    %-16d │\n", tts_samples);
    printf("│   Synthesis:  %-13.1f ms │\n", tts_synthesis_ms);
    printf("│   RTF:        %-16.2f │\n", tts_rtf);
    printf("└────────────────────────────────┘\n");
}

int main() {
    RCLIHandle handle = rcli_create(NULL);
    rcli_init(handle, "/path/to/models", 99);

    // Process command
    const char* response = rcli_process_command(handle, "Tell me about black holes");
    printf("Response: %s\n", response);

    // Speak response
    rcli_speak(handle, response);

    // Display performance stats
    print_performance_stats(handle);

    rcli_destroy(handle);
    return 0;
}
