C API Overview

Introduction

The Moonshine C API is the low-level interface that all other language bindings (Python, Swift, Java, etc.) use to interact with the Moonshine Voice library. This API provides direct access to transcription, streaming, and intent recognition capabilities.

Most developers should use the language-specific bindings for their platform rather than calling the C API directly. The C API is primarily intended for:

Creating new language bindings
Porting to new platforms
Low-level optimization and debugging

Key Features

Thread-safe: All API functions are thread-safe and can be called concurrently from multiple threads
Flexible input: Accepts audio of any length at various sample rates (16 kHz recommended)
Streaming support: Incremental transcription with caching for low-latency real-time applications
Multiple languages: English, Spanish, Mandarin, Japanese, Korean, Vietnamese, Ukrainian, and Arabic
Intent recognition: Semantic matching for voice command interfaces

Architecture Overview

The Moonshine library processes audio through these main components:

Transcriber: Core engine that loads models and manages transcription
Streams: Handlers for individual audio input sources (one transcriber can manage multiple streams)
Transcripts: Collections of transcript lines representing detected speech segments
Intent Recognizer: Semantic matching engine for command recognition

Basic Workflow

Non-Streaming Transcription

For transcribing complete audio files or recordings:

#include <stdint.h>
#include <stdio.h>

#include "moonshine-c-api.h"

int main(int argc, char *argv[]) {
  // Load the transcriber
  int32_t transcriber_handle = moonshine_load_transcriber_from_files(
    "path/to/models", MOONSHINE_MODEL_ARCH_BASE, NULL, 0,
    MOONSHINE_HEADER_VERSION);
  if (transcriber_handle < 0) {
    fprintf(stderr, "Failed to load transcriber\n");
    return 1;
  }

  // Prepare audio data (16 kHz float PCM, values between -1.0 and 1.0).
  // This placeholder buffer holds two seconds of silence.
  float audio_data[32000] = {0};
  size_t audio_length = 32000;
  int32_t sample_rate = 16000;
  
  // Transcribe the audio
  transcript_t *transcript = NULL;
  int32_t error = moonshine_transcribe_without_streaming(transcriber_handle,
    audio_data, audio_length, sample_rate, 0, &transcript);
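  if (error != MOONSHINE_ERROR_NONE) {
    fprintf(stderr, "Failed to transcribe: %s\n",
      moonshine_error_to_string(error));
    moonshine_free_transcriber(transcriber_handle);
    return 1;
  }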
  
  // Process results
  for (size_t i = 0; i < transcript->line_count; i++) {
    printf("Line %zu at %f seconds: %s\n", i, transcript->lines[i].start_time,
      transcript->lines[i].text);
  }
  
  // Clean up (the transcript is owned by the transcriber and is no longer
  // valid after this call)
  moonshine_free_transcriber(transcriber_handle);
  return 0;
}

Streaming Transcription

For real-time transcription from microphones or live audio sources:

// Load transcriber
int32_t transcriber_handle = moonshine_load_transcriber_from_files(
    "path/to/models", MOONSHINE_MODEL_ARCH_BASE_STREAMING, NULL, 0,
    MOONSHINE_HEADER_VERSION);

// Create and start a stream
int32_t stream_handle = moonshine_create_stream(transcriber_handle, 0);
moonshine_start_stream(transcriber_handle, stream_handle);

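// Feed audio chunks as they become available. get_audio_from_microphone(),
// print_transcript(), and microphone_sample_rate stand in for code supplied
// by the application.
float *latest_audio_data = NULL;
size_t latest_audio_data_length = 0;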
while (get_audio_from_microphone(&latest_audio_data, &latest_audio_data_length)) {
  moonshine_transcribe_add_audio_to_stream(transcriber_handle,
    stream_handle, latest_audio_data, latest_audio_data_length,
    microphone_sample_rate, 0);
  
  // Get updated transcript periodically
  transcript_t *partial_transcript = NULL;
  moonshine_transcribe_stream(transcriber_handle,
    stream_handle, 0, &partial_transcript);
  print_transcript(partial_transcript);
}

// Stop and get final results
moonshine_stop_stream(transcriber_handle, stream_handle);
transcript_t *final_transcript = NULL;
moonshine_transcribe_stream(transcriber_handle, stream_handle, 0,
  &final_transcript);

// Clean up
moonshine_free_stream(transcriber_handle, stream_handle);
moonshine_free_transcriber(transcriber_handle);
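
When no microphone is available, one way to exercise the streaming loop is to replay a recorded buffer in small pieces. The sketch below is only an illustration: `feed_in_chunks`, `chunk_callback`, and `count_samples` are hypothetical names, not part of the Moonshine API. In a real program the callback would forward each chunk to `moonshine_transcribe_add_audio_to_stream()`.

```c
#include <stddef.h>

/* Hypothetical callback type: receives one chunk of samples. */
typedef void (*chunk_callback)(const float *chunk, size_t length,
                               void *context);

/* Splits `audio` into chunks of at most `chunk_length` samples and passes
 * each one to `callback` in order. Returns the number of chunks delivered. */
static size_t feed_in_chunks(const float *audio, size_t audio_length,
                             size_t chunk_length, chunk_callback callback,
                             void *context) {
  if (chunk_length == 0 || callback == NULL) return 0;
  size_t chunks = 0;
  for (size_t offset = 0; offset < audio_length; offset += chunk_length) {
    size_t remaining = audio_length - offset;
    size_t length = remaining < chunk_length ? remaining : chunk_length;
    callback(audio + offset, length, context);
    chunks++;
  }
  return chunks;
}

/* Example callback: adds each chunk's length to a running total. */
static void count_samples(const float *chunk, size_t length, void *context) {
  (void)chunk;
  *(size_t *)context += length;
}
```

The final chunk may be shorter than `chunk_length`, so the callback should always use the length it is given rather than assume a fixed size.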

Audio Format Requirements

format (float PCM): Audio data must be floating-point PCM values between -1.0 and 1.0
channels (mono): Only mono (single-channel) audio is supported
sample_rate (16000 Hz recommended): While the library supports various sample rates, 16 kHz is recommended to avoid resampling overhead
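
Many capture APIs deliver interleaved 16-bit integer PCM rather than mono float samples. A minimal sketch of one possible conversion follows, assuming signed 16-bit input; `pcm16_to_mono_float` is a hypothetical helper, not a Moonshine function. It averages the channels of each frame and scales the result into the range the library expects.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the Moonshine API): downmixes
 * interleaved signed 16-bit PCM to mono float samples in [-1.0, 1.0).
 * `frames` is the number of sample frames; `out` must hold `frames` floats. */
static void pcm16_to_mono_float(const int16_t *in, size_t frames,
                                int channels, float *out) {
  for (size_t i = 0; i < frames; i++) {
    int32_t sum = 0;
    for (int c = 0; c < channels; c++) {
      sum += in[i * (size_t)channels + (size_t)c];
    }
    /* Average the channels, then scale by 1/32768. */
    out[i] = ((float)sum / (float)channels) / 32768.0f;
  }
}
```

This sketch does not resample. If the source is not at 16 kHz, pass its actual sample rate to the library.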

Model Files

The transcriber expects three files in the model directory:

encoder_model.ort - Quantized ONNX encoder model
decoder_model_merged.ort - Quantized ONNX decoder model
tokenizer.bin - Token-to-character mapping in binary format

Use the Python package’s download script to obtain these files:

python -m moonshine_voice.download --language en
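
Before loading a transcriber, it can help to confirm that the three files listed above are actually present, so a missing download is reported clearly. The following is a sketch only: `model_file_exists` and `count_missing_model_files` are hypothetical helpers, and the code assumes `/` works as a path separator on the target platform.

```c
#include <stdbool.h>
#include <stdio.h>

/* Returns true if `directory/filename` can be opened for reading. */
static bool model_file_exists(const char *directory, const char *filename) {
  char path[1024];
  int written = snprintf(path, sizeof(path), "%s/%s", directory, filename);
  if (written < 0 || (size_t)written >= sizeof(path)) return false;
  FILE *file = fopen(path, "rb");
  if (file == NULL) return false;
  fclose(file);
  return true;
}

/* Returns the number of expected model files missing from `directory`,
 * printing the path of each missing file to stderr. */
static int count_missing_model_files(const char *directory) {
  static const char *const expected[] = {
    "encoder_model.ort", "decoder_model_merged.ort", "tokenizer.bin"};
  int missing = 0;
  for (size_t i = 0; i < sizeof(expected) / sizeof(expected[0]); i++) {
    if (!model_file_exists(directory, expected[i])) {
      fprintf(stderr, "Missing model file: %s/%s\n", directory, expected[i]);
      missing++;
    }
  }
  return missing;
}
```

A return value of 0 means all three files could be opened. It does not verify that their contents are valid models.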

Error Handling

Most functions return error codes that can be converted to human-readable strings:

int32_t result = moonshine_transcribe_without_streaming(...);
if (result != MOONSHINE_ERROR_NONE) {
  const char* error_msg = moonshine_error_to_string(result);
  fprintf(stderr, "Error: %s\n", error_msg);
}

Common Error Codes

Code	Constant	Description
0	`MOONSHINE_ERROR_NONE`	Success
-1	`MOONSHINE_ERROR_UNKNOWN`	Unknown error
-2	`MOONSHINE_ERROR_INVALID_HANDLE`	Invalid transcriber or stream handle
-3	`MOONSHINE_ERROR_INVALID_ARGUMENT`	Invalid function argument
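
For log output it is sometimes handy to print the constant name alongside the message from `moonshine_error_to_string()`. The sketch below mirrors the table above using local `EXAMPLE_*` values, which are assumptions made for illustration. Real code should use the `MOONSHINE_ERROR_*` constants from `moonshine-c-api.h`, and `example_error_name` is a hypothetical helper.

```c
#include <stdint.h>

/* Local mirror of the documented values, for illustration only. */
enum {
  EXAMPLE_ERROR_NONE = 0,
  EXAMPLE_ERROR_UNKNOWN = -1,
  EXAMPLE_ERROR_INVALID_HANDLE = -2,
  EXAMPLE_ERROR_INVALID_ARGUMENT = -3
};

/* Hypothetical helper: maps a documented code to its constant name. */
static const char *example_error_name(int32_t code) {
  switch (code) {
    case EXAMPLE_ERROR_NONE:
      return "MOONSHINE_ERROR_NONE";
    case EXAMPLE_ERROR_UNKNOWN:
      return "MOONSHINE_ERROR_UNKNOWN";
    case EXAMPLE_ERROR_INVALID_HANDLE:
      return "MOONSHINE_ERROR_INVALID_HANDLE";
    case EXAMPLE_ERROR_INVALID_ARGUMENT:
      return "MOONSHINE_ERROR_INVALID_ARGUMENT";
    default:
      return "(code not listed in the table)";
  }
}
```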

Thread Safety

All API calls are thread-safe. However, computation on a single transcriber is serialized, so concurrent calls to the same transcriber from multiple threads are processed one at a time, which increases latency. For best performance with multiple audio sources:

Use multiple streams on a single transcriber (shares model resources)
Or create separate transcribers for truly parallel processing

Memory Management

Transcript data returned by the library is owned by the transcriber and remains valid only until one of the following happens:

Another call is made to that transcriber
The transcriber is freed with moonshine_free_transcriber()

Make copies of any data you need to retain beyond these points.
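
Copying can be as simple as duplicating a line's text before the next library call. The helper below is a minimal sketch; `copy_line_text` is a hypothetical name, not part of the Moonshine API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of `text`, or NULL if
 * `text` is NULL or the allocation fails. The caller releases the copy
 * with free(). */
static char *copy_line_text(const char *text) {
  if (text == NULL) return NULL;
  size_t length = strlen(text) + 1; /* include the terminating NUL */
  char *copy = malloc(length);
  if (copy != NULL) memcpy(copy, text, length);
  return copy;
}
```

For example, `char *saved = copy_line_text(transcript->lines[i].text);` keeps the text usable after later calls to the transcriber, provided the copy is made while the transcript is still valid.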

Next Steps

Function Reference: Detailed documentation for all C API functions
Python API: Higher-level Python interface (recommended for most users)
