Introduction
The Moonshine C API is the low-level interface that all other language bindings (Python, Swift, Java, etc.) use to interact with the Moonshine Voice library. This API provides direct access to transcription, streaming, and intent recognition capabilities.
Most developers should use the language-specific bindings for their platform (Python, Swift, Java, etc.) rather than the C API directly. The C API is primarily intended for:
- Creating new language bindings
- Porting to new platforms
- Low-level optimization and debugging
Key Features
- Thread-safe: All API calls are thread-safe and can be called from multiple threads concurrently
- Flexible input: Supports audio input of any length at various sample rates (16kHz recommended)
- Streaming support: Incremental transcription with caching for low-latency real-time applications
- Multiple languages: English, Spanish, Mandarin, Japanese, Korean, Vietnamese, Ukrainian, and Arabic
- Intent recognition: Semantic matching for voice command interfaces
Architecture Overview
The Moonshine library processes audio through these main components:
- Transcriber: Core engine that loads models and manages transcription
- Streams: Handlers for individual audio input sources (one transcriber can manage multiple streams)
- Transcripts: Collections of transcript lines representing detected speech segments
- Intent Recognizer: Semantic matching engine for command recognition
Basic Workflow
Non-Streaming Transcription
Use this mode to transcribe complete audio files or recordings.
Streaming Transcription
Use this mode for real-time transcription from microphones or live audio sources.
Audio Format Requirements
- Audio data must be floating-point PCM values between -1.0 and 1.0
- Only mono (single channel) audio is supported
- While the library supports various sample rates, 16kHz is recommended to avoid resampling overhead
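Many capture APIs and WAV files deliver interleaved 16-bit integer samples rather than mono floats, so a conversion step is often needed before audio can be handed to the library. The following is a minimal sketch of that conversion. The helper name `pcm16_to_mono_float` is hypothetical and not part of the Moonshine API, and averaging the channels is only one possible way to downmix.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the Moonshine API.
 * Converts interleaved signed 16-bit PCM into mono float samples in the
 * range [-1.0, 1.0) by averaging the channels of each frame.
 * mono_out must have room for frame_count floats.
 * Returns the number of mono samples written, or 0 on invalid input. */
size_t pcm16_to_mono_float(const int16_t *interleaved, size_t frame_count,
                           size_t channel_count, float *mono_out) {
    if (interleaved == NULL || mono_out == NULL || channel_count == 0) {
        return 0;
    }
    for (size_t frame = 0; frame < frame_count; ++frame) {
        float sum = 0.0f;
        for (size_t ch = 0; ch < channel_count; ++ch) {
            sum += (float)interleaved[frame * channel_count + ch];
        }
        /* Average the channels, then scale by 2^15 so that the most
         * negative int16 value maps to exactly -1.0. */
        mono_out[frame] = (sum / (float)channel_count) / 32768.0f;
    }
    return frame_count;
}
```

This sketch does not resample. If the source is not already at 16kHz, the library's own resampling applies, at the cost noted above.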
Model Files
The transcriber expects three files in the model directory:
- encoder_model.ort: Quantized ONNX encoder model
- decoder_model_merged.ort: Quantized ONNX decoder model
- tokenizer.bin: Token-to-character mapping in binary format
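Checking that the three files listed above are present before loading a model can turn a vague load failure into a clear message. The sketch below is one way to do that with the C standard library only. The helper `count_missing_model_files` is hypothetical and not part of the Moonshine API, and it assumes `/` works as a path separator on the target platform.

```c
#include <stddef.h>
#include <stdio.h>

/* File names taken from the model file list above. */
static const char *const REQUIRED_MODEL_FILES[] = {
    "encoder_model.ort",
    "decoder_model_merged.ort",
    "tokenizer.bin",
};

/* Hypothetical helper, not part of the Moonshine API.
 * Returns the number of required files that cannot be opened for reading,
 * or -1 if model_dir is NULL or a full path does not fit in the buffer. */
int count_missing_model_files(const char *model_dir) {
    if (model_dir == NULL) {
        return -1;
    }
    const size_t file_count =
        sizeof(REQUIRED_MODEL_FILES) / sizeof(REQUIRED_MODEL_FILES[0]);
    int missing = 0;
    for (size_t i = 0; i < file_count; ++i) {
        char path[1024];
        int written = snprintf(path, sizeof(path), "%s/%s", model_dir,
                               REQUIRED_MODEL_FILES[i]);
        if (written < 0 || (size_t)written >= sizeof(path)) {
            return -1; /* path was truncated or formatting failed */
        }
        FILE *file = fopen(path, "rb");
        if (file == NULL) {
            fprintf(stderr, "missing model file: %s\n", path);
            ++missing;
        } else {
            fclose(file);
        }
    }
    return missing;
}
```

A return value of 0 means all three files could be opened. It does not validate their contents.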
Error Handling
Most functions return error codes that can be converted to human-readable strings.
Common Error Codes
| Code | Constant | Description |
|---|---|---|
| 0 | MOONSHINE_ERROR_NONE | Success |
| -1 | MOONSHINE_ERROR_UNKNOWN | Unknown error |
| -2 | MOONSHINE_ERROR_INVALID_HANDLE | Invalid transcriber or stream handle |
| -3 | MOONSHINE_ERROR_INVALID_ARGUMENT | Invalid function argument |
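For bindings or logging code that needs the table above in machine-readable form, the codes can be mirrored in a small lookup table. This is only an illustrative sketch: `error_code_info` and `lookup_error_code` are hypothetical names, the values are copied from the table, and the library's own conversion function should be preferred where it is available.

```c
#include <stddef.h>

/* Hypothetical local mirror of the error code table above. */
struct error_code_info {
    int code;
    const char *name;
    const char *description;
};

static const struct error_code_info ERROR_CODE_TABLE[] = {
    {0, "MOONSHINE_ERROR_NONE", "Success"},
    {-1, "MOONSHINE_ERROR_UNKNOWN", "Unknown error"},
    {-2, "MOONSHINE_ERROR_INVALID_HANDLE",
     "Invalid transcriber or stream handle"},
    {-3, "MOONSHINE_ERROR_INVALID_ARGUMENT", "Invalid function argument"},
};

/* Returns the table entry for a code, or NULL if the code is not one of
 * the four listed above. */
const struct error_code_info *lookup_error_code(int code) {
    const size_t entry_count =
        sizeof(ERROR_CODE_TABLE) / sizeof(ERROR_CODE_TABLE[0]);
    for (size_t i = 0; i < entry_count; ++i) {
        if (ERROR_CODE_TABLE[i].code == code) {
            return &ERROR_CODE_TABLE[i];
        }
    }
    return NULL;
}
```

A caller would typically treat any non-zero return value as a failure and fall back to a generic message when `lookup_error_code` returns NULL, since the table lists only the common codes.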
Thread Safety
All API calls are thread-safe. However, calculations on a single transcriber are serialized, so concurrent calls to the same transcriber from multiple threads will be processed sequentially, affecting latency. For best performance with multiple audio sources:
- Use multiple streams on a single transcriber (shares model resources)
- Or create separate transcribers for truly parallel processing
Memory Management
Next Steps
Function Reference
Detailed documentation for all C API functions
Python API
Higher-level Python interface (recommended for most users)