The Transcriber class is the core of Moonshine Voice, handling the speech-to-text pipeline including voice activity detection, audio segmentation, and transcription.

Class Definition

from moonshine_voice import Transcriber, ModelArch

transcriber = Transcriber(
    model_path: str,
    model_arch: ModelArch = ModelArch.BASE,
    update_interval: float = 0.5,
    options: dict = None
)

Constructor Parameters

model_path (str, required)
Path to the directory containing model files (encoder_model.ort, decoder_model_merged.ort, tokenizer.bin).

model_arch (ModelArch, default: ModelArch.BASE)
Model architecture to use. Options:
  • ModelArch.TINY - 26M parameters, fastest
  • ModelArch.BASE - 58M parameters, balanced
  • ModelArch.TINY_STREAMING - 34M parameters, streaming support
  • ModelArch.SMALL_STREAMING - 123M parameters, streaming support
  • ModelArch.MEDIUM_STREAMING - 245M parameters, highest accuracy

update_interval (float, default: 0.5)
How often (in seconds) to run transcription updates. Lower values provide more frequent updates but use more CPU.
options (dict, default: None)
Advanced configuration options as string key-value pairs.
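As a rough illustration of the architecture tradeoffs listed above, a helper like the following could map coarse requirements onto a model choice. pick_model_arch is hypothetical, not part of moonshine_voice, and it returns attribute names rather than ModelArch members so the sketch runs without the library installed; real code would use e.g. ModelArch.TINY_STREAMING directly.

```python
def pick_model_arch(streaming: bool, prefer_accuracy: bool) -> str:
    """Map two coarse requirements onto the ModelArch options above."""
    if streaming:
        # SMALL_STREAMING sits between these two when a middle ground is needed.
        return "MEDIUM_STREAMING" if prefer_accuracy else "TINY_STREAMING"
    return "BASE" if prefer_accuracy else "TINY"
```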

Methods

transcribe_without_streaming

Transcribe pre-recorded audio without streaming.
transcript = transcriber.transcribe_without_streaming(
    audio_data: List[float],
    sample_rate: int = 16000,
    flags: int = 0
) -> Transcript
audio_data (List[float], required)
Audio samples as mono PCM floats between -1.0 and 1.0.
sample_rate (int, default: 16000)
Sample rate in Hz. The library will resample to 16 kHz internally.
flags (int, default: 0)
Reserved for future use.
Returns: Transcript object with finalized transcription lines

start

Begin a new streaming transcription session.
transcriber.start()
Resets the transcript and prepares for new audio input. Must be called before add_audio().

stop

End the current streaming session.
transcriber.stop()
Marks any active line as complete and calls completion event listeners.

add_audio

Add audio data to the active stream.
transcriber.add_audio(
    audio_data: List[float],
    sample_rate: int = 16000
)
audio_data (List[float], required)
Mono PCM audio samples as floats (-1.0 to 1.0). Can be any chunk size.
sample_rate (int, default: 16000)
Sample rate of the input audio.
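Because add_audio accepts chunks of any size, a prerecorded buffer can be replayed through the streaming path with a simple splitter. chunk_audio below is an illustrative helper, not part of moonshine_voice; the commented calls use only the methods documented on this page.

```python
def chunk_audio(samples, chunk_size):
    """Split a list of float samples into consecutive fixed-size chunks
    (the final chunk may be shorter)."""
    return [samples[i:i + chunk_size] for i in range(0, len(samples), chunk_size)]

# Usage with a transcriber created as shown above:
# transcriber.start()
# for chunk in chunk_audio(audio_data, 1600):  # 1600 samples = 0.1 s at 16 kHz
#     transcriber.add_audio(chunk, sample_rate=16000)
# transcriber.stop()
```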

update_transcription

Manually trigger a transcription update.
transcript = transcriber.update_transcription(
    flags: int = 0
) -> Transcript
flags (int, default: 0)
Use Transcriber.MOONSHINE_FLAG_FORCE_UPDATE to bypass the 200 ms cache.
Returns: Current Transcript object
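For callers that poll instead of registering event listeners, interim results can be sampled on a timer. poll_transcript is a hypothetical helper, not a library function; it relies only on update_transcription and the transcript.lines / line.text attributes used elsewhere on this page.

```python
import time

def poll_transcript(transcriber, duration_s, interval_s=0.25):
    """Poll update_transcription() for duration_s seconds, returning the
    text of the transcript lines seen on the final poll."""
    lines = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        # Pass Transcriber.MOONSHINE_FLAG_FORCE_UPDATE as flags here to
        # bypass the 200 ms result cache described above.
        transcript = transcriber.update_transcription()
        lines = [line.text for line in transcript.lines]
        time.sleep(interval_s)
    return lines
```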

create_stream

Create an additional stream for processing multiple audio sources.
stream = transcriber.create_stream(
    flags: int = 0,
    update_interval: float = None
) -> Stream
flags (int, default: 0)
Reserved for future use.
update_interval (float, default: None)
Override the transcriber’s default update interval for this stream.
Returns: Stream object

add_listener

Register an event listener for transcription events.
transcriber.add_listener(listener: TranscriptEventListener)
listener (TranscriptEventListener, required)
Object implementing the event listener protocol with methods:
  • on_line_started(event: LineStarted)
  • on_line_updated(event: LineUpdated)
  • on_line_text_changed(event: LineTextChanged)
  • on_line_completed(event: LineCompleted)
  • on_error(event: Error)

remove_listener

Remove a registered event listener.
transcriber.remove_listener(listener: TranscriptEventListener)

Context Manager Support

The Transcriber class supports Python’s context manager protocol:
with Transcriber(model_path=path, model_arch=ModelArch.BASE) as transcriber:
    transcriber.start()
    transcriber.add_audio(audio_data, sample_rate)
    transcriber.stop()
# Automatically cleaned up

Example: File Transcription

from moonshine_voice import Transcriber, ModelArch, load_wav_file

# Load audio file
audio_data, sample_rate = load_wav_file("speech.wav")

# Create transcriber
transcriber = Transcriber(
    model_path="/path/to/models",
    model_arch=ModelArch.BASE
)

# Transcribe
transcript = transcriber.transcribe_without_streaming(
    audio_data,
    sample_rate
)

# Print results
for line in transcript.lines:
    print(f"[{line.start_time:.2f}s] {line.text}")

transcriber.close()

Example: Streaming with Events

from moonshine_voice import Transcriber, TranscriptEventListener

class MyListener(TranscriptEventListener):
    def on_line_started(self, event):
        print(f"Started: {event.line.text}")
    
    def on_line_text_changed(self, event):
        print(f"Updated: {event.line.text}")
    
    def on_line_completed(self, event):
        print(f"Final: {event.line.text}")
    
    def on_error(self, event):
        print(f"Error: {event.error}")

transcriber = Transcriber(
    model_path="/path/to/models",
    model_arch=ModelArch.TINY_STREAMING,
    update_interval=0.5
)

transcriber.add_listener(MyListener())

transcriber.start()
# Feed audio chunks with add_audio()
transcriber.stop()
