The Transcriber class is the core of Moonshine Voice, handling the speech-to-text pipeline including voice activity detection, audio segmentation, and transcription.

Class Definition

from moonshine_voice import Transcriber, ModelArch

transcriber = Transcriber(
    model_path: str,
    model_arch: ModelArch = ModelArch.BASE,
    update_interval: float = 0.5,
    options: dict = None
)

Constructor Parameters

model_path (str, required)
Path to the directory containing model files (encoder_model.ort, decoder_model_merged.ort, tokenizer.bin).

model_arch (ModelArch, default: ModelArch.BASE)
Model architecture to use. Options:
  • ModelArch.TINY - 26M parameters, fastest
  • ModelArch.BASE - 58M parameters, balanced
  • ModelArch.TINY_STREAMING - 34M parameters, streaming support
  • ModelArch.SMALL_STREAMING - 123M parameters, streaming support
  • ModelArch.MEDIUM_STREAMING - 245M parameters, highest accuracy

update_interval (float, default: 0.5)
How often (in seconds) to run transcription updates. Lower values provide more frequent updates but use more CPU.
options (dict, default: None)
Advanced configuration options as string key-value pairs.
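As a rough illustration of the architecture tradeoffs listed above, a helper like the following could map coarse requirements onto a model choice. pick_model_arch is hypothetical, not part of moonshine_voice, and it returns attribute names rather than ModelArch members so the sketch runs without the library installed; real code would use e.g. ModelArch.TINY_STREAMING directly.

```python
def pick_model_arch(streaming: bool, prefer_accuracy: bool) -> str:
    """Map two coarse requirements onto the ModelArch options above."""
    if streaming:
        # SMALL_STREAMING sits between these two when a middle ground is needed.
        return "MEDIUM_STREAMING" if prefer_accuracy else "TINY_STREAMING"
    return "BASE" if prefer_accuracy else "TINY"
```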

Methods

transcribe_without_streaming

Transcribe pre-recorded audio without streaming.
transcript = transcriber.transcribe_without_streaming(
    audio_data: List[float],
    sample_rate: int = 16000,
    flags: int = 0
) -> Transcript
audio_data (List[float], required)
Audio samples as mono PCM floats between -1.0 and 1.0.
sample_rate (int, default: 16000)
Sample rate in Hz. The library will resample to 16 kHz internally.
flags (int, default: 0)
Reserved for future use.
Returns: Transcript object with finalized transcription lines

start

Begin a new streaming transcription session.
transcriber.start()
Resets the transcript and prepares for new audio input. Must be called before add_audio().

stop

End the current streaming session.
transcriber.stop()
Marks any active line as complete and calls completion event listeners.

add_audio

Add audio data to the active stream.
transcriber.add_audio(
    audio_data: List[float],
    sample_rate: int = 16000
)
audio_data (List[float], required)
Mono PCM audio samples as floats (-1.0 to 1.0). Can be any chunk size.
sample_rate (int, default: 16000)
Sample rate of the input audio.
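Because add_audio accepts chunks of any size, a prerecorded buffer can be replayed through the streaming path with a simple splitter. chunk_audio below is an illustrative helper, not part of moonshine_voice; the commented calls use only the methods documented on this page.

```python
def chunk_audio(samples, chunk_size):
    """Split a list of float samples into consecutive fixed-size chunks
    (the final chunk may be shorter)."""
    return [samples[i:i + chunk_size] for i in range(0, len(samples), chunk_size)]

# Usage with a transcriber created as shown above:
# transcriber.start()
# for chunk in chunk_audio(audio_data, 1600):  # 1600 samples = 0.1 s at 16 kHz
#     transcriber.add_audio(chunk, sample_rate=16000)
# transcriber.stop()
```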

update_transcription

Manually trigger a transcription update.
transcript = transcriber.update_transcription(
    flags: int = 0
) -> Transcript
flags (int, default: 0)
Use Transcriber.MOONSHINE_FLAG_FORCE_UPDATE to bypass the 200 ms cache.
Returns: Current Transcript object
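For callers that poll instead of registering event listeners, interim results can be sampled on a timer. poll_transcript is a hypothetical helper, not a library function; it relies only on update_transcription and the transcript.lines / line.text attributes used elsewhere on this page.

```python
import time

def poll_transcript(transcriber, duration_s, interval_s=0.25):
    """Poll update_transcription() for duration_s seconds, returning the
    text of the transcript lines seen on the final poll."""
    lines = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        # Pass Transcriber.MOONSHINE_FLAG_FORCE_UPDATE as flags here to
        # bypass the 200 ms result cache described above.
        transcript = transcriber.update_transcription()
        lines = [line.text for line in transcript.lines]
        time.sleep(interval_s)
    return lines
```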

create_stream

Create an additional stream for processing multiple audio sources.
stream = transcriber.create_stream(
    flags: int = 0,
    update_interval: float = None
) -> Stream
flags (int, default: 0)
Reserved for future use.
update_interval (float, default: None)
Override the transcriber’s default update interval for this stream.
Returns: Stream object

add_listener

Register an event listener for transcription events.
transcriber.add_listener(listener: TranscriptEventListener)
listener (TranscriptEventListener, required)
Object implementing the event listener protocol with methods:
  • on_line_started(event: LineStarted)
  • on_line_updated(event: LineUpdated)
  • on_line_text_changed(event: LineTextChanged)
  • on_line_completed(event: LineCompleted)
  • on_error(event: Error)

remove_listener

Remove a registered event listener.
transcriber.remove_listener(listener: TranscriptEventListener)

Context Manager Support

The Transcriber class supports Python’s context manager protocol:
with Transcriber(model_path=path, model_arch=ModelArch.BASE) as transcriber:
    transcriber.start()
    transcriber.add_audio(audio_data, sample_rate)
    transcriber.stop()
# Automatically cleaned up

Example: File Transcription

from moonshine_voice import Transcriber, ModelArch, load_wav_file

# Load audio file
audio_data, sample_rate = load_wav_file("speech.wav")

# Create transcriber
transcriber = Transcriber(
    model_path="/path/to/models",
    model_arch=ModelArch.BASE
)

# Transcribe
transcript = transcriber.transcribe_without_streaming(
    audio_data,
    sample_rate
)

# Print results
for line in transcript.lines:
    print(f"[{line.start_time:.2f}s] {line.text}")

transcriber.close()

Example: Streaming with Events

from moonshine_voice import Transcriber, TranscriptEventListener

class MyListener(TranscriptEventListener):
    def on_line_started(self, event):
        print(f"Started: {event.line.text}")
    
    def on_line_text_changed(self, event):
        print(f"Updated: {event.line.text}")
    
    def on_line_completed(self, event):
        print(f"Final: {event.line.text}")
    
    def on_error(self, event):
        print(f"Error: {event.error}")

transcriber = Transcriber(
    model_path="/path/to/models",
    model_arch=ModelArch.TINY_STREAMING,
    update_interval=0.5
)

transcriber.add_listener(MyListener())

transcriber.start()
# Feed audio chunks with add_audio()
transcriber.stop()
