TranscribeOptions is the interface for configuring audio transcription behavior. It is passed to the transcribe(), transcribeData(), and transcribeRealtime() methods.

Properties

language
string
default:"auto"
Spoken language code (e.g., 'en', 'es', 'fr'). Set to 'auto' for automatic language detection.
translate
boolean
default:"false"
Translate the transcription from the source language into English. When enabled, the output is English regardless of the input language.
maxThreads
number
default:"2 or 4"
Number of threads to use during computation. Defaults to 2 on 4-core devices and 4 on devices with more cores.
nProcessors
number
default:"1"
Number of processors to use for parallel processing with whisper_full_parallel. Set to 1 to use whisper_full instead.
maxContext
number
Maximum number of text context tokens to store. Controls how much previous transcription context is retained.
maxLen
number
Maximum segment length in characters. Limits the length of individual transcription segments.
tokenTimestamps
boolean
Enable token-level timestamps. When enabled, provides more granular timestamp information for each token.
tdrzEnable
boolean
Enable tinydiarize speaker diarization. Requires a tdrz model to be loaded.
wordThold
number
Word timestamp probability threshold. Controls the confidence threshold for word-level timestamps.
offset
number
Time offset in milliseconds. Specifies where to start transcription in the audio file.
duration
number
Duration of audio to process in milliseconds. If set, only processes the specified duration from the offset.
temperature
number
Initial decoding temperature. Controls the randomness in the decoding process. Higher values increase randomness.
temperatureInc
number
Temperature increment value. Used when adjusting temperature during decoding.
beamSize
number
Beam size for beam search decoding. Larger values can improve accuracy but increase computation time.
bestOf
number
Number of best candidates to keep during decoding. Higher values may improve accuracy at the cost of performance.
prompt
string
Initial prompt text to guide transcription. Provides context to improve transcription accuracy for specific terminology or style.
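
Because offset and duration are both expressed in milliseconds, a small helper can turn a start/end pair given in seconds into those two fields. This is a minimal sketch and not part of whisper.rn: `TimeWindow` and `windowFromSeconds` are hypothetical names introduced for illustration, and only the `offset` and `duration` fields come from the property list above.

```typescript
// Hypothetical local type covering only the two documented fields;
// the real TranscribeOptions interface has many more properties.
interface TimeWindow {
  offset: number   // start position in milliseconds
  duration: number // length to process in milliseconds
}

// Convert a [startSec, endSec) range in seconds into offset/duration in ms.
function windowFromSeconds(startSec: number, endSec: number): TimeWindow {
  if (startSec < 0 || endSec <= startSec) {
    throw new RangeError('expected 0 <= startSec < endSec')
  }
  return {
    offset: Math.round(startSec * 1000),
    duration: Math.round((endSec - startSec) * 1000),
  }
}

// Process only the audio between 0:30 and 0:45.
const clip = windowFromSeconds(30, 45)
```

The result can be spread into a larger options object, for example `{ language: 'en', ...clip }`.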

Usage Example

import { initWhisper } from 'whisper.rn'
import type { TranscribeOptions } from 'whisper.rn'

const context = await initWhisper({
  filePath: 'path/to/model.bin'
})

const options: TranscribeOptions = {
  language: 'en',
  translate: false,
  maxThreads: 4,
  tokenTimestamps: true,
  prompt: 'Technical discussion about React Native'
}

const { promise } = context.transcribe('path/to/audio.wav', options)
const result = await promise
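
The example above hard-codes maxThreads: 4. If you would rather mirror the documented default (2 on 4-core devices, 4 on devices with more cores), the rule can be sketched as a pure function. `pickMaxThreads` is a hypothetical helper, not a whisper.rn API. How the core count is obtained is left to the caller, and the handling of devices with fewer than four cores is an assumption, since the property description does not cover that case.

```typescript
// Hypothetical helper mirroring the documented maxThreads default.
// coreCount must be supplied by the caller.
function pickMaxThreads(coreCount: number): number {
  const cores = Math.max(1, Math.floor(coreCount))
  // Documented: 4-core devices default to 2 threads.
  if (cores === 4) return 2
  // Documented: devices with more cores default to 4 threads.
  // Assumption: with fewer than 4 cores, use at most one thread per core.
  return Math.min(4, cores)
}

const threadsForEightCores = pickMaxThreads(8)
```

The returned number can be assigned to maxThreads when building the options object.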
