TranscribeOptions

Interface for configuring audio transcription behavior. Pass it to the transcribe(), transcribeData(), and transcribeRealtime() methods.

Properties

language

string

default:"auto"

Spoken language code (e.g., 'en', 'es', 'fr'). Set to 'auto' for automatic language detection.

translate

boolean

default:"false"

Translate the transcription from the source language into English. When enabled, the output is in English regardless of the input language.
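The interaction between language and translate can be sketched as a small helper. This is an illustration only: LanguageOptions, expectedOutputLanguage, and the 'detected' label are hypothetical names introduced here, not part of whisper.rn.

```typescript
// Local stand-in for the two properties above (hypothetical, not the library's type).
interface LanguageOptions {
  language?: string
  translate?: boolean
}

// Describes which language the transcript text is expected to be in.
function expectedOutputLanguage(options: LanguageOptions = {}): string {
  const language = options.language ?? 'auto'  // documented default
  const translate = options.translate ?? false // documented default
  if (translate) return 'en'                   // translation always targets English
  // 'detected' is a placeholder label for "whatever auto-detection finds".
  return language === 'auto' ? 'detected' : language
}

console.log(expectedOutputLanguage())                                    // detected
console.log(expectedOutputLanguage({ language: 'es' }))                  // es
console.log(expectedOutputLanguage({ language: 'es', translate: true })) // en
```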

maxThreads

number

default:"2 or 4"

Number of threads to use during computation. Default is 2 for 4-core devices, 4 for devices with more cores.
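A minimal sketch of the default described above, written as plain TypeScript. The helper names are hypothetical, and the behavior for devices with fewer than 4 cores is an assumption, since the description only covers 4 cores and more.

```typescript
// Hypothetical helper, not part of whisper.rn: mirrors the documented default.
function defaultMaxThreads(coreCount: number): number {
  // Documented: 2 threads on 4-core devices, 4 threads on devices with more cores.
  // Assumption: devices with fewer than 4 cores also use 2.
  return coreCount > 4 ? 4 : 2
}

// An explicitly set maxThreads takes precedence over the computed default.
function resolveMaxThreads(coreCount: number, maxThreads?: number): number {
  return maxThreads ?? defaultMaxThreads(coreCount)
}

console.log(resolveMaxThreads(4))    // 2
console.log(resolveMaxThreads(8))    // 4
console.log(resolveMaxThreads(8, 6)) // 6
```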

nProcessors

number

default:"1"

Number of processors to use for parallel processing with whisper_full_parallel. Set to 1 to use whisper_full instead.
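The choice between the two native routines named above can be pictured as follows. This is a sketch of the documented rule, not the library's source; entryPointFor is a hypothetical name, and treating values below 1 like 1 is an assumption.

```typescript
type WhisperEntryPoint = 'whisper_full' | 'whisper_full_parallel'

// Hypothetical helper: picks the routine implied by nProcessors.
function entryPointFor(nProcessors: number = 1): WhisperEntryPoint {
  // Documented: 1 selects whisper_full; larger values use whisper_full_parallel.
  // Assumption: values below 1 behave like 1.
  return nProcessors > 1 ? 'whisper_full_parallel' : 'whisper_full'
}

console.log(entryPointFor())  // whisper_full
console.log(entryPointFor(2)) // whisper_full_parallel
```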

maxContext

number

Maximum number of text context tokens to store. Controls how much previous transcription context is retained.

maxLen

number

Maximum segment length in characters. Limits the length of individual transcription segments.

tokenTimestamps

boolean

Enable token-level timestamps. When enabled, provides more granular timestamp information for each token.

tdrzEnable

boolean

Enable tinydiarize speaker diarization. Requires a tdrz model to be loaded.

wordThold

number

Word timestamp probability threshold. Controls the confidence threshold for word-level timestamps.

offset

number

Time offset in milliseconds. Specifies where to start transcription in the audio file.

duration

number

Duration of audio to process in milliseconds. If set, only processes the specified duration from the offset.
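To make the relationship between offset and duration concrete, the sketch below computes the slice of audio that would be processed. processedWindow and AudioWindow are hypothetical names. Clamping to the audio length, and treating an unset duration as "until the end", are assumptions rather than documented behavior.

```typescript
interface AudioWindow {
  startMs: number
  endMs: number
}

// Hypothetical helper: which part of the audio do offset and duration select?
function processedWindow(audioLengthMs: number, offset = 0, duration?: number): AudioWindow {
  // Keep the start inside the audio (assumption: out-of-range offsets are clamped).
  const startMs = Math.min(Math.max(offset, 0), audioLengthMs)
  // duration counts from the offset; without it, process to the end (assumption).
  const endMs = duration === undefined
    ? audioLengthMs
    : Math.min(startMs + duration, audioLengthMs)
  return { startMs, endMs }
}

console.log(processedWindow(60000, 10000, 15000)) // { startMs: 10000, endMs: 25000 }
console.log(processedWindow(60000, 50000, 15000)) // { startMs: 50000, endMs: 60000 }
```

With a 60-second file, offset: 10000 and duration: 15000 select the span from 10 s to 25 s.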

temperature

number

Initial decoding temperature. Controls the randomness in the decoding process. Higher values increase randomness.

temperatureInc

number

Temperature increment. The amount by which the temperature is raised when decoding is retried at a higher temperature.
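One way to picture how temperature and temperatureInc combine is as a list of temperatures to try in order. The sketch below assumes the underlying engine starts at temperature, raises it by temperatureInc on each retry, and stops at 1.0. That retry model and the temperatureSchedule helper are assumptions for illustration, not documented whisper.rn behavior.

```typescript
// Hypothetical helper: temperatures that would be tried, in order.
function temperatureSchedule(temperature: number, temperatureInc: number): number[] {
  const schedule: number[] = [temperature] // the initial temperature is always tried
  if (temperatureInc <= 0) return schedule  // no increment means no retries
  for (let i = 1; ; i++) {
    // Round to 3 decimals to avoid floating-point drift such as 0.6000000000000001.
    const t = Math.round((temperature + i * temperatureInc) * 1000) / 1000
    if (t > 1.0) break                      // assumed upper bound of 1.0
    schedule.push(t)
  }
  return schedule
}

console.log(temperatureSchedule(0, 0.2)) // [ 0, 0.2, 0.4, 0.6, 0.8, 1 ]
console.log(temperatureSchedule(0.5, 0)) // [ 0.5 ]
```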

beamSize

number

Beam size for beam search decoding. Larger values can improve accuracy but increase computation time.

bestOf

number

Number of best candidates to keep during decoding. Higher values may improve accuracy at the cost of performance.

prompt

string

Initial prompt text to guide transcription. Provides context to improve transcription accuracy for specific terminology or style.
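Since the prompt is free text, one simple approach is to assemble it from a topic and a list of domain terms. The sketch below is one possible way to do that. buildPrompt and the "Terms:" wording are hypothetical choices, not a format the library requires.

```typescript
// Hypothetical helper: builds a prompt string from a topic and a glossary.
function buildPrompt(topic: string, terms: string[]): string {
  // Drop blank entries so the prompt does not contain stray commas.
  const glossary = terms.map((term) => term.trim()).filter((term) => term.length > 0)
  return glossary.length > 0
    ? `${topic}. Terms: ${glossary.join(', ')}.`
    : `${topic}.`
}

console.log(buildPrompt('Technical discussion about React Native', ['JSI', 'Hermes', 'TurboModules']))
// Technical discussion about React Native. Terms: JSI, Hermes, TurboModules.
```

The resulting string can be passed as the prompt property.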

Usage Example

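import { initWhisper } from 'whisper.rn'
import type { TranscribeOptions } from 'whisper.rn'

// Load the model and create a transcription context.
const context = await initWhisper({
  filePath: 'path/to/model.bin'
})

const options: TranscribeOptions = {
  language: 'en',        // skip automatic language detection
  translate: false,      // keep the source language
  maxThreads: 4,
  tokenTimestamps: true, // token-level timestamps
  prompt: 'Technical discussion about React Native'
}

// transcribe() returns an object whose promise resolves to the result.
const { promise } = context.transcribe('path/to/audio.wav', options)
const result = await promise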

Related

TranscribeResult - The result object returned from transcription
transcribe() - Transcribe audio files
transcribeData() - Transcribe audio data
