Options that control transcription behavior, used by the transcribe(), transcribeData(), and transcribeRealtime() methods.
Properties
Spoken language code (e.g. 'en', 'es', 'fr'). Set to 'auto' for automatic language detection.
Translate from source language to English. When enabled, the transcription will be translated to English regardless of the input language.
Number of threads to use during computation. Default is 2 for 4-core devices, 4 for devices with more cores.
Number of processors to use for parallel processing with whisper_full_parallel. Set to 1 to use whisper_full instead.
Maximum number of text context tokens to retain. Controls how much previous transcription context is carried over between segments.
Maximum segment length in characters. Limits the length of individual transcription segments.
Enable token-level timestamps. When enabled, provides more granular timestamp information for each token.
Enable tinydiarize speaker diarization. Requires a tdrz model to be loaded.
Word timestamp probability threshold. Controls the confidence threshold for word-level timestamps.
Time offset in milliseconds. Specifies where to start transcription in the audio file.
Duration of audio to process in milliseconds. If set, only processes the specified duration from the offset.
Initial decoding temperature. Controls the randomness in the decoding process. Higher values increase randomness.
Temperature increment value. Used when adjusting temperature during decoding.
Beam size for beam search decoding. Larger values can improve accuracy but increase computation time.
Number of best candidates to keep during decoding. Higher values may improve accuracy at the cost of performance.
Initial prompt text to guide transcription. Provides context to improve transcription accuracy for specific terminology or style.
Usage Example
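A minimal sketch of passing these options to a transcription call. The property names below (`language`, `translate`, `maxLen`, `tokenTimestamps`, `temperature`, `beamSize`, `prompt`) and the `initWhisper`/`transcribe` call shape are assumptions modeled on a whisper.cpp-style binding; verify them against your library's actual exports and option type.

```typescript
// Hypothetical option names — check your library's TranscribeOptions type.
const options = {
  language: 'auto',      // spoken language code ('en', 'es', ...), or 'auto' to detect
  translate: false,      // when true, output is translated to English
  maxLen: 60,            // maximum segment length in characters
  tokenTimestamps: true, // enable token-level timestamps
  temperature: 0.0,      // initial decoding temperature (higher = more random)
  beamSize: 5,           // beam search width; larger may improve accuracy
  prompt: 'Glossary: Kubernetes, WebRTC', // initial prompt to bias terminology
};

// Assumed call shape (adjust to your library): transcribe() returns a stop
// handle plus a promise that resolves to a TranscribeResult.
//
//   const ctx = await initWhisper({ filePath: 'ggml-base.bin' });
//   const { stop, promise } = ctx.transcribe('audio.wav', options);
//   const { result } = await promise;

console.log(options.language); // 'auto'
```

The options object is plain data, so it can be built once and reused across transcribe() and transcribeData() calls.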
Related
- TranscribeResult - The result object returned from transcription
- transcribe() - Transcribe audio files
- transcribeData() - Transcribe audio data