Skip to main content

Overview

The Segment and Token types represent transcribed audio with timing information. A Segment represents a complete phrase or sentence, while Token represents individual words within that segment.

Segment

Represents a segment of transcribed audio with timing information. May include detailed token-level (word-by-word) information when available.

Type definition

type Segment = Token & {
    tokens?: Token[];
};

Properties

text
string
required
The transcribed text for this segment.
start
number
required
Start time of the segment in seconds.
end
number
required
End time of the segment in seconds.
confidence
number
Confidence score for this transcription (between 0 and 1).Higher values indicate greater confidence in the transcription accuracy.
tokens
Token[]
Word-by-word breakdown of the transcription with individual timings.

Token

Represents a token (word or phrase) in the transcription with timing information.

Type definition

type Token = TimeRange & {
    confidence?: number;
    text: string;
};

Properties

text
string
required
The transcribed text (word or phrase).
start
number
required
Start time in seconds.Inherited from TimeRange type (from ffmpeg-simplified).
end
number
required
End time in seconds.Inherited from TimeRange type (from ffmpeg-simplified).
confidence
number
Confidence score for this transcription (between 0 and 1).

Usage examples

import { transcribe } from 'tafrigh';

const segments = await transcribe('audio.mp3');

// Access segment data
segments.forEach((segment) => {
  console.log(`[${segment.start}s - ${segment.end}s]: ${segment.text}`);
  console.log(`Confidence: ${segment.confidence}`);
});

Response structure example

[
  {
    "text": "Hello world",
    "start": 0,
    "end": 2.5,
    "confidence": 0.95,
    "tokens": [
      {
        "text": "Hello",
        "start": 0,
        "end": 1.2,
        "confidence": 0.97
      },
      {
        "text": "world",
        "start": 1.3,
        "end": 2.5,
        "confidence": 0.93
      }
    ]
  },
  {
    "text": "This is a test",
    "start": 3.0,
    "end": 5.5,
    "confidence": 0.88
  }
]

transcribe

Returns an array of Segment objects

Callbacks

onTranscriptionFinished receives Segment[]

Error handling

TranscriptionError includes partial segments

Resume failed transcriptions

Continue from partial transcriptions

Build docs developers (and LLMs) love