Resuming failed transcriptions

Transcription can fail because of network issues, API rate limits, or temporary service outages. When that happens, Tafrigh keeps the chunks that succeeded and lets you retry only the ones that failed, so no progress is lost.

Understanding transcription errors

When one or more chunks fail to transcribe after all retry attempts, transcribe() throws a TranscriptionError containing:

transcripts: All successfully transcribed chunks
failures: Metadata about each failed chunk (file path, index, error)
outputDir: Temporary directory where chunk files are stored
chunkFiles: Array of all chunk files that were created

class TranscriptionError extends Error {
  transcripts: Segment[];           // Successful transcriptions
  failures: FailedTranscription[];  // Failed chunk metadata
  outputDir: string;                // Temporary directory path
  chunkFiles: AudioChunk[];         // All chunks
}
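
To show how these fields relate, the sketch below derives a progress summary from an error-shaped object. The ErrorShape type and the summarize helper are hypothetical and only mirror the fields listed above; they are not part of tafrigh. It assumes every entry in chunkFiles either succeeded or appears in failures.

```typescript
// Hypothetical minimal shapes mirroring the fields described above.
type ChunkLike = { filename: string };
type ErrorShape = {
  failures: { index: number; chunk: ChunkLike; error: unknown }[];
  chunkFiles: ChunkLike[];
};

type Summary = { total: number; failed: number; succeeded: number; percentDone: number };

// Counts chunks rather than segments, so the numbers stay in chunk units.
const summarize = (error: ErrorShape): Summary => {
  const total = error.chunkFiles.length;
  const failed = error.failures.length;
  const succeeded = total - failed;
  // Guard against division by zero when no chunks were created.
  const percentDone = total === 0 ? 0 : Math.round((succeeded / total) * 100);
  return { total, failed, succeeded, percentDone };
};
```

Because the helper only needs failures and chunkFiles, a caught TranscriptionError could be passed to it directly.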

Basic error handling

Catch the error and inspect what succeeded and what failed:

import { transcribe, TranscriptionError } from 'tafrigh';

try {
  const transcripts = await transcribe('large-file.mp3');
  console.log('Success:', transcripts);
} catch (error) {
  if (error instanceof TranscriptionError) {
    console.log(`Completed: ${error.transcripts.length} chunks`);
    console.log(`Failed: ${error.failures.length} chunks`);
    console.log(`Temp directory: ${error.outputDir}`);
  } else {
    // Unrelated errors should not be swallowed silently
    throw error;
  }
}

Resuming failed transcriptions

Use the resumeFailedTranscriptions function to retry only the failed chunks:

import { promises as fs } from 'node:fs';
import { 
  transcribe, 
  TranscriptionError, 
  resumeFailedTranscriptions 
} from 'tafrigh';

try {
  const transcripts = await transcribe('path/to/large-file.mp3');
  // All chunks completed successfully
  console.log('Transcription complete:', transcripts);
} catch (error) {
  if (error instanceof TranscriptionError) {
    // Retry only the failed chunks
    const { failures, transcripts } = await resumeFailedTranscriptions(error, {
      retries: 3,
      concurrency: 2,
    });
    
    if (failures.length === 0) {
      // Everything succeeded on retry
      console.log('All chunks transcribed:', transcripts);
    } else {
      // Some chunks still failed
      console.log(`Still failed: ${failures.length} chunks`);
    }
    
    // Clean up temporary directory
    if (error.outputDir) {
      await fs.rm(error.outputDir, { recursive: true });
    }
  }
}

How resumption works

The resumeFailedTranscriptions function works in four steps:

1. Extracts failed chunks: sorts failures by index and extracts the chunk metadata for each failed transcription.
2. Retries failed chunks: calls transcribeAudioChunks with only the failed chunks, using your specified retry and concurrency settings.
3. Merges results: combines the original successful transcripts with the newly transcribed chunks and sorts them by start time.
4. Returns combined results: returns both the merged transcripts and any remaining failures.

Here’s the implementation from src/transcriber.ts:206-224:

export const resumeFailedTranscriptions = async (
  error: Pick<TranscriptionError, 'failures' | 'transcripts'>,
  options?: ResumeOptions,
): Promise<TranscribeAudioChunksResult> => {
  const failedChunks = error.failures
    .slice()
    .sort((a, b) => a.index - b.index)
    .map((failure) => failure.chunk);

  const { failures, transcripts } = await transcribeAudioChunks(failedChunks, options);

  const combinedTranscripts = [...error.transcripts, ...transcripts];
  combinedTranscripts.sort((a: Segment, b: Segment) => a.start - b.start);

  return {
    failures,
    transcripts: combinedTranscripts,
  };
};
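
To make the merge step concrete, here is a small worked sketch that follows the same logic with a stand-in for transcribeAudioChunks. The fakeTranscribe and resumeSketch names are hypothetical, and the Segment and AudioChunk types are reduced to the fields this page uses (the end and text fields are assumptions); the real types may carry more.

```typescript
// Hypothetical minimal types; the library's own types may have more fields.
type Segment = { start: number; end: number; text: string };
type AudioChunk = { filename: string; range: { start: number; end: number } };
type Failure = { chunk: AudioChunk; index: number; error: unknown };

// Stand-in for transcribeAudioChunks: pretends every retried chunk succeeds.
const fakeTranscribe = async (chunks: AudioChunk[]) => ({
  failures: [] as Failure[],
  transcripts: chunks.map(
    (c): Segment => ({ start: c.range.start, end: c.range.end, text: `retried ${c.filename}` }),
  ),
});

const resumeSketch = async (error: { failures: Failure[]; transcripts: Segment[] }) => {
  // Copy before sorting so the caller's failures array is not reordered.
  const failedChunks = error.failures
    .slice()
    .sort((a, b) => a.index - b.index)
    .map((f) => f.chunk);

  const { failures, transcripts } = await fakeTranscribe(failedChunks);

  // Retried segments are appended, then everything is re-sorted by start time.
  const combined = [...error.transcripts, ...transcripts].sort((a, b) => a.start - b.start);
  return { failures, transcripts: combined };
};
```

Sorting by start time is what puts retried segments back in their original position between the segments that succeeded the first time.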

Resume options

Customize the retry behavior when resuming:

const result = await resumeFailedTranscriptions(error, {
  retries: 5,        // Number of retry attempts per chunk
  concurrency: 2,    // Parallel processing limit
  callbacks: {       // Progress tracking
    onTranscriptionProgress: (index) => {
      console.log(`Retrying chunk ${index}`);
    },
  },
});

Available options

retries: Number of retry attempts for each failed chunk (default: 5)
concurrency: Maximum parallel workers (limited by API keys)
callbacks: Progress and completion callbacks

Multiple retry attempts

If a retry still leaves failures, you can resume again with more conservative settings, such as lower concurrency and more retries per chunk:

import { promises as fs } from 'node:fs';
import { 
  transcribe, 
  TranscriptionError, 
  resumeFailedTranscriptions 
} from 'tafrigh';

let currentError;

try {
  const transcripts = await transcribe('audio.mp3', { concurrency: 5 });
  console.log('Success on first attempt:', transcripts);
} catch (error) {
  if (error instanceof TranscriptionError) {
    currentError = error;
    console.log(`First attempt: ${error.failures.length} failures`);
  }
}

// First retry: lower concurrency
if (currentError) {
  try {
    const result = await resumeFailedTranscriptions(currentError, {
      concurrency: 2,
      retries: 5,
    });
    
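    if (result.failures.length === 0) {
      console.log('Success on second attempt:', result.transcripts);
      // Clean up now: clearing currentError skips the cleanup at the end
      await fs.rm(currentError.outputDir, { recursive: true });
      currentError = null;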
    } else {
      console.log(`Second attempt: ${result.failures.length} failures`);
      // Update for next retry
      currentError.failures = result.failures;
      currentError.transcripts = result.transcripts;
    }
  } catch (err) {
    console.error('Second attempt failed:', err);
  }
}

// Second retry: single-threaded with more retries
if (currentError) {
  try {
    const result = await resumeFailedTranscriptions(currentError, {
      concurrency: 1,
      retries: 10,
    });
    
    if (result.failures.length === 0) {
      console.log('Success on third attempt:', result.transcripts);
    } else {
      console.log(`Third attempt: ${result.failures.length} failures`);
      console.log('Giving up after 3 attempts');
    }
  } catch (err) {
    console.error('Third attempt failed:', err);
  }
}

// Clean up
if (currentError?.outputDir) {
  await fs.rm(currentError.outputDir, { recursive: true });
}
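
The repeated blocks above can be folded into a loop over a list of strategies. The following is a minimal sketch, not part of tafrigh: resumeWithStrategies, ResumeState, and ResumeFn are hypothetical names, and the resume parameter stands in for resumeFailedTranscriptions (its exact types were not checked against the library).

```typescript
// Hypothetical option and state shapes, reduced to what this page describes.
type ResumeOptions = { concurrency?: number; retries?: number };
type ResumeState<F, T> = { failures: F[]; transcripts: T[] };
type ResumeFn<F, T> = (
  state: ResumeState<F, T>,
  options: ResumeOptions,
) => Promise<ResumeState<F, T>>;

// Tries each strategy in order and stops as soon as no failures remain.
const resumeWithStrategies = async <F, T>(
  initial: ResumeState<F, T>,
  strategies: ResumeOptions[],
  resume: ResumeFn<F, T>,
): Promise<ResumeState<F, T> & { attempts: number }> => {
  let state = initial;
  let attempts = 0;

  for (const options of strategies) {
    if (state.failures.length === 0) break;
    attempts++;
    // Each round feeds the previous round's result back in,
    // without mutating the original error object.
    state = await resume(state, options);
  }

  return { ...state, attempts };
};
```

A call might look like resumeWithStrategies(error, [{ concurrency: 2, retries: 5 }, { concurrency: 1, retries: 10 }], resumeFailedTranscriptions). Because the original error is never mutated, error.outputDir stays available for cleanup afterwards.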

Temporary directory management

When failures occur, the temporary directory containing the chunk files is preserved so you can resume without splitting the audio again.

You must delete this directory manually once you have finished retrying. It is not cleaned up automatically.

import { promises as fs } from 'node:fs';
import { transcribe, TranscriptionError, resumeFailedTranscriptions } from 'tafrigh';

try {
  const transcripts = await transcribe('audio.mp3');
} catch (error) {
  if (error instanceof TranscriptionError) {
    // Retry logic...
    const result = await resumeFailedTranscriptions(error);
    
    // Always clean up, even if some chunks still failed
    if (error.outputDir) {
      await fs.rm(error.outputDir, { recursive: true });
      console.log(`Cleaned up ${error.outputDir}`);
    }
  }
}

When directories are auto-deleted

Temporary directories are automatically deleted only when both of the following are true:

preventCleanup is false (default)
No failures occurred during transcription

In all other cases, you’re responsible for cleanup.
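
The rule can be read as a single boolean check. The helper below is hypothetical and only restates the two conditions listed above; it is not how tafrigh implements cleanup internally.

```typescript
// Hypothetical restatement of the auto-delete rule described above.
const isAutoDeleted = (preventCleanup: boolean, failureCount: number): boolean =>
  !preventCleanup && failureCount === 0;

// When this returns false, the caller is responsible for removing the directory.
const callerMustCleanUp = (preventCleanup: boolean, failureCount: number): boolean =>
  !isAutoDeleted(preventCleanup, failureCount);
```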

Inspecting failures

Examine detailed failure information to diagnose issues:

try {
  const transcripts = await transcribe('audio.mp3');
} catch (error) {
  if (error instanceof TranscriptionError) {
    error.failures.forEach((failure) => {
      console.log(`Chunk ${failure.index}:`);
      console.log(`  File: ${failure.chunk.filename}`);
      console.log(`  Range: ${failure.chunk.range.start}s - ${failure.chunk.range.end}s`);
      console.log(`  Error: ${failure.error}`);
    });
  }
}

Each FailedTranscription contains:

type FailedTranscription = {
  chunk: AudioChunk;    // Chunk file path and time range
  index: number;        // Original position in chunk array
  error: unknown;       // The error that caused the failure
};
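
When many chunks fail, grouping them by error message can show whether a single cause (for example, a rate limit) is behind most of them. This is a minimal sketch: groupByError is a hypothetical helper, and it relies only on the index and error fields shown above.

```typescript
// Only the fields this helper needs; a FailedTranscription also carries a chunk.
type FailureLike = { index: number; error: unknown };

// Maps each distinct error message to the chunk indices that hit it.
const groupByError = (failures: FailureLike[]): Map<string, number[]> => {
  const groups = new Map<string, number[]>();

  for (const { index, error } of failures) {
    // error is typed unknown, so normalise it to a string key first.
    const key = error instanceof Error ? error.message : String(error);
    const indices = groups.get(key) ?? [];
    indices.push(index);
    groups.set(key, indices);
  }

  return groups;
};
```

If one message dominates, that points at a systemic cause worth addressing before retrying, rather than at individual chunks.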

Partial results

Even when some chunks fail, you can still use the successfully transcribed portions:

try {
  const transcripts = await transcribe('audio.mp3');
  await saveTranscripts(transcripts);  // Save complete result
} catch (error) {
  if (error instanceof TranscriptionError) {
    // Save partial results immediately
    await saveTranscripts(error.transcripts);
    console.log(`Saved ${error.transcripts.length} partial segments`);
    
    // Try to get the rest
    const result = await resumeFailedTranscriptions(error);
    
    if (result.failures.length === 0) {
      // Update with complete results
      await saveTranscripts(result.transcripts);
      console.log('Saved complete transcription');
    }
  }
}
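
When you save partial results, it can help to record which parts of the audio are still missing. The sketch below assumes each failure exposes chunk.range.start and chunk.range.end in seconds, as in the Inspecting failures example; missingRanges is a hypothetical helper, not part of tafrigh.

```typescript
type Range = { start: number; end: number };
type FailureWithRange = { chunk: { range: Range } };

// Returns the time ranges not covered by the partial transcript,
// merging failed chunks that touch or overlap into one range.
const missingRanges = (failures: FailureWithRange[]): Range[] => {
  // Copy each range so merging does not mutate the failure objects.
  const sorted = failures
    .map((f) => ({ ...f.chunk.range }))
    .sort((a, b) => a.start - b.start);

  const merged: Range[] = [];
  for (const range of sorted) {
    const last = merged[merged.length - 1];
    if (last && range.start <= last.end) {
      last.end = Math.max(last.end, range.end);
    } else {
      merged.push(range);
    }
  }
  return merged;
};
```

Storing this list next to the partial transcript lets readers of the saved file see where the gaps are.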

Best practices

Always clean up temporary directories

Failing to delete temporary directories will fill up disk space over time.

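import { promises as fs } from 'node:fs';
import { transcribe, TranscriptionError, resumeFailedTranscriptions } from 'tafrigh';

try {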
  await transcribe('audio.mp3');
} catch (error) {
  if (error instanceof TranscriptionError) {
    try {
      await resumeFailedTranscriptions(error);
    } finally {
      // Clean up even if resume fails
      if (error.outputDir) {
        await fs.rm(error.outputDir, { recursive: true });
      }
    }
  }
}

Use exponential backoff between retry attempts

If the API is experiencing issues, waiting before each retry gives the service time to recover and improves the chance of success.

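// Assumes `error` is a TranscriptionError caught from transcribe()
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

let retryCount = 0;
let currentError: TranscriptionError | null = error;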

while (currentError && retryCount < 3) {
  const delay = Math.pow(2, retryCount) * 1000;  // 1s, 2s, 4s
  console.log(`Waiting ${delay}ms before retry ${retryCount + 1}`);
  await sleep(delay);
  
  const result = await resumeFailedTranscriptions(currentError);
  if (result.failures.length === 0) {
    currentError = null;
  } else {
    currentError.failures = result.failures;
    currentError.transcripts = result.transcripts;
  }
  retryCount++;
}
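
A common refinement, not shown in the loop above, is to cap the delay and add random jitter so that several clients do not retry at the same instant. The sketch below uses capped exponential backoff with "equal jitter" (half the delay fixed, half random). The backoffDelay helper and its default values are assumptions chosen for illustration, not tafrigh settings.

```typescript
// Capped exponential backoff with equal jitter.
// `random` is injectable so the result can be made deterministic in tests.
const backoffDelay = (
  attempt: number,
  baseMs = 1000,
  maxMs = 30000,
  random: () => number = Math.random,
): number => {
  // Doubles per attempt (1s, 2s, 4s, ...) but never exceeds maxMs.
  const exponential = Math.min(maxMs, baseMs * 2 ** attempt);
  // Keep half the delay fixed and randomise the other half.
  const half = exponential / 2;
  return Math.round(half + random() * half);
};
```

In the loop above, the fixed delay calculation could be replaced with backoffDelay(retryCount).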

Log failures for debugging

Save failure information to help diagnose recurring issues.

import { promises as fs } from 'node:fs';

if (error instanceof TranscriptionError) {
  const failureReport = {
    timestamp: new Date().toISOString(),
    totalChunks: error.chunkFiles.length,
    successCount: error.transcripts.length,
    failureCount: error.failures.length,
    failures: error.failures.map(f => ({
      index: f.index,
      file: f.chunk.filename,
      error: String(f.error),
    })),
  };
  
  await fs.writeFile(
    'transcription-failures.json',
    JSON.stringify(failureReport, null, 2)
  );
}

Next steps

Concurrency: Optimize parallel processing to reduce failures
Advanced configuration: Configure retry attempts and other options
Logging: Track failures with custom logging
Error handling: Complete error type reference
