Understanding preprocessing

Tafrigh automatically preprocesses audio files before transcription to improve accuracy. You can customize noise reduction, filtering, and dialogue enhancement to match your audio characteristics.

Default preprocessing

By default, Tafrigh applies noise reduction without any extra configuration:
import { init, transcribe } from 'tafrigh';

init({ apiKeys: ['your-wit-ai-key'] });

// Uses default preprocessing
const transcript = await transcribe('audio.mp3');
The default noise reduction includes:
  • High-pass filter at 300 Hz
  • Low-pass filter at 3000 Hz
  • FFT-based denoising, with the noise profile analyzed from 0 s to 1.5 s
  • Noise floor at -20 dB
  • Dialogue enhancement enabled
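
If you want to start from these defaults and change only one or two values, it can help to write them out explicitly. The sketch below is a minimal illustration, not part of Tafrigh: the `NoiseReduction` type, the `DEFAULT_NOISE_REDUCTION` constant, and the `withDefaults` helper are hypothetical names, and the values are copied from the list above.

```typescript
// Hypothetical local type mirroring the option names used on this page.
type NoiseReduction = {
  highpass: number | null;
  lowpass: number | null;
  afftdnStart: number | null;
  afftdnStop: number | null;
  afftdn_nf: number | null;
  dialogueEnhance: boolean;
};

// Values copied from the list of defaults above.
const DEFAULT_NOISE_REDUCTION: NoiseReduction = {
  highpass: 300,
  lowpass: 3000,
  afftdnStart: 0,
  afftdnStop: 1.5,
  afftdn_nf: -20,
  dialogueEnhance: true,
};

// Merge a partial override onto the documented defaults.
function withDefaults(overrides: Partial<NoiseReduction> = {}): NoiseReduction {
  return { ...DEFAULT_NOISE_REDUCTION, ...overrides };
}

// Example: keep every default except the high-pass cutoff and noise floor.
const custom = withDefaults({ highpass: 200, afftdn_nf: -25 });
console.log(custom);
```

The resulting object has the same shape as the `noiseReduction` value shown in the examples on this page, so it could be passed as `preprocessOptions.noiseReduction`.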

Custom noise reduction

Adjust noise reduction parameters to match your audio environment:
import { init, transcribe } from 'tafrigh';

init({ apiKeys: ['your-wit-ai-key'] });

const transcript = await transcribe('noisy-audio.mp3', {
  preprocessOptions: {
    noiseReduction: {
      // Isolate voice frequencies (remove low-frequency rumble)
      highpass: 200,
      // Remove high-frequency hiss
      lowpass: 3000,
      // Window (seconds) used to analyze the noise profile
      afftdnStart: 1,
      afftdnStop: 2,
      // Noise floor threshold (dB)
      afftdn_nf: -25,
      // Enhance dialogue clarity
      dialogueEnhance: true,
    },
  },
});

console.log(transcript);
Parameter guide:
  • highpass - Removes low frequencies below this value (Hz). Useful for eliminating rumble and bass noise.
  • lowpass - Removes high frequencies above this value (Hz). Useful for eliminating hiss and electronic noise.
  • afftdnStart - When to start analyzing the noise profile (seconds).
  • afftdnStop - When to stop analyzing the noise profile (seconds).
  • afftdn_nf - Noise floor in dB. Lower values mean more aggressive noise reduction.
  • dialogueEnhance - Boosts midrange frequencies where speech occurs.
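
Because the cutoffs and the analysis window come in pairs, a quick sanity check before calling `transcribe` can catch swapped values. The following is a sketch under stated assumptions: `checkNoiseReduction` and the `NoiseReductionInput` type are hypothetical helpers written for this page, and whether Tafrigh performs similar validation itself is not confirmed here.

```typescript
// Hypothetical shape; field names follow the parameter guide above.
type NoiseReductionInput = {
  highpass?: number | null;
  lowpass?: number | null;
  afftdnStart?: number | null;
  afftdnStop?: number | null;
  afftdn_nf?: number | null;
  dialogueEnhance?: boolean;
};

// Returns a list of human-readable problems; an empty list means none were found.
function checkNoiseReduction(nr: NoiseReductionInput): string[] {
  const problems: string[] = [];
  const { highpass, lowpass, afftdnStart, afftdnStop } = nr;

  // The pass band must be non-empty: the high-pass cutoff sits below the low-pass cutoff.
  if (highpass != null && lowpass != null && highpass >= lowpass) {
    problems.push(`highpass (${highpass} Hz) must be below lowpass (${lowpass} Hz)`);
  }
  // The noise-profile window cannot start before the beginning of the file.
  if (afftdnStart != null && afftdnStart < 0) {
    problems.push(`afftdnStart (${afftdnStart} s) must not be negative`);
  }
  // The noise-profile window must have positive length.
  if (afftdnStart != null && afftdnStop != null && afftdnStart >= afftdnStop) {
    problems.push(`afftdnStart (${afftdnStart} s) must be before afftdnStop (${afftdnStop} s)`);
  }
  return problems;
}

console.log(checkNoiseReduction({ highpass: 200, lowpass: 3000, afftdnStart: 1, afftdnStop: 2 })); // []
console.log(checkNoiseReduction({ highpass: 4000, lowpass: 3000 }).length); // 1
```

Filters set to `null` are skipped by the checks, so the helper also works with partially disabled configurations.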

Disabling noise reduction

For clean studio recordings, you may want to skip noise reduction:
import { init, transcribe } from 'tafrigh';

init({ apiKeys: ['your-wit-ai-key'] });

const transcript = await transcribe('studio-recording.mp3', {
  preprocessOptions: {
    noiseReduction: null,
  },
});

console.log(transcript);
Disabling noise reduction speeds up processing for high-quality audio that doesn’t need enhancement.

Selective filter usage

You can disable individual filters by setting them to null:
import { init, transcribe } from 'tafrigh';

init({ apiKeys: ['your-wit-ai-key'] });

const transcript = await transcribe('audio.mp3', {
  preprocessOptions: {
    noiseReduction: {
      // Only use high-pass filter, disable others
      highpass: 300,
      lowpass: null,
      afftdnStart: null,
      afftdnStop: null,
      afftdn_nf: null,
      dialogueEnhance: false,
    },
  },
});

console.log(transcript);
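
If you switch filters on and off often, the null pattern above can be wrapped in a small helper. This is only a sketch: `keepOnly`, `FilterName`, and the `NoiseReduction` type are hypothetical names, and treating the three `afftdn` fields as a single filter is an assumption based on how they are described on this page.

```typescript
// Hypothetical local type mirroring the option names used on this page.
type NoiseReduction = {
  highpass: number | null;
  lowpass: number | null;
  afftdnStart: number | null;
  afftdnStop: number | null;
  afftdn_nf: number | null;
  dialogueEnhance: boolean;
};

type FilterName = 'highpass' | 'lowpass' | 'afftdn' | 'dialogueEnhance';

// Keep the named filters and disable the rest (null for values, false for the flag).
function keepOnly(settings: NoiseReduction, keep: FilterName[]): NoiseReduction {
  const on = new Set<FilterName>(keep);
  return {
    highpass: on.has('highpass') ? settings.highpass : null,
    lowpass: on.has('lowpass') ? settings.lowpass : null,
    // Assumption: the three afftdn fields configure one filter, so they toggle together.
    afftdnStart: on.has('afftdn') ? settings.afftdnStart : null,
    afftdnStop: on.has('afftdn') ? settings.afftdnStop : null,
    afftdn_nf: on.has('afftdn') ? settings.afftdn_nf : null,
    dialogueEnhance: on.has('dialogueEnhance') && settings.dialogueEnhance,
  };
}

const full: NoiseReduction = {
  highpass: 300,
  lowpass: 3000,
  afftdnStart: 0,
  afftdnStop: 1.5,
  afftdn_nf: -20,
  dialogueEnhance: true,
};

// Produces the same object as the high-pass-only example above.
console.log(keepOnly(full, ['highpass']));
```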

Aggressive noise reduction

For very noisy environments such as street recordings or crowded spaces, lengthen the noise-profile window and lower the noise floor:
import { init, transcribe } from 'tafrigh';

init({ apiKeys: ['your-wit-ai-key'] });

const transcript = await transcribe('street-interview.mp3', {
  preprocessOptions: {
    noiseReduction: {
      highpass: 250,
      lowpass: 2500,
      afftdnStart: 0.5,
      afftdnStop: 3,
      afftdn_nf: -30, // More aggressive
      dialogueEnhance: true,
    },
  },
});

console.log(transcript);

Phone-call-quality audio

Optimize for telephone or VoIP recordings:
import { init, transcribe } from 'tafrigh';

init({ apiKeys: ['your-wit-ai-key'] });

const transcript = await transcribe('phone-call.mp3', {
  preprocessOptions: {
    noiseReduction: {
      // Telephone bandwidth is typically 300-3400 Hz
      highpass: 300,
      lowpass: 3400,
      afftdnStart: 0,
      afftdnStop: 1.5,
      afftdn_nf: -20,
      dialogueEnhance: true,
    },
  },
});

console.log(transcript);
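
When a project handles several kinds of recordings, the recipes above can be collected into a lookup table. The sketch below is illustrative only: the preset names, the `PRESETS` table, and `preprocessOptionsFor` are hypothetical, while the numeric values are copied from the studio, street, and phone examples on this page.

```typescript
// Hypothetical local type mirroring the option names used on this page.
type NoiseReduction = {
  highpass: number;
  lowpass: number;
  afftdnStart: number;
  afftdnStop: number;
  afftdn_nf: number;
  dialogueEnhance: boolean;
};

type Preset = 'studio' | 'street' | 'phone';

// null means "skip noise reduction", as in the studio-recording example.
const PRESETS: Record<Preset, NoiseReduction | null> = {
  studio: null,
  street: {
    highpass: 250,
    lowpass: 2500,
    afftdnStart: 0.5,
    afftdnStop: 3,
    afftdn_nf: -30,
    dialogueEnhance: true,
  },
  phone: {
    highpass: 300,
    lowpass: 3400,
    afftdnStart: 0,
    afftdnStop: 1.5,
    afftdn_nf: -20,
    dialogueEnhance: true,
  },
};

// Build the preprocessOptions value for a given kind of recording.
function preprocessOptionsFor(preset: Preset): { noiseReduction: NoiseReduction | null } {
  return { noiseReduction: PRESETS[preset] };
}

console.log(preprocessOptionsFor('phone'));
```

The returned object has the same shape as the `preprocessOptions` value in the examples, so a call could look like `transcribe(file, { preprocessOptions: preprocessOptionsFor('street') })`.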

Monitoring preprocessing progress

Track preprocessing progress with callbacks:
import { init, transcribe } from 'tafrigh';

init({ apiKeys: ['your-wit-ai-key'] });

const transcript = await transcribe('audio.mp3', {
  preprocessOptions: {
    noiseReduction: {
      dialogueEnhance: true,
      highpass: 200,
      lowpass: 3000,
    },
  },
  callbacks: {
    onPreprocessingStarted: async (filePath) => {
      console.log(`Starting preprocessing: ${filePath}`);
    },
    onPreprocessingProgress: async (percent) => {
      console.log(`Preprocessing progress: ${percent}%`);
    },
    onPreprocessingFinished: async (filePath) => {
      console.log(`Preprocessing complete: ${filePath}`);
    },
  },
});

console.log(transcript);

Expected output

Starting preprocessing: /tmp/tafrigh/1234567890.mp3
Preprocessing progress: 15%
Preprocessing progress: 35%
Preprocessing progress: 58%
Preprocessing progress: 82%
Preprocessing progress: 100%
Preprocessing complete: /tmp/tafrigh/1234567890.mp3
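
For long files, logging every progress event can be noisy. One option is to report only when progress crosses a fixed step. The sketch below is a hypothetical helper, not a Tafrigh API: `makeProgressLogger` returns an async function with the same `(percent)` signature as the `onPreprocessingProgress` callback shown above, and the `report` parameter is an assumption added so the output can be redirected.

```typescript
// Build a callback that reports at most once per `step` percent of progress.
function makeProgressLogger(
  step: number,
  report: (line: string) => void = console.log,
): (percent: number) => Promise<void> {
  if (step <= 0) {
    throw new Error('step must be greater than 0');
  }
  // Start below bucket 0 so the first event is always reported.
  let lastBucket = -1;
  return async (percent: number) => {
    const bucket = Math.floor(percent / step);
    if (bucket > lastBucket) {
      lastBucket = bucket;
      report(`Preprocessing progress: ${percent}%`);
    }
  };
}

// Replay the progress values from the expected output with a 50% step.
// The callback body contains no await, so each call completes synchronously.
const onProgress = makeProgressLogger(50);
for (const percent of [15, 35, 58, 82, 100]) {
  void onProgress(percent);
}
// Preprocessing progress: 15%
// Preprocessing progress: 58%
// Preprocessing progress: 100%
```

The returned function could be passed as `callbacks.onPreprocessingProgress` in place of the inline logger.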

Custom splitting options

Combine preprocessing with custom audio splitting for optimal results:
import { init, transcribe } from 'tafrigh';

init({ apiKeys: ['your-wit-ai-key'] });

const transcript = await transcribe('long-lecture.mp3', {
  preprocessOptions: {
    noiseReduction: {
      dialogueEnhance: true,
      highpass: 250,
      lowpass: 3000,
    },
  },
  splitOptions: {
    // 60-second chunks
    chunkDuration: 60,
    // Minimum chunk length (filter out very short segments)
    chunkMinThreshold: 4,
    silenceDetection: {
      // What volume level counts as silence
      silenceThreshold: -30,
      // How long silence must last to trigger a split
      silenceDuration: 0.5,
    },
  },
});

console.log(transcript);
Adjust silenceThreshold to match the background noise in your recording. A lower value (e.g., -40 dB) is stricter, treating only very quiet passages as silence, and suits quiet environments. A higher value (e.g., -20 dB) is more lenient and suits noisy recordings.
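
The guidance on silenceThreshold can be captured in a small lookup so the choice is made in one place. This is a sketch, not a Tafrigh feature: the `Environment` labels and `splitOptionsFor` are hypothetical, the -40 and -20 values come from the note above, and -30 is the value used in the example configuration.

```typescript
type Environment = 'quiet' | 'moderate' | 'noisy';

// Thresholds in dB: stricter (lower) for quiet rooms, more lenient (higher) for noisy ones.
const SILENCE_THRESHOLD_DB: Record<Environment, number> = {
  quiet: -40,
  moderate: -30,
  noisy: -20,
};

// Build splitOptions with the other values taken from the long-lecture example.
function splitOptionsFor(env: Environment) {
  return {
    chunkDuration: 60,
    chunkMinThreshold: 4,
    silenceDetection: {
      silenceThreshold: SILENCE_THRESHOLD_DB[env],
      silenceDuration: 0.5,
    },
  };
}

console.log(splitOptionsFor('noisy').silenceDetection.silenceThreshold); // -20
```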

Podcast optimization

Recommended settings for podcast transcription:
import { init, transcribe } from 'tafrigh';

init({ apiKeys: ['key1', 'key2', 'key3'] });

const transcript = await transcribe('podcast-episode.mp3', {
  concurrency: 3,
  preprocessOptions: {
    noiseReduction: {
      highpass: 100, // Keep some bass for voice warmth
      lowpass: 3500,
      afftdnStart: 0,
      afftdnStop: 1,
      afftdn_nf: -15, // Light noise reduction
      dialogueEnhance: true,
    },
  },
  splitOptions: {
    chunkDuration: 60,
    chunkMinThreshold: 2,
    silenceDetection: {
      silenceThreshold: -35,
      silenceDuration: 0.8, // Longer pauses between speakers
    },
  },
});

console.log(transcript);
For music podcasts or content with intentional audio effects, use lighter noise reduction settings or disable it entirely to preserve audio quality.
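
To get a feel for how `chunkDuration` and `concurrency` interact on a long episode, a back-of-the-envelope estimate can help. The helper below is a hypothetical sketch: it assumes every chunk is close to `chunkDuration` long and that chunks are processed in equal-sized rounds, whereas real splitting follows silences, so actual counts will differ.

```typescript
// Rough estimate of how many chunks an episode yields and how many
// rounds of parallel requests that implies.
function estimateWork(durationSeconds: number, chunkDuration: number, concurrency: number) {
  if (chunkDuration <= 0 || concurrency < 1) {
    throw new Error('chunkDuration must be > 0 and concurrency must be >= 1');
  }
  const chunks = Math.ceil(durationSeconds / chunkDuration);
  const rounds = Math.ceil(chunks / concurrency);
  return { chunks, rounds };
}

// A 45-minute episode with the podcast settings above (60 s chunks, concurrency 3).
console.log(estimateWork(45 * 60, 60, 3)); // { chunks: 45, rounds: 15 }
```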
