Tafrigh preprocesses audio before transcription to improve accuracy, especially for recordings with background noise or poor audio quality.

Noise reduction overview

By default, Tafrigh applies noise reduction and dialogue enhancement to improve transcription quality. All preprocessing is handled automatically using ffmpeg filters.
import { init, transcribe } from 'tafrigh';

init({ apiKeys: ['your-wit-ai-key'] });

// Default preprocessing is applied automatically
const transcript = await transcribe('noisy-audio.mp3');

Configuring noise reduction

You can customize noise reduction settings to match your audio characteristics:
const options = {
  preprocessOptions: {
    noiseReduction: {
      afftdnStart: 1,        // Start of the noise-learning window (seconds)
      afftdnStop: 1.5,       // End of the noise-learning window (seconds)
      afftdn_nf: -25,        // Noise floor in dB
      dialogueEnhance: true, // Enhance speech clarity
      highpass: 200,         // High-pass filter at 200 Hz
      lowpass: 3000,         // Low-pass filter at 3000 Hz
    },
  },
};

const transcript = await transcribe('audio.mp3', options);
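
The defaults quoted on this page can be collected in one place. The sketch below is an illustration only: the `NoiseReductionOptions` type and the `withDefaults` helper are hypothetical names, not Tafrigh exports, and it assumes that omitted fields fall back to the documented defaults while an explicit `null` disables a filter.

```typescript
// Hypothetical shape of the noiseReduction object, inferred from the options on this page.
type NoiseReductionOptions = {
  afftdnStart?: number | null;
  afftdnStop?: number | null;
  afftdn_nf?: number | null;
  dialogueEnhance?: boolean;
  highpass?: number | null;
  lowpass?: number | null;
};

// Default values as documented in the sections of this page.
const DEFAULT_NOISE_REDUCTION: Required<NoiseReductionOptions> = {
  afftdnStart: 0,
  afftdnStop: 1.5,
  afftdn_nf: -20,
  dialogueEnhance: true,
  highpass: 300,
  lowpass: 3000,
};

// Assumption: omitted keys use the defaults; an explicit null is kept (filter disabled).
function withDefaults(overrides: NoiseReductionOptions = {}): Required<NoiseReductionOptions> {
  return { ...DEFAULT_NOISE_REDUCTION, ...overrides };
}

// Override only what differs from the defaults.
const maleVoiceNoLowpass = withDefaults({ highpass: 200, lowpass: null });
```

Printing the merged object before passing it to `transcribe` is a quick way to see which values are actually in effect.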

FFT-based denoiser

The FFT denoiser learns the noise profile from a sample of your audio and removes it from the entire recording.

Noise learning window

The afftdnStart and afftdnStop parameters define the time window used to learn the noise profile:
const options = {
  preprocessOptions: {
    noiseReduction: {
      afftdnStart: 0,    // Start learning from beginning
      afftdnStop: 2,     // Learn for first 2 seconds
    },
  },
};
Default: afftdnStart: 0, afftdnStop: 1.5
Choose a section of your audio that contains only background noise (no speech) for best results. The first 1-2 seconds of a recording often work well.
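
For readers curious how a learning window can be expressed in ffmpeg itself: the ffmpeg documentation for the afftdn filter shows noise sampling being started and stopped with `asendcmd`. The sketch below builds such a filter chain from a start time, stop time, and noise floor. The helper name `noiseWindowFilters` is hypothetical, and whether Tafrigh emits exactly this chain internally is an assumption.

```typescript
// Hypothetical helper: builds ffmpeg audio filters for a noise-learning window.
// The "asendcmd=<time> afftdn sn start|stop" pattern follows the ffmpeg afftdn docs;
// it is an assumption that Tafrigh generates an equivalent chain.
function noiseWindowFilters(start: number, stop: number, noiseFloorDb: number): string[] {
  if (start < 0 || stop <= start) {
    throw new RangeError(`Invalid noise window: start=${start}, stop=${stop}`);
  }
  return [
    `asendcmd=${start} afftdn sn start`, // begin sampling the noise profile
    `asendcmd=${stop} afftdn sn stop`,   // stop sampling; the learned profile is used from here on
    `afftdn=nf=${noiseFloorDb}`,         // FFT denoiser with the given noise floor in dB
  ];
}

// Documented defaults: learn from 0 s to 1.5 s with a -20 dB noise floor.
const defaultWindow = noiseWindowFilters(0, 1.5, -20).join(',');
```

The range check is there because a window that ends before it starts would leave the denoiser with nothing to learn from.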

Noise floor adjustment

The afftdn_nf parameter controls the noise floor threshold in decibels:
const options = {
  preprocessOptions: {
    noiseReduction: {
      afftdn_nf: -30,  // More aggressive noise reduction
    },
  },
};
Default: -20 dB
Tuning guidelines:
  • Heavy background noise: Use -30 or lower
  • Light background noise: Use -20 or higher
  • Very quiet recordings: Use -15 or higher
Setting afftdn_nf too low may introduce artifacts or remove parts of the speech signal.
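
To make the guidelines above easier to apply, the following sketch maps the three situations to starting values and converts a decibel figure to a linear amplitude ratio for intuition. Both function names are hypothetical, and the conversion assumes the usual amplitude convention (ratio = 10^(dB/20)); treat the suggested values as starting points to adjust by ear.

```typescript
type NoiseLevel = 'heavy' | 'light' | 'veryQuiet';

// Starting values taken from the tuning guidelines above.
function suggestNoiseFloor(level: NoiseLevel): number {
  switch (level) {
    case 'heavy':
      return -30;
    case 'light':
      return -20;
    case 'veryQuiet':
      return -15;
  }
}

// Standard decibel-to-amplitude conversion: ratio = 10^(dB / 20).
function dbToAmplitude(db: number): number {
  return Math.pow(10, db / 20);
}

// A -20 dB floor corresponds to about 10% of full-scale amplitude.
const defaultFloorRatio = dbToAmplitude(suggestNoiseFloor('light'));
```

Under that convention, moving from -20 dB to -30 dB lowers the threshold from roughly 10% to roughly 3% of full-scale amplitude.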

Frequency filtering

Tafrigh uses high-pass and low-pass filters to isolate voice frequencies and remove unwanted noise.

High-pass filter

Removes low-frequency rumble and background noise below the range of human speech:
const options = {
  preprocessOptions: {
    noiseReduction: {
      highpass: 300,  // Filter out frequencies below 300 Hz
    },
  },
};
Default: 300 Hz
Common values:
  • Male voices: 200 Hz
  • Female voices: 300 Hz
  • High background rumble: 400 Hz
  • Disable: Set to null

Low-pass filter

Removes high-frequency noise above the range of human speech:
const options = {
  preprocessOptions: {
    noiseReduction: {
      lowpass: 3000,  // Filter out frequencies above 3000 Hz
    },
  },
};
Default: 3000 Hz
Common values:
  • Telephone quality: 3400 Hz
  • Broadcast quality: 3000 Hz
  • Full range: 8000 Hz or higher
  • Disable: Set to null
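
The two filters together define a pass band, so the high-pass cutoff should sit below the low-pass cutoff. The sketch below checks that and produces the corresponding ffmpeg filter strings. `highpass=f=<Hz>` and `lowpass=f=<Hz>` are standard ffmpeg audio filters; the helper name `bandFilters` is hypothetical, and it is an assumption that Tafrigh builds its filters the same way.

```typescript
// Hypothetical helper: turns highpass/lowpass settings into ffmpeg filter strings.
// null means the filter is skipped, mirroring "Disable: Set to null" above.
function bandFilters(highpass: number | null, lowpass: number | null): string[] {
  if (highpass !== null && lowpass !== null && highpass >= lowpass) {
    throw new RangeError(`highpass (${highpass} Hz) must be below lowpass (${lowpass} Hz)`);
  }
  const filters: string[] = [];
  if (highpass !== null) filters.push(`highpass=f=${highpass}`);
  if (lowpass !== null) filters.push(`lowpass=f=${lowpass}`);
  return filters;
}

// Telephone-style band: keep roughly 300 Hz to 3400 Hz.
const telephoneBand = bandFilters(300, 3400).join(',');
```

Swapped cutoffs would filter out nearly everything, which is why the helper rejects them instead of passing them along.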

Dialogue enhancement

Dialogue enhancement boosts midrange frequencies where human speech is most prominent:
const options = {
  preprocessOptions: {
    noiseReduction: {
      dialogueEnhance: true,
    },
  },
};
Default: true
This is especially useful for:
  • Recordings with background music
  • Multiple speakers at different volumes
  • Low-quality microphones
  • Compressed audio formats

Disabling noise reduction

For high-quality recordings or when preprocessing is unwanted, you can disable noise reduction entirely:
const options = {
  preprocessOptions: {
    noiseReduction: null,  // Skip noise reduction completely
  },
};

const transcript = await transcribe('studio-quality.wav', options);

Selectively disabling filters

You can disable individual filters while keeping others active:
const options = {
  preprocessOptions: {
    noiseReduction: {
      highpass: null,           // Disable high-pass filter
      lowpass: null,            // Disable low-pass filter
      afftdnStart: null,        // Disable FFT denoiser
      afftdnStop: null,
      dialogueEnhance: true,    // Keep dialogue enhancement
    },
  },
};
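
When several fields are set to null, it can be useful to confirm which filters remain active. The sketch below is a hypothetical helper, not part of Tafrigh: it assumes the FFT denoiser needs both ends of the learning window and that `noiseReduction: null` disables everything.

```typescript
// Hypothetical fully-resolved config (every field present, null = disabled).
type ResolvedNoiseReduction = {
  afftdnStart: number | null;
  afftdnStop: number | null;
  afftdn_nf: number | null;
  dialogueEnhance: boolean;
  highpass: number | null;
  lowpass: number | null;
};

// Lists the filters that would still run for a given configuration.
function activeFilters(config: ResolvedNoiseReduction | null): string[] {
  if (config === null) return []; // noiseReduction: null skips everything
  const active: string[] = [];
  if (config.highpass !== null) active.push('highpass');
  if (config.lowpass !== null) active.push('lowpass');
  // Assumption: the denoiser only runs when both window bounds are set.
  if (config.afftdnStart !== null && config.afftdnStop !== null) active.push('afftdn');
  if (config.dialogueEnhance) active.push('dialogueEnhance');
  return active;
}

// The configuration shown above leaves only dialogue enhancement enabled.
const remaining = activeFilters({
  afftdnStart: null,
  afftdnStop: null,
  afftdn_nf: -20,
  dialogueEnhance: true,
  highpass: null,
  lowpass: null,
});
```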

Preset configurations

For high-quality recordings, use minimal processing:
const options = {
  preprocessOptions: {
    noiseReduction: {
      afftdnStart: null,
      afftdnStop: null,
      afftdn_nf: null,
      highpass: null,
      lowpass: null,
      dialogueEnhance: false,
    },
  },
};

Monitoring preprocessing

Use callbacks to track preprocessing progress:
const options = {
  preprocessOptions: {
    noiseReduction: {
      afftdnStart: 1,
      afftdnStop: 1.5,
      afftdn_nf: -25,
      dialogueEnhance: true,
    },
  },
  callbacks: {
    onPreprocessingStarted: async (filePath) => {
      console.log(`Starting preprocessing: ${filePath}`);
    },
    onPreprocessingProgress: async (percent) => {
      console.log(`Preprocessing: ${percent}% complete`);
    },
    onPreprocessingFinished: async (filePath) => {
      console.log(`Preprocessing complete: ${filePath}`);
    },
  },
};

const transcript = await transcribe('audio.mp3', options);
See the Callbacks reference for more details.
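
Logging every progress update can be noisy for long files. The sketch below builds the three callbacks so that they record events in memory and only report progress every `step` percent. `createProgressRecorder` is a hypothetical helper; the callback names and arguments are the ones used above, and it assumes `percent` runs from 0 to 100.

```typescript
type PreprocessingCallbacks = {
  onPreprocessingStarted: (filePath: string) => Promise<void>;
  onPreprocessingProgress: (percent: number) => Promise<void>;
  onPreprocessingFinished: (filePath: string) => Promise<void>;
};

// Hypothetical helper: records events and throttles progress reports to every `step` percent.
function createProgressRecorder(step = 10): { callbacks: PreprocessingCallbacks; events: string[] } {
  const events: string[] = [];
  let lastReported = -Infinity;

  const callbacks: PreprocessingCallbacks = {
    onPreprocessingStarted: async (filePath) => {
      lastReported = -Infinity; // reset throttling for each new file
      events.push(`started ${filePath}`);
    },
    onPreprocessingProgress: async (percent) => {
      if (percent - lastReported >= step) {
        lastReported = percent;
        events.push(`progress ${Math.round(percent)}%`);
      }
    },
    onPreprocessingFinished: async (filePath) => {
      events.push(`finished ${filePath}`);
    },
  };

  return { callbacks, events };
}

// Pass recorder.callbacks as the `callbacks` option; inspect recorder.events afterwards.
const recorder = createProgressRecorder(25);
```

Keeping the events in an array rather than writing to the console makes the same callbacks usable in tests or a UI.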

Audio normalization

Tafrigh automatically normalizes audio chunks by:
  1. Adding silence padding at chunk boundaries
  2. Normalizing volume levels across chunks
  3. Applying configured noise reduction filters
This ensures consistent transcription quality even when the source audio has variable volume levels.
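
To illustrate what padding and volume normalization mean at the sample level, here is a conceptual sketch. It is not how Tafrigh implements these steps (the page states that preprocessing runs through ffmpeg), and it uses simple peak normalization as a stand-in, since the exact normalization method is not documented here. The function name and the 0.9 target peak are assumptions.

```typescript
// Conceptual sketch only: pads a chunk with silence and scales it to a target peak.
// Peak normalization is a stand-in technique; Tafrigh's actual method may differ.
function padAndNormalize(samples: Float32Array, padSamples: number, targetPeak = 0.9): Float32Array {
  // Find the loudest absolute sample value in the chunk.
  const peak = samples.reduce((max, s) => Math.max(max, Math.abs(s)), 0);
  // Leave pure silence untouched to avoid dividing by zero.
  const gain = peak > 0 ? targetPeak / peak : 1;

  // A new Float32Array is zero-filled, so the extra space on both sides is silence.
  const out = new Float32Array(samples.length + 2 * padSamples);
  out.set(samples.map((s) => s * gain), padSamples);
  return out;
}

// A quiet three-sample chunk, padded with two samples of silence on each side.
const normalized = padAndNormalize(new Float32Array([0.1, -0.45, 0.2]), 2);
```

After this step every chunk peaks at the same level, which is the property the paragraph above describes as consistent volume across chunks.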

Next steps

  • Advanced configuration: Fine-tune chunk duration and silence detection
  • Callbacks: Monitor all stages of the transcription pipeline
  • Concurrency: Speed up processing with parallel transcription
