Noise reduction and audio preprocessing

Tafrigh preprocesses audio before transcription to improve accuracy, especially for recordings with background noise or poor audio quality.

Noise reduction overview

By default, Tafrigh applies noise reduction and dialogue enhancement to improve transcription quality. All preprocessing is handled automatically using ffmpeg filters.

import { init, transcribe } from 'tafrigh';

init({ apiKeys: ['your-wit-ai-key'] });

// Default preprocessing is applied automatically
const transcript = await transcribe('noisy-audio.mp3');

Configuring noise reduction

You can customize noise reduction settings to match your audio characteristics:

const options = {
  preprocessOptions: {
    noiseReduction: {
      afftdnStart: 1,        // Start of the noise-learning window, in seconds
      afftdnStop: 1.5,       // End of the noise-learning window, in seconds
      afftdn_nf: -25,        // Noise floor in dB
      dialogueEnhance: true, // Enhance speech clarity
      highpass: 200,         // High-pass filter at 200 Hz
      lowpass: 3000,         // Low-pass filter at 3000 Hz
    },
  },
};

const transcript = await transcribe('audio.mp3', options);
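
To make the options above more concrete, here is a minimal sketch of how they could translate into an ffmpeg audio filter chain. The filters named here (highpass, lowpass, afftdn, asendcmd, dialoguenhance) are real ffmpeg filters, but the NoiseReductionOptions type, the buildFilterChain helper, and the filter order are assumptions made for illustration. Tafrigh's actual filter graph may differ.

```typescript
// Hypothetical option shape mirroring the keys used in this guide.
interface NoiseReductionOptions {
  afftdnStart?: number | null;
  afftdnStop?: number | null;
  afftdn_nf?: number | null;
  dialogueEnhance?: boolean;
  highpass?: number | null;
  lowpass?: number | null;
}

// Build a value for ffmpeg's `-af` flag. The order of filters is an
// assumption, not Tafrigh's documented behavior.
function buildFilterChain(o: NoiseReductionOptions): string {
  const filters: string[] = [];
  if (o.highpass != null) filters.push(`highpass=f=${o.highpass}`);
  if (o.lowpass != null) filters.push(`lowpass=f=${o.lowpass}`);
  if (o.afftdnStart != null && o.afftdnStop != null) {
    // asendcmd tells afftdn when to start and stop sampling the noise profile.
    filters.push(`asendcmd=${o.afftdnStart} afftdn sn start`);
    filters.push(`asendcmd=${o.afftdnStop} afftdn sn stop`);
    filters.push(o.afftdn_nf != null ? `afftdn=nf=${o.afftdn_nf}` : 'afftdn');
  }
  if (o.dialogueEnhance) filters.push('dialoguenhance');
  return filters.join(',');
}

const chain = buildFilterChain({
  afftdnStart: 1,
  afftdnStop: 1.5,
  afftdn_nf: -25,
  dialogueEnhance: true,
  highpass: 200,
  lowpass: 3000,
});
console.log(chain);
// highpass=f=200,lowpass=f=3000,asendcmd=1 afftdn sn start,asendcmd=1.5 afftdn sn stop,afftdn=nf=-25,dialoguenhance
```

Passing null (or omitting a key) simply leaves the corresponding filter out of the chain in this sketch.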

FFT-based denoiser

The FFT denoiser learns the noise profile from a sample of your audio and removes it from the entire recording.

Noise learning window

The afftdnStart and afftdnStop parameters define the time window used to learn the noise profile:

const options = {
  preprocessOptions: {
    noiseReduction: {
      afftdnStart: 0,    // Start learning from beginning
      afftdnStop: 2,     // Learn for first 2 seconds
    },
  },
};

Default: afftdnStart: 0, afftdnStop: 1.5

Choose a section of your audio that contains only background noise (no speech) for best results. The first 1-2 seconds of a recording often work well.
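
A quick sanity check on the learning window can catch mistakes before a long transcription run. The sketch below is a hypothetical helper, not part of Tafrigh; the checkLearningWindow name and its optional durationSec argument are assumptions.

```typescript
// Hypothetical helper: list problems with a noise-learning window.
// All values are in seconds; durationSec is the clip length, if known.
function checkLearningWindow(start: number, stop: number, durationSec?: number): string[] {
  const problems: string[] = [];
  if (start < 0) problems.push('afftdnStart must not be negative');
  if (stop <= start) problems.push('afftdnStop must be greater than afftdnStart');
  if (durationSec !== undefined && stop > durationSec) {
    problems.push('afftdnStop lies beyond the end of the recording');
  }
  return problems;
}

const defaultWindow = checkLearningWindow(0, 1.5);   // valid: no problems
const reversedWindow = checkLearningWindow(2, 1);    // stop is before start
const tooLongWindow = checkLearningWindow(0, 5, 3);  // window exceeds a 3 s clip
console.log(defaultWindow, reversedWindow, tooLongWindow);
```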

Noise floor adjustment

The afftdn_nf parameter controls the noise floor threshold in decibels:

const options = {
  preprocessOptions: {
    noiseReduction: {
      afftdn_nf: -30,  // More aggressive noise reduction
    },
  },
};

Default: -20 dB

Tuning guidelines:

Heavy background noise: Use -30 or lower
Light background noise: Use -20 or higher
Very quiet recordings: Use -15 or higher

Setting afftdn_nf too low may introduce artifacts or remove parts of the speech signal.
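
The tuning guidelines can be captured as a small lookup so that callers pick a starting value by description rather than by number. This is only a sketch: the NoiseLevel type and suggestNoiseFloor helper are hypothetical, and the values are the starting points listed above, to be adjusted by ear.

```typescript
type NoiseLevel = 'heavy' | 'light' | 'quiet';

// Starting values taken from the tuning guidelines in this section.
const NOISE_FLOOR_DB: Record<NoiseLevel, number> = {
  heavy: -30, // heavy background noise
  light: -20, // light background noise
  quiet: -15, // very quiet recordings
};

// Hypothetical helper: build an options object for the given noise level.
function suggestNoiseFloor(level: NoiseLevel) {
  return { preprocessOptions: { noiseReduction: { afftdn_nf: NOISE_FLOOR_DB[level] } } };
}

const heavyOptions = suggestNoiseFloor('heavy');
console.log(heavyOptions.preprocessOptions.noiseReduction.afftdn_nf); // -30
```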

Frequency filtering

Tafrigh uses high-pass and low-pass filters to isolate voice frequencies and remove unwanted noise.

High-pass filter

The high-pass filter removes low-frequency rumble and background noise below the range of human speech:

const options = {
  preprocessOptions: {
    noiseReduction: {
      highpass: 300,  // Filter out frequencies below 300 Hz
    },
  },
};

Default: 300 Hz

Common values:

Male voices: 200 Hz
Female voices: 300 Hz
High background rumble: 400 Hz
Disable: Set to null

Low-pass filter

The low-pass filter removes high-frequency noise above the range of human speech:

const options = {
  preprocessOptions: {
    noiseReduction: {
      lowpass: 3000,  // Filter out frequencies above 3000 Hz
    },
  },
};

Default: 3000 Hz

Common values:

Telephone quality: 3400 Hz
Broadcast quality: 3000 Hz
Full range: 8000 Hz or higher
Disable: Set to null
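
Together, the two filters leave a band of frequencies between highpass and lowpass. The sketch below computes the width of that band and rejects a pair where the high-pass cutoff is not below the low-pass cutoff. The passbandWidth helper is hypothetical and not part of Tafrigh.

```typescript
// Hypothetical helper: width in Hz of the band kept by a highpass/lowpass pair.
function passbandWidth(highpass: number, lowpass: number): number {
  if (highpass >= lowpass) {
    throw new RangeError(`highpass (${highpass} Hz) must be below lowpass (${lowpass} Hz)`);
  }
  return lowpass - highpass;
}

console.log(passbandWidth(300, 3000)); // 2700 (the documented defaults)
console.log(passbandWidth(300, 3400)); // 3100 (telephone-quality low-pass)
```

With the defaults, 2700 Hz of bandwidth reaches the transcriber; widening the band keeps more of the voice but also more noise.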

Dialogue enhancement

Dialogue enhancement boosts midrange frequencies where human speech is most prominent:

const options = {
  preprocessOptions: {
    noiseReduction: {
      dialogueEnhance: true,
    },
  },
};

Default: true

This is especially useful for:

Recordings with background music
Multiple speakers at different volumes
Low-quality microphones
Compressed audio formats

Disabling noise reduction

For high-quality recordings or when preprocessing is unwanted, you can disable noise reduction entirely:

const options = {
  preprocessOptions: {
    noiseReduction: null,  // Skip noise reduction completely
  },
};

const transcript = await transcribe('studio-quality.wav', options);

Selectively disabling filters

You can disable individual filters while keeping others active:

const options = {
  preprocessOptions: {
    noiseReduction: {
      highpass: null,           // Disable high-pass filter
      lowpass: null,            // Disable low-pass filter
      afftdnStart: null,        // Disable FFT denoiser
      afftdnStop: null,
      dialogueEnhance: true,    // Keep dialogue enhancement
    },
  },
};
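
The distinction between an omitted key and a null key matters here. The sketch below illustrates one plausible rule, in which an omitted key keeps the documented default and null disables that filter. This merge rule, the NoiseReductionOptions type, and the resolveOptions helper are assumptions for illustration; check Tafrigh's own behavior before relying on it.

```typescript
// Hypothetical option shape mirroring the keys used in this guide.
interface NoiseReductionOptions {
  afftdnStart?: number | null;
  afftdnStop?: number | null;
  afftdn_nf?: number | null;
  dialogueEnhance?: boolean;
  highpass?: number | null;
  lowpass?: number | null;
}

// Defaults as documented in this guide.
const DEFAULTS: Required<NoiseReductionOptions> = {
  afftdnStart: 0,
  afftdnStop: 1.5,
  afftdn_nf: -20,
  dialogueEnhance: true,
  highpass: 300,
  lowpass: 3000,
};

// undefined falls back to the default; null is kept and means "disabled".
const pick = <T>(value: T | undefined, fallback: T): T =>
  value === undefined ? fallback : value;

// Assumed merge rule, not confirmed Tafrigh behavior.
function resolveOptions(o: NoiseReductionOptions): Required<NoiseReductionOptions> {
  return {
    afftdnStart: pick(o.afftdnStart, DEFAULTS.afftdnStart),
    afftdnStop: pick(o.afftdnStop, DEFAULTS.afftdnStop),
    afftdn_nf: pick(o.afftdn_nf, DEFAULTS.afftdn_nf),
    dialogueEnhance: pick(o.dialogueEnhance, DEFAULTS.dialogueEnhance),
    highpass: pick(o.highpass, DEFAULTS.highpass),
    lowpass: pick(o.lowpass, DEFAULTS.lowpass),
  };
}

const resolved = resolveOptions({ highpass: null });
console.log(resolved.highpass, resolved.lowpass); // null 3000
```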

Preset configurations

The presets below cover four common scenarios: studio-quality recordings, podcasts, street interviews, and phone calls.

For studio-quality recordings, use minimal processing:

const options = {
  preprocessOptions: {
    noiseReduction: {
      afftdnStart: null,
      afftdnStop: null,
      afftdn_nf: null,
      highpass: null,
      lowpass: null,
      dialogueEnhance: false,
    },
  },
};

Balanced settings for podcasts with some background noise:

const options = {
  preprocessOptions: {
    noiseReduction: {
      afftdnStart: 1,
      afftdnStop: 1.5,
      afftdn_nf: -25,
      highpass: 200,
      lowpass: 3000,
      dialogueEnhance: true,
    },
  },
};

Aggressive noise reduction for very noisy environments, such as street interviews:

const options = {
  preprocessOptions: {
    noiseReduction: {
      afftdnStart: 0,
      afftdnStop: 2,
      afftdn_nf: -35,
      highpass: 400,
      lowpass: 2500,
      dialogueEnhance: true,
    },
  },
};

Optimized for telephone audio:

const options = {
  preprocessOptions: {
    noiseReduction: {
      afftdnStart: 0.5,
      afftdnStop: 1,
      afftdn_nf: -20,
      highpass: 300,
      lowpass: 3400,
      dialogueEnhance: true,
    },
  },
};
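
If you switch between these scenarios often, the presets can live in one typed map. The sketch below reuses the values from this section; the PresetName type, the PRESETS map, and the presetOptions helper are hypothetical names, not Tafrigh exports.

```typescript
// Hypothetical option shape mirroring the keys used in this guide.
interface NoiseReductionOptions {
  afftdnStart?: number | null;
  afftdnStop?: number | null;
  afftdn_nf?: number | null;
  dialogueEnhance?: boolean;
  highpass?: number | null;
  lowpass?: number | null;
}

type PresetName = 'studio' | 'podcast' | 'street' | 'phone';

// Values copied from the preset configurations in this section.
const PRESETS: Record<PresetName, NoiseReductionOptions> = {
  studio: { afftdnStart: null, afftdnStop: null, afftdn_nf: null, highpass: null, lowpass: null, dialogueEnhance: false },
  podcast: { afftdnStart: 1, afftdnStop: 1.5, afftdn_nf: -25, highpass: 200, lowpass: 3000, dialogueEnhance: true },
  street: { afftdnStart: 0, afftdnStop: 2, afftdn_nf: -35, highpass: 400, lowpass: 2500, dialogueEnhance: true },
  phone: { afftdnStart: 0.5, afftdnStop: 1, afftdn_nf: -20, highpass: 300, lowpass: 3400, dialogueEnhance: true },
};

// Build transcribe() options from a preset, with optional per-call tweaks.
function presetOptions(preset: PresetName, overrides: NoiseReductionOptions = {}) {
  return { preprocessOptions: { noiseReduction: { ...PRESETS[preset], ...overrides } } };
}

// Start from the street-interview preset but soften the noise floor.
const streetOptions = presetOptions('street', { afftdn_nf: -30 });
console.log(streetOptions.preprocessOptions.noiseReduction);
```

The resulting object has the same shape as the options passed to transcribe in the earlier examples.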

Monitoring preprocessing

Use callbacks to track preprocessing progress:

const options = {
  preprocessOptions: {
    noiseReduction: {
      afftdnStart: 1,
      afftdnStop: 1.5,
      afftdn_nf: -25,
      dialogueEnhance: true,
    },
  },
  callbacks: {
    onPreprocessingStarted: async (filePath) => {
      console.log(`Starting preprocessing: ${filePath}`);
    },
    onPreprocessingProgress: async (percent) => {
      console.log(`Preprocessing: ${percent}% complete`);
    },
    onPreprocessingFinished: async (filePath) => {
      console.log(`Preprocessing complete: ${filePath}`);
    },
  },
};

const transcript = await transcribe('audio.mp3', options);

See the Callbacks reference for more details.
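
Logging every progress update can be noisy on long files. The sketch below wraps a reporting function so that it fires only when progress crosses a new step boundary. It assumes the callback receives a percentage from 0 to 100, as in the example above; the makeProgressReporter helper is hypothetical.

```typescript
// Hypothetical helper: report progress only when it enters a new step bucket.
// The first call is always reported.
function makeProgressReporter(stepPercent: number, report: (percent: number) => void) {
  let lastStep = -1;
  return async (percent: number): Promise<void> => {
    const step = Math.floor(percent / stepPercent);
    if (step > lastStep) {
      lastStep = step;
      report(percent);
    }
  };
}

// Collect reports in an array here; in practice this could be console.log.
const reported: number[] = [];
const onPreprocessingProgress = makeProgressReporter(25, (p) => {
  reported.push(p);
});

// Simulated progress updates: only 3, 26, 55, and 100 are reported.
for (const percent of [3, 10, 26, 30, 55, 100]) {
  void onPreprocessingProgress(percent);
}
console.log(reported);
```

The returned function can be passed as onPreprocessingProgress in the callbacks object shown above.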

Audio normalization

Tafrigh automatically normalizes audio chunks by:

Adding silence padding at chunk boundaries
Normalizing volume levels across chunks
Applying configured noise reduction filters

This ensures consistent transcription quality even when the source audio has variable volume levels.

Next steps

Advanced configuration: Fine-tune chunk duration and silence detection
Callbacks: Monitor all stages of the transcription pipeline
Concurrency: Speed up processing with parallel transcription
