Configuration overview
The transcribe function accepts an options object with three main configuration areas:
Split options
Control how Tafrigh chunks your audio files for optimal transcription quality.
Chunk duration
The chunkDuration setting determines the maximum length of each audio chunk sent to the Wit.ai API.
Default: 60 seconds
Actual chunk length may be shorter than chunkDuration if Tafrigh detects that splitting at the exact duration would cut through a spoken word. The library automatically finds the nearest silence to split at.
Trade-offs:
- Shorter chunks: More granular timestamps, faster parallel processing, but more API requests
- Longer chunks: Fewer API requests, but less granular timestamps and longer processing time per chunk
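To make the request-count side of this trade-off concrete, the sketch below estimates how many chunks a recording produces at different chunkDuration values. The helper estimateMinRequests is hypothetical and not part of Tafrigh. Because chunks can end early at a detected silence, the real count may be higher than this estimate.

```typescript
// Hypothetical helper (not a Tafrigh API): lower bound on the number of
// chunks, and therefore API requests, for a recording of a given length.
function estimateMinRequests(audioSeconds: number, chunkDuration: number): number {
    if (audioSeconds < 0 || chunkDuration <= 0) {
        throw new RangeError('audioSeconds must be >= 0 and chunkDuration must be > 0');
    }
    // Each chunk is at most chunkDuration long, so at least this many are needed.
    return Math.ceil(audioSeconds / chunkDuration);
}

const oneHourInSeconds = 3600;
for (const chunkDuration of [30, 60, 120]) {
    const requests = estimateMinRequests(oneHourInSeconds, chunkDuration);
    console.log(`chunkDuration=${chunkDuration}s -> at least ${requests} requests`);
}
```

For a one-hour recording, halving chunkDuration from 60 to 30 seconds doubles the minimum number of requests from 60 to 120.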
Chunk minimum threshold
Filter out chunks that are too short to contain meaningful content.
Default: 0.9 seconds
This prevents very short audio segments (like brief pauses or noise) from being sent to the API.
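The filtering step can be pictured as a simple duration check. The following is a minimal sketch and not Tafrigh's actual implementation: the AudioChunk shape and the dropShortChunks helper are hypothetical, and whether the library treats a chunk of exactly the threshold length as valid is an assumption here.

```typescript
// Hypothetical chunk shape: start and end offsets in seconds.
interface AudioChunk {
    start: number;
    end: number;
}

// Hypothetical helper: keeps only chunks at least minThreshold seconds long.
// The 0.9 default mirrors the documented default; the >= comparison is assumed.
function dropShortChunks(chunks: AudioChunk[], minThreshold = 0.9): AudioChunk[] {
    return chunks.filter((chunk) => chunk.end - chunk.start >= minThreshold);
}

const chunks: AudioChunk[] = [
    { start: 0, end: 58 },
    { start: 58, end: 58.5 }, // 0.5 s of noise between two pauses
    { start: 58.5, end: 117 },
];

const kept = dropShortChunks(chunks);
console.log(kept.length); // 2
```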
Silence detection
Configure how Tafrigh identifies silence for intelligent chunk splitting.
Silence threshold
The silenceThreshold defines the volume level (in decibels) below which audio is considered silence.
Default: -25 dB
Tuning guidelines:
- Noisy background: Use a lower value (e.g., -35 dB) to only split on clear silence
- Quiet environment: Use a higher value (e.g., -20 dB) to detect subtle pauses
Silence duration
The silenceDuration specifies the minimum duration of silence required to trigger a split.
Default: 0.1 seconds
Tuning guidelines:
- Fast speakers with brief pauses: Use shorter duration (e.g., 0.1s)
- Speakers with longer natural pauses: Use longer duration (e.g., 0.5s or higher)
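To illustrate how silenceThreshold and silenceDuration interact, here is a simplified sketch that scans a list of per-frame volume levels and reports the stretches that would count as silence. It is not how Tafrigh detects silence internally; findSilences, the frame-based input, and the millisecond output are all assumptions made for the example.

```typescript
interface SilenceDetection {
    silenceThreshold: number; // dB; frames below this level count as silent
    silenceDuration: number; // seconds; minimum length of a silent run
}

interface SilenceSpan {
    startMs: number;
    endMs: number;
}

// Hypothetical helper: levelsDb holds one volume reading per frame of frameMs milliseconds.
function findSilences(levelsDb: number[], frameMs: number, options: SilenceDetection): SilenceSpan[] {
    const spans: SilenceSpan[] = [];
    let runStart = -1; // index of the first frame in the current silent run, or -1
    for (let i = 0; i <= levelsDb.length; i++) {
        const silent = i < levelsDb.length && levelsDb[i] < options.silenceThreshold;
        if (silent && runStart < 0) {
            runStart = i;
        } else if (!silent && runStart >= 0) {
            // The run ended; keep it only if it lasted long enough.
            if ((i - runStart) * frameMs >= options.silenceDuration * 1000) {
                spans.push({ startMs: runStart * frameMs, endMs: i * frameMs });
            }
            runStart = -1;
        }
    }
    return spans;
}

// One reading every 100 ms: a single quiet frame, then a 300 ms pause.
const levels = [-10, -30, -12, -40, -42, -41, -9];

const briefPauses = findSilences(levels, 100, { silenceThreshold: -25, silenceDuration: 0.1 });
const longPauses = findSilences(levels, 100, { silenceThreshold: -25, silenceDuration: 0.25 });
console.log(briefPauses.length, longPauses.length); // 2 1
```

With the documented defaults (-25 dB, 0.1 s), both quiet stretches qualify as split points. Raising silenceDuration to 0.25 s keeps only the longer pause.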
Retry configuration
Configure automatic retry behavior for failed API requests.
Default: 5 retries
Tafrigh uses exponential backoff for retries to handle transient network failures or API rate limits gracefully.
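For readers unfamiliar with exponential backoff, the sketch below shows the general pattern: each retry waits longer than the previous one by a constant factor. The helpers backoffDelays and withRetries are hypothetical, and the one-second base delay and factor of 2 are assumptions; the delays Tafrigh actually uses are not stated here.

```typescript
// Hypothetical helper: wait time before each retry, growing by `factor` each time.
function backoffDelays(retries: number, baseDelayMs = 1000, factor = 2): number[] {
    return Array.from({ length: retries }, (_, attempt) => baseDelayMs * Math.pow(factor, attempt));
}

// Hypothetical wrapper: runs `task` once, then retries up to `retries` times
// with an increasing delay before each retry.
async function withRetries<T>(task: () => Promise<T>, retries = 5, baseDelayMs = 1000): Promise<T> {
    const delays = backoffDelays(retries, baseDelayMs);
    let lastError: unknown;
    for (let attempt = 0; attempt <= retries; attempt++) {
        try {
            return await task();
        } catch (error) {
            lastError = error;
            if (attempt < retries) {
                await new Promise<void>((resolve) => setTimeout(resolve, delays[attempt]));
            }
        }
    }
    throw lastError; // every attempt failed
}

console.log(backoffDelays(5)); // [1000, 2000, 4000, 8000, 16000]
```

Under these assumed values, five retries spread over roughly half a minute, which gives a rate-limited API time to recover without failing the whole job on the first error.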
Prevent cleanup
By default, Tafrigh automatically deletes temporary directories created during processing. For debugging, you can preserve these files.
Default: false
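The defaults documented on this page can be collected into a single options object, as sketched below. Only chunkDuration, silenceThreshold, and silenceDuration are named in the text above. The other field names (splitOptions, chunkMinThreshold, silenceDetection, retries, preventCleanup) and the nesting are assumptions, so check them against the library's type definitions before relying on them.

```typescript
// Assumed shape of the options object; field names not mentioned in the
// text above are guesses based on the section headings.
interface TranscribeOptionsSketch {
    splitOptions: {
        chunkDuration: number; // seconds
        chunkMinThreshold: number; // seconds (assumed name)
        silenceDetection: {
            silenceThreshold: number; // dB
            silenceDuration: number; // seconds
        };
    };
    retries: number; // assumed name
    preventCleanup: boolean; // assumed name
}

// Values taken from the "Default" lines on this page.
const documentedDefaults: TranscribeOptionsSketch = {
    splitOptions: {
        chunkDuration: 60,
        chunkMinThreshold: 0.9,
        silenceDetection: { silenceThreshold: -25, silenceDuration: 0.1 },
    },
    retries: 5,
    preventCleanup: false,
};

// For a debugging run, keep the temporary directories for inspection.
const debugOptions: TranscribeOptionsSketch = { ...documentedDefaults, preventCleanup: true };

// The options object would then be passed to transcribe; the first argument
// shown here is an assumption:
// await transcribe('episode.mp3', debugOptions);
```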
Real-world example
A complete configuration optimized for a podcast with multiple speakers combines the options described above.
Configuration for different scenarios
Interview or lecture
Use longer chunks (90-120s) and longer silence duration (0.8-1.5s) to capture complete thoughts without interruption.
Fast-paced conversation
Use shorter chunks (30-45s) and brief silence detection (0.1-0.3s) to handle rapid exchanges.
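The two scenarios above could be captured as presets. The values below are illustrative picks from within the recommended ranges, not tuned recommendations, and the ScenarioPreset type is hypothetical. Only the two settings discussed in the scenarios are included.

```typescript
// Hypothetical preset shape covering the two settings the scenarios discuss.
interface ScenarioPreset {
    chunkDuration: number; // seconds
    silenceDuration: number; // seconds
}

const presets: Record<'interviewOrLecture' | 'fastPacedConversation', ScenarioPreset> = {
    // Longer chunks (90-120s) and longer pauses (0.8-1.5s) keep complete thoughts together.
    interviewOrLecture: { chunkDuration: 120, silenceDuration: 1.0 },
    // Shorter chunks (30-45s) and brief pauses (0.1-0.3s) suit rapid exchanges.
    fastPacedConversation: { chunkDuration: 30, silenceDuration: 0.2 },
};

console.log(presets.interviewOrLecture, presets.fastPacedConversation);
```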
Next steps
Noise reduction
Deep dive into audio preprocessing options
Concurrency
Optimize processing speed with parallel transcription
Callbacks
Monitor progress with callback functions
Resuming failures
Handle partial failures gracefully