Function Signature
Parameters
The path to the audio file to open. Supports any audio format that FFmpeg can decode.
The sample rate to resample the audio to, if necessary. Defaults to SAMPLE_RATE (16000 Hz).
Returns
A NumPy array containing the audio waveform in float32 dtype. Values are normalized to the range [-1.0, 1.0].
Example
Implementation Details
FFmpeg Dependency
This function requires the FFmpeg CLI to be available in your system PATH. It launches a subprocess that performs the following operations:
- Decodes the input audio file
- Down-mixes to mono (-ac 1)
- Resamples to the specified sample rate
- Outputs as 16-bit PCM (-f s16le)
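The steps above map onto a single FFmpeg command line. The following is a minimal sketch, not the function's confirmed implementation: only `-ac 1` and `-f s16le` are stated on this page, while `-nostdin`, `-acodec pcm_s16le`, `-ar`, and the trailing `-` (write to stdout) are standard FFmpeg options chosen here as assumptions. The helper name `build_ffmpeg_command` is hypothetical.

```python
from typing import List

SAMPLE_RATE = 16000  # default sample rate stated in this document


def build_ffmpeg_command(path: str, sr: int = SAMPLE_RATE) -> List[str]:
    """Hypothetical helper: assemble an FFmpeg argv that decodes `path`
    to mono, signed 16-bit little-endian PCM at `sr` Hz on stdout."""
    return [
        "ffmpeg",
        "-nostdin",              # assumption: do not read from the terminal
        "-i", path,              # decode the input audio file
        "-f", "s16le",           # raw 16-bit PCM container (from this page)
        "-ac", "1",              # down-mix to mono (from this page)
        "-acodec", "pcm_s16le",  # assumption: explicit PCM codec
        "-ar", str(sr),          # assumption: resample to the requested rate
        "-",                     # assumption: write the result to stdout
    ]
```

Passing this list to `subprocess.run(cmd, capture_output=True, check=True)` would leave the raw PCM bytes in the result's `stdout` attribute, provided FFmpeg is installed.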
Normalization
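The conversion covered in this section can be sketched with NumPy. This is a minimal illustration that assumes the decoder's raw bytes are already in memory; `pcm16_to_float32` is a hypothetical helper name, and the sample values are made up for the demonstration.

```python
import numpy as np


def pcm16_to_float32(raw: bytes) -> np.ndarray:
    """Hypothetical helper: interpret `raw` as signed 16-bit little-endian
    PCM samples and scale them to float32 by dividing by 32768.0."""
    samples = np.frombuffer(raw, dtype="<i2")  # "<i2" = little-endian int16
    return samples.astype(np.float32) / 32768.0


# Four illustrative samples: minimum, zero, half scale, maximum.
raw = np.array([-32768, 0, 16384, 32767], dtype="<i2").tobytes()
audio = pcm16_to_float32(raw)
```

The minimum sample lands exactly on -1.0, while the maximum sample (32767) lands just below 1.0 because the divisor is 32768 rather than 32767.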
The raw 16-bit PCM output is converted to float32 and normalized by dividing by 32768.0, mapping the integer range [-32768, 32767] to the float range [-1.0, 1.0). The maximum sample, 32767, maps to about 0.99997 rather than exactly 1.0.
Audio Constants
The default sample rate and other audio constants used in Whisper:
Error Handling
Raises RuntimeError if FFmpeg fails to decode the audio file. The error message includes FFmpeg's stderr output.
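One way to get this behavior is to catch the subprocess failure and re-raise it with the captured stderr. The sketch below assumes that approach; `run_decoder` is a hypothetical helper, and a small Python one-liner stands in for a failing FFmpeg call so the example runs without FFmpeg installed.

```python
import subprocess
import sys
from typing import List


def run_decoder(cmd: List[str]) -> bytes:
    """Hypothetical helper: run a decoder command and return its stdout bytes.
    A non-zero exit status is re-raised as RuntimeError carrying stderr."""
    try:
        return subprocess.run(cmd, capture_output=True, check=True).stdout
    except subprocess.CalledProcessError as e:
        detail = e.stderr.decode(errors="replace")
        raise RuntimeError(f"Failed to load audio: {detail}") from e


# Stand-in for a failing decoder: writes to stderr and exits with status 1.
failing = [sys.executable, "-c",
           "import sys; sys.stderr.write('bad input'); sys.exit(1)"]
try:
    run_decoder(failing)
except RuntimeError as err:
    message = str(err)  # contains the stand-in's stderr text
```

Note that in this sketch a missing executable surfaces as FileNotFoundError from `subprocess.run`, not as RuntimeError, so callers may want to check that FFmpeg is on the PATH first.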