Setup
Parameters
List of predictor model types to use. Supported values:
- `"utmos"`: UTMOS v1 model for speech quality
- `"utmosv2"`: UTMOS v2 model (requires installation via `tools/install_utmosv2.sh`)
- `"dnsmos"`: Deep Noise Suppression MOS
- `"plcmos"`: Packet Loss Concealment MOS
- `"singmos_v1"`: SingMOS v1 for singing voice
- `"singmos_pro"`: SingMOS Pro for singing voice
- `"dnsmos_pro_bvcc"`: DNS-MOS Pro BVCC variant
- `"dnsmos_pro_nisqa"`: DNS-MOS Pro NISQA variant
- `"dnsmos_pro_vcc2018"`: DNS-MOS Pro VCC2018 variant
Predictor-specific arguments. Example:
Directory to cache downloaded models
Whether to use GPU for computation
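Because the predictor types are plain strings, a typo would only surface once setup runs. The sketch below shows one way a caller might check the requested names against the supported values listed above before calling setup. The helper name `validate_predictor_types` is hypothetical and not part of the documented API.

```python
# Supported predictor identifiers, copied from the "Supported values" list.
SUPPORTED_PREDICTORS = frozenset({
    "utmos", "utmosv2", "dnsmos", "plcmos",
    "singmos_v1", "singmos_pro",
    "dnsmos_pro_bvcc", "dnsmos_pro_nisqa", "dnsmos_pro_vcc2018",
})


def validate_predictor_types(predictor_types):
    """Hypothetical helper: return the names as a list, or raise on unknown ones."""
    unknown = [name for name in predictor_types if name not in SUPPORTED_PREDICTORS]
    if unknown:
        raise ValueError(f"Unsupported predictor types: {unknown}")
    return list(predictor_types)
```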
Returns
Dictionary mapping predictor names to model objects
Dictionary mapping predictor names to their expected sampling rates
Metric Calculation
Parameters
Audio signal to evaluate (1D array)
Sampling rate of the input audio in Hz
Dictionary of predictor models from `pseudo_mos_setup()`
Dictionary of predictor sampling rates from `pseudo_mos_setup()`
Whether to use GPU for computation
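The input sampling rate and the per-predictor sampling rates are passed separately, which suggests that audio recorded at a different rate has to be resampled for each predictor. Whether the metric function does this internally is not stated here, so the following is only a sketch of how a caller might work out the integer up/down factors for each predictor. The helper names `resample_factors` and `plan_resampling` are hypothetical.

```python
from math import gcd


def resample_factors(input_fs, target_fs):
    """Return (up, down) integer factors that convert input_fs to target_fs."""
    if input_fs <= 0 or target_fs <= 0:
        raise ValueError("sampling rates must be positive integers")
    common = gcd(input_fs, target_fs)
    return target_fs // common, input_fs // common


def plan_resampling(input_fs, predictor_fs):
    """Map each predictor name to its (up, down) factors.

    A value of (1, 1) means the audio already matches that predictor's rate.
    """
    return {name: resample_factors(input_fs, fs) for name, fs in predictor_fs.items()}
```

The resulting factors could then be handed to a polyphase resampler such as `scipy.signal.resample_poly(x, up, down)`.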
Returns
Dictionary containing MOS scores from each predictor. Possible keys:
- `utmos`: UTMOS score (1-5 range)
- `utmosv2`: UTMOS v2 score
- `dns_overall`: DNS-MOS overall score
- `dns_p808`: DNS-MOS P.808 score
- `plcmos`: PLC-MOS score
- `singmos_v1`: SingMOS v1 score
- `singmos_pro`: SingMOS Pro score
- `dnsmos_pro_*`: DNS-MOS Pro variant scores
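Since the result is a flat dictionary keyed by score name, results from several utterances can be combined key by key. The sketch below averages each key over the utterances that reported it. The helper name `average_scores` is hypothetical, and the example values are made up for illustration.

```python
from collections import defaultdict


def average_scores(per_utterance_scores):
    """Average each score key over the utterances that reported it."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for scores in per_utterance_scores:
        for key, value in scores.items():
            totals[key] += float(value)
            counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Illustrative values only: the second utterance lacks a DNS-MOS score.
summary = average_scores([
    {"utmos": 3.0, "dns_overall": 2.0},
    {"utmos": 4.0},
])
```

Averaging per key rather than per utterance keeps the summary usable when different predictors were enabled for different files.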
Usage Example
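The two-step flow described above (set up the predictors once, then score each signal) can be sketched as follows. The import path, the name of the metric function, and the exact signatures are not given in this section, so the sketch takes both functions as arguments and assumes their positional order matches the order of the parameter descriptions. `score_with_predictors` and the `"model_cache"` directory are hypothetical placeholders.

```python
def score_with_predictors(audio, fs, setup_fn, metric_fn,
                          predictor_types=("utmos",), predictor_args=None,
                          cache_dir="model_cache", use_gpu=False):
    """Run setup once, then score a single 1D audio signal.

    setup_fn stands in for pseudo_mos_setup() and metric_fn for the metric
    function. Both are injected because their import path and argument
    order are assumptions in this sketch.
    """
    # Setup returns the model dictionary and the sampling-rate dictionary.
    predictor_dict, predictor_fs = setup_fn(
        list(predictor_types), predictor_args or {}, cache_dir, use_gpu
    )
    # The metric call takes the audio, its sampling rate, and both dictionaries.
    return metric_fn(audio, fs, predictor_dict, predictor_fs, use_gpu)
```

When scoring many files, call the setup function once and reuse its two dictionaries rather than repeating setup per file, since setup may download and load models.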
Installation Notes
- UTMOSv2: Requires installation via `tools/install_utmosv2.sh` and git lfs setup
- DNS-MOS/PLC-MOS: Requires `pip install speechmos onnxruntime`
- SingMOS: Automatically downloaded from torch.hub on first use
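Because some predictors depend on optional packages, it can help to check that those packages are importable before requesting the predictor. The sketch below uses the standard library's `importlib.util.find_spec` for this. The mapping and the helper name `missing_modules` are hypothetical, and the import names `speechmos` and `onnxruntime` are assumed to match the pip package names given above.

```python
from importlib.util import find_spec

# Assumed top-level import names for the optional packages named above.
OPTIONAL_MODULES = {
    "dnsmos": ("speechmos", "onnxruntime"),
    "plcmos": ("speechmos", "onnxruntime"),
}


def missing_modules(predictor_type, requirements=OPTIONAL_MODULES):
    """Return the required modules that cannot be found for this predictor."""
    return [
        module
        for module in requirements.get(predictor_type, ())
        if find_spec(module) is None
    ]
```

An empty list means nothing is known to be missing. Predictors without an entry in the mapping, such as `"utmos"`, always return an empty list.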
Model Details
| Predictor | Sampling Rate | Use Case | Range |
|---|---|---|---|
| UTMOS | 16 kHz | General speech quality | 1-5 |
| UTMOSv2 | 16 kHz | Enhanced speech quality | 1-5 |
| DNS-MOS | 16 kHz | Noise suppression quality | 1-5 |
| PLC-MOS | 16 kHz | Packet loss concealment | 1-5 |
| SingMOS v1 | 16 kHz | Singing voice quality | 1-5 |
| SingMOS Pro | 16 kHz | Professional singing quality | 1-5 |
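The table above can also be kept as data, for example to sanity-check scores before logging them. The sketch below copies the sampling rate and range columns into a dictionary and flags values outside the listed range. The names `MODEL_DETAILS` and `score_in_range` are hypothetical, and the source does not say whether the predictors clamp their outputs to 1-5.

```python
# Sampling rate (Hz) and score range per predictor, copied from the table.
MODEL_DETAILS = {
    "UTMOS": {"sampling_rate_hz": 16000, "score_range": (1.0, 5.0)},
    "UTMOSv2": {"sampling_rate_hz": 16000, "score_range": (1.0, 5.0)},
    "DNS-MOS": {"sampling_rate_hz": 16000, "score_range": (1.0, 5.0)},
    "PLC-MOS": {"sampling_rate_hz": 16000, "score_range": (1.0, 5.0)},
    "SingMOS v1": {"sampling_rate_hz": 16000, "score_range": (1.0, 5.0)},
    "SingMOS Pro": {"sampling_rate_hz": 16000, "score_range": (1.0, 5.0)},
}


def score_in_range(predictor, score):
    """Return True if the score lies within the range listed for the predictor."""
    low, high = MODEL_DETAILS[predictor]["score_range"]
    return low <= score <= high
```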