Pseudo-MOS metrics automatically estimate the Mean Opinion Score (MOS) of an audio signal for quality assessment. Supported predictor models include UTMOS, DNS-MOS, PLC-MOS, and SingMOS, along with their variants.

Setup

pseudo_mos_setup(
    predictor_types,
    predictor_args,
    cache_dir="versa_cache",
    use_gpu=False
)

Parameters

predictor_types
list[str]
required
List of predictor model types to use. Supported values:
  • "utmos" - UTMOS v1 model for speech quality
  • "utmosv2" - UTMOS v2 model (requires installation via tools/install_utmosv2.sh)
  • "dnsmos" - Deep Noise Suppression MOS
  • "plcmos" - Packet Loss Concealment MOS
  • "singmos_v1" - SingMOS v1 for singing voice
  • "singmos_pro" - SingMOS Pro for singing voice
  • "dnsmos_pro_bvcc" - DNS-MOS Pro BVCC variant
  • "dnsmos_pro_nisqa" - DNS-MOS Pro NISQA variant
  • "dnsmos_pro_vcc2018" - DNS-MOS Pro VCC2018 variant
predictor_args
dict
Predictor-specific arguments. Example:
{
    "dnsmos": {"fs": 16000},
    "plcmos": {"fs": 16000}
}
cache_dir
str
default:"versa_cache"
Directory to cache downloaded models
use_gpu
bool
default:False
Whether to use GPU for computation
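
Because predictor_types is matched against a fixed set of strings, a typo such as "utmos2" is easy to make. The following is a minimal sketch of a pre-flight check, not part of the library: check_predictor_config is a hypothetical helper, and treating arguments for an unselected predictor as a problem is an assumption about intent rather than documented behaviour.

```python
# Supported predictor names, copied from the list above.
SUPPORTED_PREDICTORS = {
    "utmos", "utmosv2", "dnsmos", "plcmos", "singmos_v1", "singmos_pro",
    "dnsmos_pro_bvcc", "dnsmos_pro_nisqa", "dnsmos_pro_vcc2018",
}


def check_predictor_config(predictor_types, predictor_args=None):
    """Hypothetical helper: return a list of problems (empty if none found)."""
    predictor_args = predictor_args or {}
    problems = []
    # Flag names that are not in the documented list of supported values.
    for name in predictor_types:
        if name not in SUPPORTED_PREDICTORS:
            problems.append(f"unsupported predictor type: {name!r}")
    # Assumption: arguments for a predictor that was not selected are a mistake.
    for name in predictor_args:
        if name not in predictor_types:
            problems.append(f"arguments given for unselected predictor: {name!r}")
    return problems
```

Running the check before pseudo_mos_setup() surfaces configuration mistakes before any model download starts.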

Returns

predictor_dict
dict
Dictionary mapping predictor names to model objects
predictor_fs
dict
Dictionary mapping predictor names to their expected sampling rates
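
The predictor_fs mapping can be inspected to see which predictors expect a rate different from that of your input. The sketch below uses a hypothetical helper and illustrative values; it makes no assumption about whether the library resamples internally.

```python
def predictors_needing_resample(input_fs, predictor_fs):
    """Hypothetical helper: names whose expected rate differs from input_fs."""
    return sorted(name for name, rate in predictor_fs.items() if rate != input_fs)


# Illustrative values only; real values come from pseudo_mos_setup().
example_predictor_fs = {"utmos": 16000, "dnsmos": 16000, "plcmos": 16000}

# A 44.1 kHz recording differs from every expected rate in this example.
mismatched = predictors_needing_resample(44100, example_predictor_fs)
print(mismatched)
```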

Metric Calculation

pseudo_mos_metric(
    pred,
    fs,
    predictor_dict,
    predictor_fs,
    use_gpu=False
)

Parameters

pred
numpy.ndarray
required
Audio signal to evaluate (1D array)
fs
int
required
Sampling rate of the input audio in Hz
predictor_dict
dict
required
Dictionary of predictor models from pseudo_mos_setup()
predictor_fs
dict
required
Dictionary of predictor sampling rates from pseudo_mos_setup()
use_gpu
bool
default:False
Whether to use GPU for computation
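
Since pred must be a 1D array, multi-channel recordings need to be reduced to one channel first. The following is a minimal sketch using NumPy only. to_mono_float32 is a hypothetical helper; the float32 conversion and the rule that the shorter axis holds the channels are assumptions, not library requirements.

```python
import numpy as np


def to_mono_float32(audio):
    """Hypothetical helper: return a 1D float32 array from 1D or 2D audio."""
    # Values are converted as-is; integer PCM is NOT rescaled to [-1, 1].
    audio = np.asarray(audio, dtype=np.float32)
    if audio.ndim == 1:
        return audio
    if audio.ndim == 2:
        # Assumption: the shorter axis is the channel axis.
        channel_axis = int(np.argmin(audio.shape))
        # Average the channels to obtain a mono signal.
        return audio.mean(axis=channel_axis)
    raise ValueError(f"expected 1D or 2D audio, got {audio.ndim} dimensions")
```

Averaging channels is one common way to downmix; selecting a single channel is an equally valid choice when the channels carry different content.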

Returns

scores
dict
Dictionary containing MOS scores from each predictor. Possible keys:
  • utmos - UTMOS score (1-5 range)
  • utmosv2 - UTMOS v2 score
  • dns_overall - DNS-MOS overall score
  • dns_p808 - DNS-MOS P.808 score
  • plcmos - PLC-MOS score
  • singmos_v1 - SingMOS v1 score
  • singmos_pro - SingMOS Pro score
  • dnsmos_pro_* - DNS-MOS Pro variant scores
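
When scoring many utterances, the per-utterance dictionaries are usually reduced to one average per key. The sketch below shows one way to do that with the standard library; average_scores is a hypothetical helper and the numbers are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def average_scores(per_utterance_scores):
    """Hypothetical helper: average each score key over a list of score dicts."""
    collected = defaultdict(list)
    for scores in per_utterance_scores:
        for key, value in scores.items():
            # float() also accepts NumPy scalars, should a predictor return them.
            collected[key].append(float(value))
    return {key: mean(values) for key, values in collected.items()}


# Made-up scores for two utterances, using key names from the list above.
example = [
    {"utmos": 3.0, "dns_overall": 2.0},
    {"utmos": 4.0, "dns_overall": 3.0},
]
print(average_scores(example))
```

Keys missing from some utterances are averaged over the utterances that do contain them.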

Usage Example

import numpy as np
from versa import pseudo_mos_setup, pseudo_mos_metric

# Setup predictors
predictor_dict, predictor_fs = pseudo_mos_setup(
    predictor_types=["utmos", "dnsmos", "plcmos"],
    predictor_args={
        "dnsmos": {"fs": 16000},
        "plcmos": {"fs": 16000}
    },
    use_gpu=True
)

# Placeholder audio: one second of random samples at 16 kHz.
# Replace with a real recording loaded as a 1D array.
audio = np.random.random(16000)
fs = 16000

# Calculate MOS scores
scores = pseudo_mos_metric(
    audio,
    fs=fs,
    predictor_dict=predictor_dict,
    predictor_fs=predictor_fs,
    use_gpu=True
)

print(f"UTMOS: {scores['utmos']:.2f}")
print(f"DNS Overall: {scores['dns_overall']:.2f}")
print(f"PLC-MOS: {scores['plcmos']:.2f}")

Installation Notes

  • UTMOSv2: Install via tools/install_utmosv2.sh; requires git lfs to be set up
  • DNS-MOS/PLC-MOS: Install with pip install speechmos onnxruntime
  • SingMOS: Downloaded automatically from torch.hub on first use
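
To check whether the optional dependencies are present before selecting a predictor, a small standard-library probe is enough. This is a sketch with a hypothetical helper; it assumes the import names match the pip package names (speechmos, onnxruntime).

```python
import importlib.util


def missing_modules(module_names):
    """Hypothetical helper: return the top-level modules that cannot be found."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Assumption: these import names match the pip package names.
print(missing_modules(["speechmos", "onnxruntime"]))
```

An empty list means both modules can be located; it does not confirm that they import without errors.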

Model Details

Predictor     Sampling Rate   Use Case                       Range
UTMOS         16 kHz          General speech quality         1-5
UTMOSv2       16 kHz          Enhanced speech quality        1-5
DNS-MOS       16 kHz          Noise suppression quality      1-5
PLC-MOS       16 kHz          Packet loss concealment        1-5
SingMOS v1    16 kHz          Singing voice quality          1-5
SingMOS Pro   16 kHz          Professional singing quality   1-5
