## Overview
The distributed evaluation system automatically:

- Splits your dataset into chunks
- Submits parallel Slurm jobs for each chunk
- Handles GPU and CPU resource allocation
- Aggregates results from all jobs
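The chunking step can be sketched with standard tools; the real launcher automates this, and the file names below are only illustrative:

```sh
# Toy prediction list: "<utterance-id> <audio-path>" per line
printf 'utt%s /data/utt%s.wav\n' 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 > pred_wav.scp

# Split into 4 chunks without breaking lines (GNU split)
mkdir -p results/experiment1/pred
split -n l/4 -d pred_wav.scp results/experiment1/pred/wav.scp.
```

Each chunk then becomes the input of one Slurm job, and partial results are aggregated once all jobs finish.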
## Quick Start

### Basic Usage
Run distributed evaluation with ground truth by providing both prediction and reference `wav.scp` files.

### Without Ground Truth

For reference-free evaluation, pass "None" as the ground-truth path.

### Launch Script Arguments
- Path to the prediction `wav.scp` file containing utterance IDs and audio paths.
- Path to the ground-truth `wav.scp` file; use "None" for reference-free evaluation.
- Directory to store all results, logs, and intermediate files.
- Number of chunks to split the data into; more chunks means more parallel jobs.
### Optional Flags

- Run only CPU jobs (skip GPU metrics).
- Run only GPU jobs (skip CPU metrics).
- Path to a text transcription file for WER/CER metrics.
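Putting the arguments and flags together, a launch might look like the following sketch; the script name `launch_distributed_eval.sh` and the flag spellings (`--cpu-only`, `--gpu-only`, `--text`) are assumptions, so substitute the actual names from your checkout:

```sh
# With ground truth: predictions, references, score directory, number of chunks
./launch_distributed_eval.sh pred_wav.scp gt_wav.scp results/experiment1 20

# Reference-free: pass "None" for the ground-truth file
./launch_distributed_eval.sh pred_wav.scp None results/experiment1 20

# Add transcriptions to enable WER/CER
./launch_distributed_eval.sh pred_wav.scp gt_wav.scp results/experiment1 20 --text text
```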
## Configuration

### Environment Variables
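A sketch of such overrides; `GPU_TYPE`, `CPUS`, and `MEM` are the variable names referenced in the troubleshooting sections below, and the values are only examples:

```sh
# Export before launching; unset variables use the launcher's defaults
export GPU_TYPE=a100   # GPU model to request
export CPUS=8          # CPUs per job
export MEM=4G          # memory per CPU
```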
Export these in your shell before launching to customize resource allocation.

## Complete Workflow
### Launch Jobs
Submit the distributed evaluation with the launch script. This creates:
- `results/experiment1/pred/` - split prediction files
- `results/experiment1/gt/` - split reference files
- `results/experiment1/logs/` - job logs
- `results/experiment1/result/` - partial results
- `results/experiment1/job_ids.txt` - job tracking
## Advanced Examples

### GPU-Only Large-Scale Evaluation

### Multi-Language with Transcriptions

### CPU-Only for Basic Metrics

### Kaldi/ESPnet Integration
## Directory Structure
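Based on the files created at launch time, the layout looks roughly like this (`experiment1` is the example score directory used earlier):

```
results/experiment1/
├── pred/         # split prediction files
├── gt/           # split reference files
├── logs/         # job logs
├── result/       # partial results
└── job_ids.txt   # job tracking
```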
After launching, your score directory contains the split inputs, logs, partial results, and the job-tracking file.

### Job Tracking
The `job_ids.txt` file tracks all submitted jobs.
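The recorded IDs make it easy to monitor or cancel the whole batch; this sketch assumes one Slurm job ID per line:

```sh
# Status of every submitted evaluation job
squeue -j "$(paste -sd, results/experiment1/job_ids.txt)"

# Cancel the entire evaluation
xargs -a results/experiment1/job_ids.txt scancel
```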
### Dependent Job Submission
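One way to chain a follow-up step is a Slurm dependency on every recorded job ID; `aggregate.sh` is a placeholder for whatever should run afterwards:

```sh
# afterok: start only if all listed jobs finish successfully
deps=$(paste -sd: results/experiment1/job_ids.txt)
sbatch --dependency="afterok:${deps}" aggregate.sh
```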
Dependent jobs start only after every evaluation job has completed.

## Troubleshooting

### Jobs failing immediately
Check the error logs in the `logs/` directory of your score path. Common issues:
- Incorrect partition names
- GPU type not available
- File paths not accessible from compute nodes
- Missing dependencies
### Jobs pending for a long time
Check resource availability with standard Slurm tools. Solutions:
- Reduce resource requirements (CPUS, MEM)
- Use different partition
- Split into more chunks with shorter time limits
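Standard Slurm commands show why jobs sit in the queue:

```sh
# Partition availability, time limits, node counts, and states
sinfo -o "%P %a %l %D %t"

# Pending jobs for the current user, with estimated start times
squeue -u "$USER" -t PENDING --start
```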
### Out of memory errors
Increase the memory allocation, or reduce the chunk size so each job processes fewer files.
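A sketch of both fixes; `MEM` matches the variable named under "Jobs pending", while the launch-script name and argument order are assumptions:

```sh
# Raise per-CPU memory for subsequent launches
export MEM=8G

# Or split into more chunks so each job loads less data
./launch_distributed_eval.sh pred_wav.scp gt_wav.scp results/experiment1 50
```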
### GPU out of memory
Solutions:

- Request more GPUs per job (modify `egs/run_gpu.sh`)
- Use a GPU with more memory: `export GPU_TYPE=a100`
- Reduce the batch size in config files
- Process in smaller chunks
## Performance Optimization

### Optimal Chunk Size
Balance parallelism and overhead:
- Small datasets (< 1000 files): 5-10 chunks
- Medium datasets (1k-10k): 20-50 chunks
- Large datasets (> 10k): 50-200 chunks
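The guidance above can be turned into a quick shell heuristic; the toy `pred_wav.scp` is generated inline for illustration:

```sh
# Generate a toy 2,500-line pred_wav.scp for illustration
seq 1 2500 | awk '{print "utt"$1, "/data/utt"$1".wav"}' > pred_wav.scp

# Pick a chunk count from the dataset size
n=$(wc -l < pred_wav.scp)
if   [ "$n" -lt 1000 ];  then chunks=10    # small dataset
elif [ "$n" -le 10000 ]; then chunks=50    # medium dataset
else                           chunks=100  # large dataset
fi
echo "$chunks"
```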
### CPU vs. GPU Split
Use both by default:
- GPU: Neural metrics (UTMOS, Speaker Similarity)
- CPU: Traditional metrics (PESQ, STOI, MCD)
- Both run in parallel for maximum efficiency
### Resource Allocation
Right-size your resources:
- GPU jobs: 8-16 CPUs, 4-8GB RAM per CPU
- CPU jobs: 8 CPUs, 2-4GB RAM per CPU
- Longer jobs need more conservative estimates
### I/O Optimization
Choose the right I/O method:

- `soundfile`: direct file access (simple)
- `kaldi`: efficient for large datasets
- `dir`: easiest for a directory of files
## Best Practices

## Next Steps
- Learn about visualization tools to analyze distributed results
- Explore CLI usage to understand individual job execution
- Check Python API for custom metric integration