
System Requirements

Python Version

Python 3.10, 3.11, 3.12, or 3.13

Operating System

Linux, macOS, or Windows

Hardware

CPU or CUDA-compatible GPU

Disk Space

1-30 GiB per model (depends on size)

System Dependencies

Omnilingual ASR is built on fairseq2, which requires libsndfile for audio support.
On macOS, install libsndfile using Homebrew:
brew install libsndfile
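To confirm that the system linker can locate libsndfile before installing the Python package, a small check along the following lines may help. It is a minimal sketch that relies only on the standard library's `ctypes.util.find_library`; the helper name `find_libsndfile` is made up for illustration. A negative result is not conclusive, since libraries installed under non-default prefixes are sometimes missed.

```python
from ctypes.util import find_library


def find_libsndfile():
    """Return the name or path the system linker reports for libsndfile, or None."""
    # find_library expects the bare name, without the "lib" prefix or a file extension.
    return find_library("sndfile")


if __name__ == "__main__":
    location = find_libsndfile()
    if location is None:
        print("libsndfile not found; install it with your system package manager")
    else:
        print(f"libsndfile found: {location}")
```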

Installing Omnilingual ASR

Install the package from PyPI using pip:
pip install omnilingual-asr

Installation Options

Basic Installation (inference only):
pip install omnilingual-asr
This installs the core dependencies:
  • fairseq2[arrow] - Core modeling framework
  • torch - Deep learning framework
  • torchaudio - Audio processing
  • pyarrow - Data handling
  • Other core dependencies (numba, pandas, numpy, kenlm, polars)
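After installation, it can be useful to see which of the packages listed above actually ended up in the environment. The sketch below uses `importlib.metadata` from the standard library; the list of distribution names is copied from the dependency list above and is assumed to match the names published on PyPI, and `installed_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the dependency list above (assumed to match PyPI names).
CORE_PACKAGES = [
    "omnilingual-asr", "fairseq2", "torch", "torchaudio", "pyarrow",
    "numba", "pandas", "numpy", "kenlm", "polars",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(CORE_PACKAGES).items():
        print(f"{name:16} {ver or 'NOT INSTALLED'}")
```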

Verifying Installation

Verify that Omnilingual ASR is installed correctly:
import omnilingual_asr
from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline

print(f"Omnilingual ASR version: {omnilingual_asr.__version__}")
print("Installation successful!")
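If the import above fails, the traceback can be long. A slightly more defensive variant, sketched below, reports the failure as a short message instead of raising. The helper `check_import` is hypothetical, and the sketch assumes nothing about the package beyond its module name; it falls back to a placeholder when a module does not define `__version__`.

```python
import importlib


def check_import(module_name):
    """Try to import a module; return (ok, detail) instead of raising."""
    try:
        module = importlib.import_module(module_name)
    except (ImportError, OSError) as exc:
        # OSError covers native libraries that fail to load at import time.
        return False, f"{type(exc).__name__}: {exc}"
    # Not every package defines __version__, so fall back to a placeholder.
    return True, str(getattr(module, "__version__", "unknown version"))


if __name__ == "__main__":
    ok, detail = check_import("omnilingual_asr")
    print(("Import OK, " if ok else "Import failed, ") + detail)
```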

GPU Support

For GPU acceleration, ensure you have PyTorch installed with CUDA support:
1. Check CUDA Availability

import torch
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")

2. Install PyTorch with CUDA

If CUDA is not available, install PyTorch with CUDA support:
# For CUDA 11.8
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118

# For CUDA 12.1
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu121
See the PyTorch installation guide for more options.
GPU acceleration is highly recommended for faster inference, especially for the larger models (3B, 7B).
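Scripts that should run on both GPU and CPU-only machines can choose a device at startup. The following is a minimal sketch: `pick_device` is a hypothetical helper, it uses only `torch.cuda.is_available()`, and it falls back to "cpu" when PyTorch itself is not importable. How the chosen device is passed to the inference pipeline is not covered here.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to accelerate; report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Selected device: {pick_device()}")
```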

Model Storage

Models are automatically downloaded on first use and stored in the fairseq2 asset cache:
~/.cache/fairseq2/assets/
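To see how much space downloaded models already occupy, the cache directory can be measured with the standard library. This sketch assumes only the two locations named on this page: the default path above and the `FAIRSEQ2_CACHE_DIR` override mentioned under Troubleshooting. Any other lookup rules fairseq2 may apply are not modeled, and both helper names are made up for illustration.

```python
import os
from pathlib import Path


def cache_dir():
    """Return the override from FAIRSEQ2_CACHE_DIR if set, else the default path."""
    override = os.environ.get("FAIRSEQ2_CACHE_DIR")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".cache" / "fairseq2" / "assets"


def dir_size_gib(path):
    """Sum the sizes of all regular files under path, in GiB (0.0 if path is absent)."""
    if not path.is_dir():
        return 0.0
    total_bytes = sum(f.stat().st_size for f in path.rglob("*") if f.is_file())
    return total_bytes / 2**30


if __name__ == "__main__":
    location = cache_dir()
    print(f"{location}: {dir_size_gib(location):.2f} GiB")
```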

Model Download Sizes

Model Family   300M      1B        3B         7B
W2V (SSL)      1.2 GiB   3.6 GiB   12.0 GiB   25.0 GiB
CTC            1.3 GiB   3.7 GiB   12.0 GiB   25.0 GiB
LLM            6.1 GiB   8.5 GiB   17.0 GiB   30.0 GiB
Ensure you have sufficient disk space before downloading larger models.
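The disk-space check can be scripted. The sketch below hard-codes the sizes from the table above and compares them against the free space reported by `shutil.disk_usage`. The function names and the 1 GiB headroom default are assumptions made for illustration, not part of Omnilingual ASR.

```python
import shutil

# Download sizes in GiB, copied from the table above.
MODEL_SIZES_GIB = {
    ("W2V", "300M"): 1.2, ("W2V", "1B"): 3.6, ("W2V", "3B"): 12.0, ("W2V", "7B"): 25.0,
    ("CTC", "300M"): 1.3, ("CTC", "1B"): 3.7, ("CTC", "3B"): 12.0, ("CTC", "7B"): 25.0,
    ("LLM", "300M"): 6.1, ("LLM", "1B"): 8.5, ("LLM", "3B"): 17.0, ("LLM", "7B"): 30.0,
}


def fits(free_gib, family, size, headroom_gib=1.0):
    """Return True if free_gib covers the model download plus some headroom."""
    return free_gib >= MODEL_SIZES_GIB[(family, size)] + headroom_gib


def free_space_gib(path="."):
    """Free space in GiB on the filesystem that contains path (path must exist)."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    free = free_space_gib()
    for (family, size) in MODEL_SIZES_GIB:
        verdict = "ok" if fits(free, family, size) else "not enough space"
        print(f"{family} {size}: {verdict}")
```

Pass the directory that will hold the cache to `free_space_gib` when it lives on a different disk than the current working directory.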

Virtual Environment Setup

We recommend using a virtual environment to avoid dependency conflicts:
# Create virtual environment
python -m venv omniasr-env

# Activate (Linux/macOS)
source omniasr-env/bin/activate

# Activate (Windows)
omniasr-env\Scripts\activate

# Install
pip install omnilingual-asr
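To confirm that the interpreter in use really belongs to the activated environment, a short check such as the following can be run before installing. It is a sketch based on the standard `sys.prefix` and `sys.base_prefix` attributes, which differ inside an environment created with `python -m venv`; conda environments are not detected this way. The helper name is hypothetical.

```python
import sys


def in_virtualenv():
    """Return True when running inside an environment created by venv or virtualenv."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print(f"Interpreter: {sys.executable}")
    print(f"Virtual environment active: {in_virtualenv()}")
```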

Installing from Source

For development or to use the latest features:
# Clone the repository
git clone https://github.com/facebookresearch/omnilingual-asr.git
cd omnilingual-asr

# Install in editable mode
pip install -e .

# Or with development dependencies
pip install -e ".[dev]"

Docker Installation

For containerized environments:
Dockerfile
FROM python:3.10-slim

# Install system dependencies
RUN apt-get update && apt-get install -y \
    libsndfile1 \
    && rm -rf /var/lib/apt/lists/*

# Install omnilingual-asr
RUN pip install omnilingual-asr

# Your application code
COPY . /app
WORKDIR /app

CMD ["python", "your_script.py"]
Build and Run
# Build the image
docker build -t omniasr-app .

# Run the container, mounting ./models so downloaded models persist between runs
docker run -v "$(pwd)/models:/root/.cache/fairseq2/assets" omniasr-app

Troubleshooting

Install the libsndfile system dependency:
  • macOS: brew install libsndfile
  • Linux: sudo apt-get install libsndfile1 (Ubuntu/Debian)
  • Windows: See fairseq2 Windows guide
Ensure the CUDA version PyTorch was built with is supported by your NVIDIA driver:
# Check the highest CUDA version the installed driver supports
nvidia-smi

# Check the CUDA version PyTorch was built with
python -c "import torch; print(torch.version.cuda)"
Reinstall PyTorch with a matching CUDA build if needed.
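Once both version strings are known, the comparison itself is simple. The sketch below assumes the common rule of thumb that the CUDA version reported by `nvidia-smi` should be at least the version PyTorch was built with. That rule is an assumption here, not something this guide states, and both function names are hypothetical.

```python
def parse_cuda_version(text):
    """Turn a string such as "12.1" into a (major, minor) tuple of ints."""
    parts = text.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def driver_supports(driver_cuda, torch_cuda):
    """Rule-of-thumb check: the driver's CUDA version is not older than PyTorch's."""
    return parse_cuda_version(driver_cuda) >= parse_cuda_version(torch_cuda)


if __name__ == "__main__":
    # Example values; substitute the versions printed by the commands above.
    print(driver_supports("12.4", "12.1"))
    print(driver_supports("11.8", "12.1"))
```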
If model downloads fail:
  1. Check your internet connection
  2. Verify you have write permissions to ~/.cache/fairseq2/assets/
  3. Try setting a custom cache directory with sufficient space:
    export FAIRSEQ2_CACHE_DIR=/path/to/custom/cache
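The write-permission check in step 2 can be automated. Because the cache directory may not exist before the first download, this sketch walks up to the nearest existing parent and tests that instead. It uses `os.access`, which reflects the permissions of the current user, and both helper names are made up for illustration.

```python
import os
from pathlib import Path


def nearest_existing(path):
    """Return path itself, or its closest ancestor that already exists."""
    path = Path(path).expanduser()
    while not path.exists():
        path = path.parent  # the filesystem root always exists, so this terminates
    return path


def can_write(path):
    """Return True if the current user can create entries at or above path."""
    return os.access(nearest_existing(path), os.W_OK | os.X_OK)


if __name__ == "__main__":
    target = os.environ.get("FAIRSEQ2_CACHE_DIR", "~/.cache/fairseq2/assets")
    print(f"{target}: {'writable' if can_write(target) else 'NOT writable'}")
```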
    
Omnilingual ASR requires Python 3.10-3.13. Check your version:
python --version
Install a supported Python version using your system package manager or pyenv.
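The same check can be performed from inside Python, which helps when several interpreters are installed and it is unclear which one a script runs under. This is a minimal sketch that encodes the supported range stated above; `is_supported` is a hypothetical helper.

```python
import sys

# Supported (major, minor) versions as listed under System Requirements.
SUPPORTED = {(3, 10), (3, 11), (3, 12), (3, 13)}


def is_supported(version_info=sys.version_info):
    """Return True if the given version tuple falls in the supported range."""
    return tuple(version_info[:2]) in SUPPORTED


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported() else "not supported"
    print(f"Python {current} is {status}")
```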

Next Steps

Quick Start

Start transcribing audio files in minutes

Supported Languages

Explore the 1600+ supported languages

Model Selection

Choose the right model for your use case

API Reference

Explore the full API documentation
