Installation

ONNX Runtime GenAI can be installed via package managers for Python and C#, or downloaded as binaries for C++. Choose the installation method that matches your development environment.

System Requirements

Before installing, ensure your system meets these requirements:
  • Operating System: Windows, Linux, or macOS
  • Architecture: x64, x86, or arm64
  • Python: 3.8 or later (for Python API)
  • .NET: .NET 8.0 or later (for C# API)
  • C++ Compiler: MSVC, GCC, or Clang (for C++ API)

Python Installation

1. Install NumPy

NumPy is required for tensor operations:
pip install numpy
2. Install ONNX Runtime GenAI

Install the latest stable release:
pip install onnxruntime-genai
Use the --pre flag to install pre-release versions:
pip install --pre onnxruntime-genai
3. Verify Installation

Verify the installation by checking the version:
pip list | grep onnxruntime-genai
Or in Python:
import onnxruntime_genai as og
print("ONNX Runtime GenAI installed successfully")
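If you prefer to check the installed version programmatically, the standard-library importlib.metadata module can query the wheel without importing it. A minimal sketch (the genai_version helper is illustrative, not part of the package):

```python
import importlib.metadata

def genai_version():
    """Return the installed onnxruntime-genai version string, or None if
    the package is not installed."""
    try:
        return importlib.metadata.version("onnxruntime-genai")
    except importlib.metadata.PackageNotFoundError:
        return None

print(genai_version() or "onnxruntime-genai is not installed")
```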

Nightly Builds (Python)

To install the latest nightly build with cutting-edge features:
pip install --index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/ onnxruntime-genai
Nightly builds are experimental and may be unstable. Use stable releases for production applications.

C# Installation

1. Add NuGet Package

Add the ONNX Runtime GenAI package to your project:
dotnet add package Microsoft.ML.OnnxRuntimeGenAI
2. Choose Execution Provider (Optional)

For GPU acceleration, use the appropriate package:
<PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI.Cuda" Version="0.12.0" />
3. Verify Installation

Add this to your C# code to verify:
using Microsoft.ML.OnnxRuntimeGenAI;

Console.WriteLine("ONNX Runtime GenAI installed successfully");
The C# package supports .NET 8.0, .NET Standard 2.0, and mobile platforms (Android, iOS, Mac Catalyst).

C++ Installation

1. Download Binaries

Download the pre-built binaries for your platform from the GitHub Releases page. Choose the appropriate package:
  • Windows: onnxruntime-genai-win-x64-{version}.zip
  • Linux: onnxruntime-genai-linux-x64-{version}.tar.gz
  • macOS: onnxruntime-genai-osx-{arch}-{version}.tar.gz
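The naming convention above can be sketched with a small helper that composes the archive name for the current platform. This is illustrative only (genai_package_name is not part of the project; the authoritative names are the ones listed on the Releases page):

```python
import platform
import sys

def genai_package_name(version: str) -> str:
    """Compose the release archive name for the current platform,
    following the naming scheme shown above."""
    machine = platform.machine().lower()
    arch = "arm64" if machine in ("arm64", "aarch64") else "x64"
    if sys.platform.startswith("win"):
        return f"onnxruntime-genai-win-{arch}-{version}.zip"
    if sys.platform == "darwin":
        return f"onnxruntime-genai-osx-{arch}-{version}.tar.gz"
    return f"onnxruntime-genai-linux-{arch}-{version}.tar.gz"

print(genai_package_name("0.12.0"))
```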
2. Extract and Configure

  1. Extract the archive to your desired location
  2. Add the library directory to your loader path (PATH on Windows, LD_LIBRARY_PATH on Linux, DYLD_LIBRARY_PATH on macOS)
  3. Link against the library in your CMake or Visual Studio project:
find_package(onnxruntime-genai REQUIRED)
target_link_libraries(your_app onnxruntime-genai)
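Putting the pieces together, a minimal CMakeLists.txt might look like the sketch below. It assumes CMake can locate the extracted package (for example via CMAKE_PREFIX_PATH) and that the imported target name matches the package name; check the files shipped in the archive if find_package fails.

```cmake
cmake_minimum_required(VERSION 3.16)
project(genai_demo CXX)

# Point CMake at the extracted archive, e.g.:
#   cmake -DCMAKE_PREFIX_PATH=/path/to/onnxruntime-genai ..
find_package(onnxruntime-genai REQUIRED)

add_executable(your_app main.cpp)
target_link_libraries(your_app onnxruntime-genai)
```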
3. Verify Installation

Create a simple test program:
#include <onnxruntime_genai.h>
#include <iostream>

int main() {
    std::cout << "ONNX Runtime GenAI installed successfully" << std::endl;
    return 0;
}

Build from Source (C++)

For advanced users who need custom builds:
python build.py
See the build from source guide for detailed instructions.

Platform-Specific Notes

Windows
  • Visual C++ Redistributable 2019 or later is required
  • For DirectML, ensure Windows 10 version 1903 or later
  • CUDA builds require the CUDA toolkit to be installed

Linux
  • GLIBC 2.27 or later is required
  • For CUDA support, install CUDA 11.8+ and cuDNN
  • Ubuntu 20.04+ and CentOS 8+ are officially supported

macOS
  • macOS 11.0 (Big Sur) or later
  • Both Intel (x64) and Apple Silicon (arm64) are supported
  • No GPU acceleration on macOS (CPU only)

Mobile
  • Android API level 27 or higher
  • Available through .NET MAUI or build from source
  • QNN execution provider for Qualcomm devices

Troubleshooting

If you encounter import errors, ensure:
  1. NumPy is installed: pip install numpy
  2. You’re using Python 3.8 or later
  3. The package matches your platform architecture (x64/arm64)
# Check Python version
python --version

# Reinstall with verbose output
pip install --force-reinstall -v onnxruntime-genai
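A quick way to confirm that the interpreter's Python version and CPU architecture match the wheel you installed (a standard-library sketch; nothing here is specific to onnxruntime-genai):

```python
import platform
import sys

# The wheel must match both the Python version (3.8 or later) and the
# CPU architecture of the interpreter itself, not just of the OS.
print("Python:", sys.version.split()[0])
print("Architecture:", platform.machine())  # e.g. x86_64, AMD64, arm64
print("Supported Python?", sys.version_info >= (3, 8))
```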
For C++ applications, ensure:
  • The library path is correctly set (LD_LIBRARY_PATH, PATH, or DYLD_LIBRARY_PATH)
  • All dependencies are installed (CUDA, cuDNN for GPU builds)
  • The architecture matches (x64 vs arm64)
If using examples from the repository, ensure they match your installed version:
# Get installed version
pip list | grep onnxruntime-genai

# Clone and checkout matching version
git clone https://github.com/microsoft/onnxruntime-genai.git
cd onnxruntime-genai
git checkout v0.12.0  # Replace with your version

Next Steps

Quickstart Guide

Now that you have ONNX Runtime GenAI installed, follow the quickstart guide to run your first model.