Download via Foundry Local
Foundry Local provides an easy way to download pre-optimized models for ONNX Runtime GenAI.

Install Foundry Local
Download and install foundry-local for your platform.
The Foundry Local CLI is not currently available on Linux. To run on Linux, download the model on a Windows or macOS machine and copy it over to your Linux machine.
Download via Hugging Face Hub
You can download ONNX models directly from Hugging Face using the Hugging Face CLI.

Download a Model
Use the huggingface-cli download command with the model name and subfolder. For example, you can fetch the Phi-4 mini instruct GPU model this way.

Build Your Own Model
Alternatively, you can build your own ONNX model locally using one of these tools:

Model Builder
Use the ONNX Runtime GenAI Model Builder to convert and optimize PyTorch models
Olive
Use Microsoft Olive for advanced model optimization and conversion
Next Steps
After downloading or building your model, you can:
- Follow the Quickstart to run your first inference
- Learn about Runtime Options to configure your model
- Explore Constrained Decoding for structured outputs