## Overview

ONNX (Open Neural Network Exchange) is an open format for machine learning models. Exporting to ONNX enables deployment across multiple frameworks and platforms, including TensorFlow, PyTorch, and specialized inference engines.
## Requirements

ONNX export requires additional dependencies:

```shell
pip install torch onnx onnxruntime
```

ONNX export uses PyTorch as an intermediate framework, so both the `torch` and `onnx` packages must be installed.
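Before calling the exporter it can be useful to fail fast when dependencies are missing. A minimal stdlib-only sketch (the `check_onnx_deps` helper is illustrative, not part of the deployment module):

```python
import importlib.util

def check_onnx_deps():
    """Return the list of ONNX-export dependencies that are not importable."""
    required = ["torch", "onnx", "onnxruntime"]
    return [pkg for pkg in required if importlib.util.find_spec(pkg) is None]

missing = check_onnx_deps()
if missing:
    print("Missing packages, run: pip install " + " ".join(missing))
```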
## Exporting to ONNX

Use the `export_onnx_from_pytorch` function to export your model:

```python
from deployment import export_onnx_from_pytorch

layer_sizes = [784, 64, 10]
activations = ["relu", "softmax"]

onnx_path = export_onnx_from_pytorch(
    layer_sizes=layer_sizes,
    activations=activations,
    output_path="exports/model.onnx",
)

print(f"Model exported to: {onnx_path}")
```
## Export Process

The export process:

1. Creates a PyTorch model with equivalent architecture
2. Initializes weights with the specified seed
3. Generates a dummy input tensor
4. Exports to ONNX format with dynamic batch sizes
## ONNX Configuration

The export uses the following ONNX settings:

| Setting | Value | Description |
|---------|-------|-------------|
| `opset_version` | `13` | ONNX operator set version |
| `input_names` | `["input"]` | Named input tensor |
| `output_names` | `["output"]` | Named output tensor |
| `dynamic_axes` | Batch dimension | Allows variable batch sizes |

```python
# Dynamic axes configuration
dynamic_axes = {
    "input": {0: "batch_size"},
    "output": {0: "batch_size"},
}
```
## CLI Export

Export models using the inference CLI:

```shell
python inference.py \
    --weights checkpoints/model.npz \
    --export-onnx
```

This runs inference and exports the ONNX model to `exports/model.onnx`.
## Validating ONNX Export

Validate that the exported model produces correct outputs:

```python
from deployment import validate_onnx_export

layer_sizes = [784, 64, 10]
activations = ["relu", "softmax"]

is_valid, max_diff = validate_onnx_export(
    layer_sizes=layer_sizes,
    activations=activations,
    onnx_path="exports/model.onnx",
    seed=42,
)

print(f"Valid: {is_valid}")
print(f"Max absolute difference: {max_diff}")
```
### Validation Process

Validation compares outputs between:

- **PyTorch model**: the reference implementation
- **ONNX Runtime**: the exported model

The validation:

1. Generates random test inputs (3 samples)
2. Runs inference on both models
3. Compares outputs using `np.allclose` with the following tolerances:
   - Absolute tolerance: `1e-5`
   - Relative tolerance: `1e-4`

A maximum absolute difference below `1e-4` indicates a successful export.
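The tolerance check is easy to reproduce with plain NumPy. The arrays below are made up, but the `atol`/`rtol` values match the ones listed above:

```python
import numpy as np

pytorch_out = np.array([[0.10, 0.70, 0.20]])
onnx_out = pytorch_out + 3e-6  # simulate a tiny cross-framework discrepancy

# Same tolerances as the validation: atol=1e-5, rtol=1e-4
is_valid = np.allclose(pytorch_out, onnx_out, atol=1e-5, rtol=1e-4)
max_diff = float(np.max(np.abs(pytorch_out - onnx_out)))

print(is_valid)         # True: a 3e-6 difference is within atol=1e-5
print(max_diff < 1e-4)  # True: below the success threshold
```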
## Running ONNX Inference

Use ONNX Runtime for production inference:

```python
import numpy as np
import onnxruntime as ort

# Load the ONNX model
session = ort.InferenceSession(
    "exports/model.onnx",
    providers=["CPUExecutionProvider"],
)

# Prepare input
X = np.random.randn(32, 784).astype(np.float32)

# Run inference
input_name = session.get_inputs()[0].name
outputs = session.run(None, {input_name: X})
predictions = outputs[0]

print(predictions.shape)  # (32, 10)
```
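For a classifier, the probability matrix is usually reduced to class labels. A standalone NumPy sketch with a made-up 2-sample output (the real `predictions` array above has shape `(32, 10)`):

```python
import numpy as np

# Stand-in for the (batch, classes) probabilities returned by the session
predictions = np.array([[0.1, 0.8, 0.1],
                        [0.6, 0.3, 0.1]], dtype=np.float32)

labels = np.argmax(predictions, axis=1)   # most probable class per sample
confidence = np.max(predictions, axis=1)  # probability of that class

print(labels.tolist())  # [1, 0]
```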
### Execution Providers

ONNX Runtime supports multiple execution providers:

- **CPUExecutionProvider**: CPU inference (default)
- **CUDAExecutionProvider**: NVIDIA GPU acceleration
- **TensorrtExecutionProvider**: NVIDIA TensorRT optimization
- **OpenVINOExecutionProvider**: Intel hardware optimization

```python
# Use the GPU if available, falling back to CPU
session = ort.InferenceSession(
    "exports/model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
```
## Architecture Support

The ONNX exporter supports all standard architectures:

### Layer Types

- Fully connected (Linear) layers
- All layer sizes

### Activations

- ReLU
- Sigmoid
- Softmax
- Linear (identity)
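For reference, the four supported activations correspond to these simple NumPy definitions (a sketch, not the exporter's code):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # Subtract the row max for numerical stability before exponentiating
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)

def linear(x):
    return x  # identity: used when a layer has no nonlinearity

x = np.array([[-1.0, 0.0, 2.0]])
print(relu(x))  # [[0. 0. 2.]]
```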
### Example Architectures

```python
# Simple classifier
layer_sizes = [784, 64, 10]
activations = ["relu", "softmax"]

# Deep network
layer_sizes = [1024, 512, 256, 128, 10]
activations = ["relu", "relu", "relu", "relu", "softmax"]

# Binary classifier
layer_sizes = [100, 50, 1]
activations = ["relu", "sigmoid"]
```
## Troubleshooting

### PyTorch Not Installed

```
RuntimeError: Cannot export ONNX because torch is not installed
```

**Solution**: Install PyTorch with `pip install torch`.

### ONNX Package Missing

```
RuntimeError: Cannot export ONNX because onnx is not installed
```

**Solution**: Install the ONNX package with `pip install onnx`.
### Validation Failures

If validation shows large differences:

1. Check that the seed is consistent
2. Verify that the architecture matches
3. Ensure numerical stability (avoid extreme values)

Small numerical differences (< `1e-4`) are expected due to different implementations of operations across frameworks.
## Production Deployment

### Model Serving

Deploy ONNX models using popular serving frameworks:

```python
# Example: FastAPI serving
from fastapi import FastAPI
import onnxruntime as ort
import numpy as np

app = FastAPI()
session = ort.InferenceSession("model.onnx")

@app.post("/predict")
def predict(data: list):
    x = np.array(data, dtype=np.float32)
    input_name = session.get_inputs()[0].name
    outputs = session.run(None, {input_name: x})
    return {"predictions": outputs[0].tolist()}
```
### Optimization

Optimize ONNX models for production:

```python
from onnxruntime.transformers import optimizer

# Load and optimize (optimize_model takes a path to the ONNX file)
optimized = optimizer.optimize_model("model.onnx")
optimized.save_model_to_file("model_optimized.onnx")
```
## Next Steps

- **Inference Guide**: Learn about inference and checkpoint loading
- **PyTorch Comparison**: Benchmark against PyTorch implementations