Export your trained RF-DETR model to ONNX format for deployment with inference frameworks such as ONNX Runtime and TensorRT.

Installation

Install the ONNX export dependencies:
pip install "rfdetr[onnx]"

Basic export

from rfdetr import RFDETRMedium

model = RFDETRMedium(pretrain_weights="<path/to/checkpoint.pth>")

model.export()
The model is saved to the output directory by default.

Export parameters

  • output_dir (string, default "output"): Directory where the exported ONNX model will be saved.
  • infer_dir (string, default None): Path to an image file to use for tracing. If not provided, a random dummy image is generated.
  • simplify (boolean, default False, deprecated): Deprecated and ignored. ONNX simplification is no longer run by export().
  • backbone_only (boolean, default False): Export only the backbone feature extractor instead of the full model.
  • opset_version (integer, default 17): ONNX opset version to use for export. Higher versions support more operations.
  • verbose (boolean, default True): Whether to print verbose export information.
  • force (boolean, default False, deprecated): Deprecated and ignored.
  • shape (tuple, default None): Input shape as a (height, width) tuple. Both dimensions must be divisible by patch_size × num_windows (varies by model variant; typically 14, 16, 24, or 32). If not provided, uses the model's default square resolution.
  • batch_size (integer, default 1): Static batch size to bake into the exported ONNX graph.
  • dynamic_batch (boolean, default False): When True, exports with a dynamic batch dimension so the ONNX model accepts variable batch sizes at runtime instead of a fixed static size.
  • patch_size (integer, default None): Backbone patch size used for shape-divisibility validation. Defaults to the model's configured patch_size. When provided, must match the instantiated model's patch size exactly.

Output files

After export, you will find one of the following files in your output directory:
  • inference_model.onnx — the exported ONNX model
  • backbone_model.onnx — exported instead when backbone_only=True

Advanced export examples

Custom output directory

from rfdetr import RFDETRMedium

model = RFDETRMedium(pretrain_weights="<path/to/checkpoint.pth>")

model.export(output_dir="exports/my_model")

Custom input resolution

Export with a specific input resolution. Both dimensions must be divisible by patch_size × num_windows for the target model (check the model’s config for the exact value).
from rfdetr import RFDETRMedium

model = RFDETRMedium(pretrain_weights="<path/to/checkpoint.pth>")

model.export(shape=(560, 560))
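The divisibility rule can be checked before calling export(). A minimal sketch (validate_shape is a hypothetical helper, not part of the rfdetr API; the patch_size and num_windows values are illustrative and should come from your model's config):

```python
def validate_shape(shape, patch_size, num_windows):
    """Check that both dimensions are divisible by patch_size * num_windows."""
    divisor = patch_size * num_windows
    for dim in shape:
        if dim % divisor != 0:
            raise ValueError(
                f"dimension {dim} is not divisible by "
                f"patch_size * num_windows = {divisor}"
            )
    return shape

# e.g. a backbone with patch_size=14 and num_windows=4 requires multiples of 56
print(validate_shape((560, 560), patch_size=14, num_windows=4))
```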

Backbone only

Export only the backbone feature extractor for use in custom pipelines:
from rfdetr import RFDETRMedium

model = RFDETRMedium(pretrain_weights="<path/to/checkpoint.pth>")

model.export(backbone_only=True)

Convert to TensorRT

If you want lower latency on NVIDIA GPUs, convert the exported ONNX model to a TensorRT engine using the Python API.
Run TensorRT conversion on the same machine and GPU family where you plan to run inference. Engines are not portable across GPU architectures.
Prerequisites:
  • TensorRT installed with trtexec available in your PATH
  • An exported ONNX model (for example, output/inference_model.onnx)
from argparse import Namespace

from rfdetr.export.tensorrt import trtexec

args = Namespace(
    verbose=True,
    profile=False,
    dry_run=False,
)

trtexec("output/inference_model.onnx", args)
This produces output/inference_model.engine. If profile=True, it also writes an Nsight Systems report (.nsys-rep).
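As an alternative to the Python wrapper, you can run the conversion directly with the trtexec CLI. The flags below are standard trtexec options; --fp16 is optional and enables half-precision, which typically reduces latency on GPUs with FP16 support (verify accuracy for your model):

```shell
trtexec --onnx=output/inference_model.onnx \
        --saveEngine=output/inference_model.engine \
        --fp16
```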

ONNX Runtime inference

Once exported, run inference with ONNX Runtime:
import onnxruntime as ort
import numpy as np
from PIL import Image

# Load the ONNX model
session = ort.InferenceSession("output/inference_model.onnx")

# Prepare input image
image = Image.open("image.jpg").convert("RGB")
image = image.resize((560, 560))  # Resize to model's input resolution
image_array = np.array(image).astype(np.float32) / 255.0

# Normalize with ImageNet mean/std (keep float32 — ONNX Runtime rejects float64 input)
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
image_array = ((image_array - mean) / std).astype(np.float32)

# Convert to NCHW format
image_array = np.transpose(image_array, (2, 0, 1))
image_array = np.expand_dims(image_array, axis=0)

# Run inference
outputs = session.run(None, {"input": image_array})
boxes, logits = outputs  # raw model outputs; apply score post-processing before use
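The raw outputs are not final detections. Assuming the two outputs are normalized (cx, cy, w, h) boxes and per-class logits, as is typical for DETR-style detectors (check your exported model for the exact convention), post-processing looks roughly like this sketch:

```python
import numpy as np

def postprocess(boxes, logits, image_size, top_k=10):
    """DETR-style post-processing: sigmoid scores, top-k queries, cxcywh -> xyxy pixels."""
    scores = 1.0 / (1.0 + np.exp(-logits[0]))   # (num_queries, num_classes)
    best = scores.max(axis=1)                    # best score per query
    classes = scores.argmax(axis=1)
    keep = np.argsort(-best)[:top_k]             # top-k queries by confidence

    cx, cy, w, h = boxes[0, keep].T              # normalized center-format boxes
    width, height = image_size
    xyxy = np.stack([
        (cx - w / 2) * width, (cy - h / 2) * height,
        (cx + w / 2) * width, (cy + h / 2) * height,
    ], axis=1)
    return xyxy, classes[keep], best[keep]

# Toy example with 3 queries and 2 classes
boxes = np.array([[[0.5, 0.5, 0.2, 0.2], [0.1, 0.1, 0.1, 0.1], [0.9, 0.9, 0.1, 0.1]]])
logits = np.array([[[2.0, -1.0], [-3.0, -3.0], [-1.0, 1.0]]])
xyxy, cls, conf = postprocess(boxes, logits, image_size=(560, 560), top_k=2)
print(xyxy.shape, cls, conf.round(2))
```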

Next steps

Deploy to Roboflow

Deploy your fine-tuned model to Roboflow for cloud inference.

Training overview

Learn how to fine-tune RF-DETR on your own dataset.
