Export your trained RF-DETR model to ONNX format for deployment with inference frameworks such as ONNX Runtime and TensorRT.

Installation

Install the ONNX export dependencies:
pip install "rfdetr[onnx]"

Basic export

from rfdetr import RFDETRMedium

model = RFDETRMedium(pretrain_weights="<path/to/checkpoint.pth>")

model.export()
The model is saved to the output directory by default.

Export parameters

  • output_dir (string, default "output"): Directory where the exported ONNX model will be saved.
  • infer_dir (string, default None): Path to an image file to use for tracing. If not provided, a random dummy image is generated.
  • simplify (boolean, default False, deprecated): Deprecated and ignored. ONNX simplification is no longer run by export().
  • backbone_only (boolean, default False): Export only the backbone feature extractor instead of the full model.
  • opset_version (integer, default 17): ONNX opset version to use for export. Higher versions support more operations.
  • verbose (boolean, default True): Whether to print verbose export information.
  • force (boolean, default False, deprecated): Deprecated and ignored.
  • shape (tuple, default None): Input shape as a (height, width) tuple. Both dimensions must be divisible by patch_size × num_windows (varies by model variant; typically 14, 16, 24, or 32). If not provided, uses the model's default square resolution.
  • batch_size (integer, default 1): Static batch size to bake into the exported ONNX graph.
  • dynamic_batch (boolean, default False): When True, exports with a dynamic batch dimension so the ONNX model accepts variable batch sizes at runtime instead of a fixed static size.
  • patch_size (integer, default None): Backbone patch size used for shape-divisibility validation. Defaults to the model's configured patch_size. When provided, must match the instantiated model's patch size exactly.

Output files

After export, you will find one of the following files in your output directory:
  • inference_model.onnx — the exported ONNX model
  • backbone_model.onnx — exported instead when backbone_only=True

Advanced export examples

Custom output directory

from rfdetr import RFDETRMedium

model = RFDETRMedium(pretrain_weights="<path/to/checkpoint.pth>")

model.export(output_dir="exports/my_model")

Custom input resolution

Export with a specific input resolution. Both dimensions must be divisible by patch_size × num_windows for the target model (check the model’s config for the exact value).
from rfdetr import RFDETRMedium

model = RFDETRMedium(pretrain_weights="<path/to/checkpoint.pth>")

model.export(shape=(560, 560))
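The divisibility rule can be checked before calling export(). A minimal sketch (validate_shape is a hypothetical helper, not part of the rfdetr API; the patch_size and num_windows values are illustrative and should come from your model's config):

```python
def validate_shape(shape, patch_size, num_windows):
    """Check that both dimensions are divisible by patch_size * num_windows."""
    divisor = patch_size * num_windows
    for dim in shape:
        if dim % divisor != 0:
            raise ValueError(
                f"dimension {dim} is not divisible by "
                f"patch_size * num_windows = {divisor}"
            )
    return shape

# e.g. a backbone with patch_size=14 and num_windows=4 requires multiples of 56
print(validate_shape((560, 560), patch_size=14, num_windows=4))
```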

Backbone only

Export only the backbone feature extractor for use in custom pipelines:
from rfdetr import RFDETRMedium

model = RFDETRMedium(pretrain_weights="<path/to/checkpoint.pth>")

model.export(backbone_only=True)

Convert to TensorRT

If you want lower latency on NVIDIA GPUs, convert the exported ONNX model to a TensorRT engine using the Python API.
Run TensorRT conversion on the same machine and GPU family where you plan to run inference. Engines are not portable across GPU architectures.
Prerequisites:
  • TensorRT installed with trtexec available in your PATH
  • An exported ONNX model (for example, output/inference_model.onnx)
from argparse import Namespace

from rfdetr.export.tensorrt import trtexec

args = Namespace(
    verbose=True,
    profile=False,
    dry_run=False,
)

trtexec("output/inference_model.onnx", args)
This produces output/inference_model.engine. If profile=True, it also writes an Nsight Systems report (.nsys-rep).
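As an alternative to the Python wrapper, you can run the conversion directly with the trtexec CLI. The flags below are standard trtexec options; --fp16 is optional and enables half-precision, which typically reduces latency on GPUs with FP16 support (verify accuracy for your model):

```shell
trtexec --onnx=output/inference_model.onnx \
        --saveEngine=output/inference_model.engine \
        --fp16
```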

ONNX Runtime inference

Once exported, run inference with ONNX Runtime:
import onnxruntime as ort
import numpy as np
from PIL import Image

# Load the ONNX model
session = ort.InferenceSession("output/inference_model.onnx")

# Prepare input image
image = Image.open("image.jpg").convert("RGB")
image = image.resize((560, 560))  # Resize to model's input resolution
image_array = np.array(image).astype(np.float32) / 255.0

# Normalize with ImageNet mean/std (keep float32 — ONNX Runtime rejects float64 input)
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
image_array = ((image_array - mean) / std).astype(np.float32)

# Convert to NCHW format
image_array = np.transpose(image_array, (2, 0, 1))
image_array = np.expand_dims(image_array, axis=0)

# Run inference
outputs = session.run(None, {"input": image_array})
boxes, logits = outputs  # raw model outputs; apply score post-processing before use
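The raw outputs are not final detections. Assuming the two outputs are normalized (cx, cy, w, h) boxes and per-class logits, as is typical for DETR-style detectors (check your exported model for the exact convention), post-processing looks roughly like this sketch:

```python
import numpy as np

def postprocess(boxes, logits, image_size, top_k=10):
    """DETR-style post-processing: sigmoid scores, top-k queries, cxcywh -> xyxy pixels."""
    scores = 1.0 / (1.0 + np.exp(-logits[0]))   # (num_queries, num_classes)
    best = scores.max(axis=1)                    # best score per query
    classes = scores.argmax(axis=1)
    keep = np.argsort(-best)[:top_k]             # top-k queries by confidence

    cx, cy, w, h = boxes[0, keep].T              # normalized center-format boxes
    width, height = image_size
    xyxy = np.stack([
        (cx - w / 2) * width, (cy - h / 2) * height,
        (cx + w / 2) * width, (cy + h / 2) * height,
    ], axis=1)
    return xyxy, classes[keep], best[keep]

# Toy example with 3 queries and 2 classes
boxes = np.array([[[0.5, 0.5, 0.2, 0.2], [0.1, 0.1, 0.1, 0.1], [0.9, 0.9, 0.1, 0.1]]])
logits = np.array([[[2.0, -1.0], [-3.0, -3.0], [-1.0, 1.0]]])
xyxy, cls, conf = postprocess(boxes, logits, image_size=(560, 560), top_k=2)
print(xyxy.shape, cls, conf.round(2))
```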

Next steps

Deploy to Roboflow

Deploy your fine-tuned model to Roboflow for cloud inference.

Training overview

Learn how to fine-tune RF-DETR on your own dataset.
