YOLO models can be exported to various formats optimized for different deployment scenarios. The course materials include export scripts and examples for ONNX, NCNN, and MNN formats.

Export Script

The basic export script is located in course/vision_class/export/export_model.py:1:
from ultralytics import YOLO

model = YOLO('models/torch/yolo11s.pt')

model.export(format="mnn")
This script loads a PyTorch model and exports it to the specified format.

Model Export Process

The export process follows these steps:
  1. Load PyTorch Model: Load the trained .pt model file
  2. Configure Export: Specify target format and parameters
  3. Run Export: Ultralytics handles format conversion
  4. Verify Output: Test exported model with sample inference
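The steps above can be sketched as a small driver. `SUPPORTED_FORMATS` and `build_export_args` below are hypothetical helpers, not part of the Ultralytics API; the actual conversion is still the single `model.export(...)` call.

```python
# Hypothetical helper: validate the target format and collect the keyword
# arguments that would be passed to Ultralytics' model.export().
SUPPORTED_FORMATS = {"onnx", "ncnn", "mnn"}

def build_export_args(fmt: str, imgsz: int = 640, half: bool = False) -> dict:
    """Return export kwargs after checking the format is one the course uses."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported export format: {fmt}")
    return {"format": fmt, "imgsz": imgsz, "half": half}

# Usage (requires ultralytics and a trained .pt file):
#   from ultralytics import YOLO
#   model = YOLO("models/torch/yolo11s.pt")             # 1. load
#   args = build_export_args("ncnn")                    # 2. configure
#   model.export(**args)                                # 3. run export
#   YOLO("models/torch/yolo11s_ncnn_model")("bus.jpg")  # 4. verify
```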

Export Command

Using the Ultralytics API:
from ultralytics import YOLO

# Load trained model
model = YOLO('yolo11s.pt')

# Export to specific format
model.export(format="ncnn")  # or "onnx", "mnn", etc.

Supported Export Formats

The course materials include examples for three optimized formats:

ONNX Format

Location: course/vision_class/export/models/onnx/
model.export(format="onnx")
Output:
  • yolo11s.onnx (38 MB)
Characteristics:
  • Cross-platform compatibility
  • Hardware acceleration support
  • Standard inference engines (ONNX Runtime)
Use Cases:
  • Cloud deployment
  • Server inference
  • Cross-platform applications
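A minimal sketch of running the ONNX export with ONNX Runtime. The preprocessing helper is generic; the input name `"images"` matches what Ultralytics exports typically use, but verify it on your own file with `session.get_inputs()`.

```python
import numpy as np

def to_model_input(img_hwc: np.ndarray) -> np.ndarray:
    """HWC uint8 image -> NCHW float32 in [0, 1], the layout the export expects."""
    x = img_hwc.astype(np.float32) / 255.0
    x = np.transpose(x, (2, 0, 1))  # HWC -> CHW
    return x[None, ...]             # add batch dimension

# Inference (requires onnxruntime and the exported file):
#   import onnxruntime as ort
#   session = ort.InferenceSession("models/onnx/yolo11s.onnx")
#   outputs = session.run(None, {"images": to_model_input(img)})

dummy = np.zeros((640, 640, 3), dtype=np.uint8)
print(to_model_input(dummy).shape)  # (1, 3, 640, 640)
```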

NCNN Format

Location: course/vision_class/export/models/ncnn/
model.export(format="ncnn")
Output:
  • model.ncnn.param (23 KB) - Architecture definition
  • model.ncnn.bin (38 MB) - Model weights
  • metadata.yaml - Model metadata
Characteristics:
  • Optimized for ARM processors
  • No external dependencies
  • Efficient CPU inference
Use Cases:
  • Mobile devices (Android/iOS)
  • Embedded systems
  • Edge computing (Raspberry Pi, Jetson)

MNN Format

Location: course/vision_class/export/models/mnn/
model.export(format="mnn")
Output:
  • yolo11s.mnn (38 MB)
Characteristics:
  • Alibaba’s lightweight framework
  • Mobile-optimized inference
  • Low power consumption
Use Cases:
  • Mobile applications
  • IoT devices
  • Resource-constrained environments

NCNN Model Details

The production system uses the NCNN format. The metadata file (course/vision_class/export/models/ncnn/metadata.yaml:1) contains:
description: Ultralytics YOLO11s model trained on coco.yaml
author: Ultralytics
date: '2025-04-06T11:00:06.430757'
version: 8.3.77
license: AGPL-3.0 License
stride: 32
task: detect
batch: 1
imgsz:
  - 640
  - 640
names:
  0: person
  1: bicycle
  # ... 78 more classes
  39: bottle
  47: apple
  49: orange
  # ...
args:
  batch: 1
  half: false
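The flat top-level keys in this file can be read without a YAML library. This is a sketch that handles only the simple `key: value` lines shown above (not nested blocks like `names`); in practice you would use PyYAML if it is available.

```python
# Minimal sketch: pull flat fields out of an NCNN metadata.yaml without PyYAML.
SAMPLE = """\
stride: 32
task: detect
batch: 1
version: 8.3.77
"""

def parse_flat_yaml(text: str) -> dict:
    """Parse only top-level 'key: value' lines; skips nested/list/comment lines."""
    meta = {}
    for line in text.splitlines():
        if ":" in line and not line.startswith((" ", "-", "#")):
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta

meta = parse_flat_yaml(SAMPLE)
print(meta["task"], meta["stride"])  # detect 32
```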

NCNN Inference Example

The course includes an NCNN inference test (course/vision_class/export/models/ncnn/model_ncnn.py:5):
import numpy as np
import ncnn
import torch

def test_inference():
    torch.manual_seed(0)
    in0 = torch.rand(1, 3, 640, 640, dtype=torch.float)
    out = []

    with ncnn.Net() as net:
        # Load NCNN model files
        net.load_param("models/torch/yolo11s_ncnn_model/model.ncnn.param")
        net.load_model("models/torch/yolo11s_ncnn_model/model.ncnn.bin")

        with net.create_extractor() as ex:
            # Input tensor
            ex.input("in0", ncnn.Mat(in0.squeeze(0).numpy()).clone())

            # Extract output
            _, out0 = ex.extract("out0")
            out.append(torch.from_numpy(np.array(out0)).unsqueeze(0))

    if len(out) == 1:
        return out[0]
    else:
        return tuple(out)
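For a 640x640 detect model with 80 classes, the raw output returned above has shape (1, 84, 8400): rows 0-3 are box coordinates (cx, cy, w, h) and the remaining 80 rows are per-class scores for each candidate. A minimal decode sketch (confidence filtering only, no NMS; the threshold value is illustrative):

```python
import numpy as np

def decode_predictions(out: np.ndarray, conf_thres: float = 0.25):
    """Filter raw YOLO detect output of shape (1, 4 + num_classes, num_anchors).

    Returns a list of (cx, cy, w, h, class_id, score). No NMS is applied.
    """
    preds = out[0]                       # (84, 8400)
    boxes, scores = preds[:4], preds[4:]
    class_ids = scores.argmax(axis=0)    # best class per candidate
    best = scores.max(axis=0)            # its score
    keep = best >= conf_thres
    return [
        (*boxes[:, i], int(class_ids[i]), float(best[i]))
        for i in np.flatnonzero(keep)
    ]

# Synthetic check: one candidate above threshold
dummy = np.zeros((1, 84, 10), dtype=np.float32)
dummy[0, 4 + 39, 3] = 0.9  # class 39 ("bottle") at candidate 3
dets = decode_predictions(dummy)
```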

Optimization for Edge Devices

Exported models are optimized for edge deployment through:

Model Quantization

NCNN and MNN support quantization for:
  • Reduced model size (4x smaller with INT8)
  • Faster inference (2-4x speedup)
  • Lower power consumption
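The 4x size claim is simple arithmetic: FP32 stores 4 bytes per weight, INT8 stores 1. With YOLO11s at roughly 9.4 million parameters:

```python
# Back-of-the-envelope: why INT8 quantization shrinks the model ~4x.
params = 9.4e6               # approximate YOLO11s parameter count
fp32_mb = params * 4 / 1e6   # 4 bytes per FP32 weight
int8_mb = params * 1 / 1e6   # 1 byte per INT8 weight
print(f"FP32: {fp32_mb:.1f} MB, INT8: {int8_mb:.1f} MB")
```

The FP32 figure (~37.6 MB) matches the ~38 MB exports listed above; an INT8 build would land near 9.4 MB before any overhead.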

Hardware Acceleration

  • NCNN: Vulkan GPU acceleration, ARM NEON optimization
  • MNN: Metal (iOS), OpenCL, Vulkan support
  • ONNX: CUDA, TensorRT, DirectML acceleration

Memory Optimization

Edge-optimized formats provide:
  • In-place operations to reduce memory
  • Operator fusion for efficiency
  • Optimized memory allocation

Export Best Practices

1. Test Inference

Always verify exported model accuracy:
from ultralytics import YOLO

# Original model
original = YOLO('yolo11s.pt')
original_results = original('test_image.jpg')

# Export model
original.export(format='ncnn')

# Load exported model
exported = YOLO('yolo11s_ncnn_model')
exported_results = exported('test_image.jpg')

# Compare results

2. Choose Appropriate Format

  Deployment     Recommended Format
  Raspberry Pi   NCNN
  Android/iOS    NCNN or MNN
  Cloud/Server   ONNX
  Jetson Nano    ONNX (TensorRT)
  Web Browser    ONNX (ONNX.js)
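The recommendations above can be encoded as a small lookup. This is a hypothetical helper for illustration, not part of any library:

```python
# Hypothetical helper mirroring the recommendations above: pick an export
# format for a given deployment target.
RECOMMENDED = {
    "raspberry_pi": "ncnn",
    "android": "ncnn",      # or "mnn"
    "ios": "ncnn",          # or "mnn"
    "cloud": "onnx",
    "jetson_nano": "onnx",  # then convert to TensorRT
    "web": "onnx",          # via ONNX.js / onnxruntime-web
}

def recommended_format(target: str) -> str:
    try:
        return RECOMMENDED[target]
    except KeyError:
        raise ValueError(f"Unknown deployment target: {target}") from None

print(recommended_format("raspberry_pi"))  # ncnn
```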

3. Consider Trade-offs

  • ONNX: Broadest compatibility, but a heavier runtime than the mobile-first frameworks
  • NCNN: Best performance on ARM CPUs; the NCNN runtime must be built for the target platform
  • MNN: Good mobile performance, but a smaller ecosystem than ONNX

Integration with System

The robotic arm system loads the NCNN model in model_loader.py:10:
object_model_path: str = current_path + '/models/yolo11s_ncnn_model'
self.model: YOLO = YOLO(object_model_path, task='detect')
Ultralytics automatically detects the NCNN format and uses the appropriate backend for inference.
