Ultralytics YOLO
The system uses the Ultralytics library for YOLO inference (requirements.txt:279).
ModelLoader Class
The ModelLoader class handles YOLO model initialization (arm_system/perception/vision/detection/model_loader.py:7).
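The file itself is not reproduced here, so the following is a minimal sketch of what such a loader might look like; the class body, the `load()` method name, and the lazy import are assumptions, not the actual implementation.

```python
# Hypothetical sketch of a ModelLoader wrapper around the Ultralytics API.
# The real model_loader.py is not shown in this document; method names and
# structure below are illustrative assumptions.


class ModelLoader:
    """Loads a YOLO model from an exported NCNN model directory."""

    def __init__(self, model_path: str, task: str = "detect"):
        self.model_path = model_path
        self.task = task
        self.model = None

    def load(self):
        # Imported lazily so this sketch can be read (and unit-tested)
        # without ultralytics installed.
        from ultralytics import YOLO

        # Ultralytics infers the NCNN backend from the *_ncnn_model
        # directory naming convention.
        self.model = YOLO(self.model_path, task=self.task)
        return self.model
```

Typical usage would be `model = ModelLoader("arm_system/perception/vision/detection/models/yolo11s_ncnn_model").load()`.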
Model Format
The system uses the NCNN format for optimized edge inference:
- Model Path: arm_system/perception/vision/detection/models/yolo11s_ncnn_model
- Format: NCNN (Tencent's neural network inference framework)
- Task: Object detection
NCNN Model Structure
The NCNN model consists of two files: a .param file describing the network structure and a .bin file containing the weights.
Supported Model Formats
While the production system uses NCNN, YOLO models support multiple formats:

| Format | Description | Use Case |
|---|---|---|
| PyTorch (.pt) | Native training format | Training and development |
| ONNX (.onnx) | Cross-platform format | General deployment |
| NCNN | Tencent framework | Mobile and edge devices |
| MNN | Alibaba framework | Edge computing |
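Conversions between these formats are done through the Ultralytics `export()` API. The sketch below wraps it with a small validity check; the `EXPORT_FORMATS` dict and the `export_model` helper are illustrative names, but the format keys (`"onnx"`, `"ncnn"`, `"mnn"`) follow the real `model.export(format=...)` convention.

```python
# Sketch: exporting a trained PyTorch checkpoint to the formats in the
# table above. PyTorch (.pt) is the native format, so it is the source
# of an export rather than a target.

EXPORT_FORMATS = {
    "onnx": "Cross-platform deployment",
    "ncnn": "Mobile and edge devices (used in production here)",
    "mnn": "Edge computing",
}


def export_model(model, fmt: str) -> str:
    """Export `model` (a loaded ultralytics YOLO instance) to `fmt`.

    Returns the path of the exported artifact, as model.export() does.
    """
    fmt = fmt.lower()
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"Unsupported export format: {fmt}")
    return model.export(format=fmt)
```

With ultralytics installed, producing the NCNN model would look like `export_model(YOLO("yolo11s.pt"), "ncnn")`.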
Inference Parameters
The model runs inference with optimized parameters (arm_system/perception/vision/detection/main.py:20).
Parameter Details
conf=0.55
Minimum confidence threshold for detections. Only objects with confidence >= 0.55 are returned.

imgsz=640
Input image size for the model. Images are resized to 640x640 pixels before inference. This is the standard YOLO11s input size.

half=True
Enables FP16 (half-precision floating point) inference for:
- Faster inference speed (approximately 2x)
- Reduced memory usage
- Minimal accuracy loss
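The three parameters above map directly onto keyword arguments of the Ultralytics predict call. The dict name `INFERENCE_PARAMS` and the `run_inference` helper below are illustrative; only the parameter names and values come from the documentation.

```python
# The documented inference parameters, collected in one place so they can
# be passed to the Ultralytics predict call as keyword arguments.

INFERENCE_PARAMS = {
    "conf": 0.55,   # discard detections below 55% confidence
    "imgsz": 640,   # resize input to 640x640 before inference
    "half": True,   # FP16 inference: ~2x faster, ~50% less memory
}


def run_inference(model, frame):
    """Run one detection pass; `model` is a loaded ultralytics YOLO instance."""
    return model(frame, **INFERENCE_PARAMS)
```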
Model Specifications
Based on the NCNN metadata (course/vision_class/export/models/ncnn/metadata.yaml:1):
- Model: YOLO11s (small variant)
- Training Dataset: COCO
- Ultralytics Version: 8.3.77
- Stride: 32
- Batch Size: 1
- Input Size: 640x640
- Classes: 80 COCO classes
Detected Classes
The model is trained on the 80 COCO object classes.
Real-Time Inference
The DetectionModel.inference() method integrates with the YOLO model.
Results Object
The Ultralytics Results object contains:
- boxes: Bounding box coordinates and confidences
- boxes.xyxy: Box coordinates in (x1, y1, x2, y2) format
- boxes.conf: Confidence scores
- boxes.cls: Class IDs
Performance Optimization
NCNN Backend
NCNN provides:
- Optimized inference for ARM processors
- Low memory footprint
- No GPU dependency
- Fast CPU inference
Half Precision
FP16 inference (half=True) provides:
- 2x faster inference on compatible hardware
- 50% reduction in memory usage
- Minimal accuracy impact (less than 1% mAP loss)
Streaming Mode
Streaming mode (stream=True) enables:
- Memory-efficient batch processing
- Lower latency for single images
- Better resource management
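With `stream=True`, the Ultralytics predictor returns a generator that yields one Results object per frame, so a long video or live camera feed is processed frame by frame instead of being buffered as one large list. A minimal consumption pattern, with the illustrative helper name `process_stream`:

```python
# Sketch: consuming Ultralytics results in streaming mode. The generator
# keeps only one Results object in memory at a time.


def process_stream(model, source):
    """Yield the number of detections per frame without buffering results."""
    for result in model(source, stream=True):  # generator: one Results per frame
        yield len(result.boxes)
```

Usage would be along the lines of `for n in process_stream(model, "video.mp4"): ...`, acting on each frame's detections as they arrive.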