Overview

The DetectionModel class provides an interface to YOLO object detection models for real-time inference on robotic arm camera images. Source: arm_system/perception/vision/detection/main.py:15

Interface Definition

DetectionModelInterface

class DetectionModelInterface(ABC):
    @abstractmethod
    def inference(self, image: np.ndarray) -> Tuple[Results, Dict[int, str]]:
        pass
Abstract base class defining the detection model interface. Source: arm_system/perception/vision/detection/main.py:9

Class Definition

DetectionModel

class DetectionModel(DetectionModelInterface):
    def __init__(self)
Initializes the detection model by loading the YOLO model via ModelLoader.

Attributes

  • object_model (YOLO) - Loaded YOLO model instance from ModelLoader

Methods

inference

def inference(self, image: np.ndarray) -> tuple[list[Results], Dict[int, str]]
Runs object detection inference on the input image.
Parameters:
  • image (np.ndarray, required) - Input image as NumPy array in BGR format (OpenCV format)
Returns:
  • results (list[Results]) - List of Ultralytics Results objects containing detection data
  • names (Dict[int, str]) - Dictionary mapping class IDs to class names
Inference Configuration:
  • conf: 0.55 (confidence threshold)
  • verbose: False (suppress output)
  • imgsz: 640 (input image size)
  • stream: True (generator mode for memory efficiency)
  • task: 'detect' (object detection)
  • half: True (FP16 precision for faster inference)
Source: arm_system/perception/vision/detection/main.py:19

Results Object Structure

The Ultralytics Results object contains:
  • boxes: Bounding box data
    • boxes.xyxy - Box coordinates [x1, y1, x2, y2]
    • boxes.conf - Confidence scores
    • boxes.cls - Class IDs
  • names: Class name dictionary

Example Usage

import cv2
import numpy as np
from arm_system.perception.vision.detection.main import DetectionModel

# Initialize model
model = DetectionModel()

# Load image
image = cv2.imread('test_image.jpg')

# Run inference
results, class_names = model.inference(image)

# Process results
for result in results:
    boxes = result.boxes
    
    if boxes.shape[0] > 0:
        for i in range(boxes.shape[0]):
            # Get detection data
            confidence = boxes.conf.cpu().numpy()[i]
            class_id = int(boxes.cls[i])
            box = boxes.xyxy.cpu().numpy()[i]
            class_name = class_names[class_id]
            
            print(f"Detected: {class_name}")
            print(f"Confidence: {confidence:.2f}")
            print(f"Box: {box}")

Integration Example

Used by ImageProcessor for object detection:
# In ImageProcessor.__init__
self.detection: DetectionModelInterface = DetectionModel()

# In ImageProcessor.process_image
object_results, object_classes = self.detection.inference(copy_image)

# Process each result
for res in object_results:
    boxes = res.boxes
    if boxes.shape[0] == 0:
        continue
    
    confidence = boxes.conf.cpu().numpy()[0]
    class_id = int(boxes.cls[0])
    box_data = boxes.xyxy.cpu().numpy()[0]
    detected_class = object_classes[class_id]

Model Configuration

The model is loaded with the following specifications:
Parameter | Value  | Description
conf      | 0.55   | Minimum confidence threshold
imgsz     | 640    | Input image size (pixels)
half      | True   | FP16 precision mode
stream    | True   | Generator mode
task      | detect | Object detection task

Performance Optimization

  • Half Precision (FP16): Enabled for faster inference on compatible hardware (GPUs, some CPUs).
  • Streaming Mode: Results are returned as a generator to reduce memory usage when processing multiple images.
  • Image Size: Fixed at 640x640 for an optimal balance between speed and accuracy.
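The benefit of streaming mode can be illustrated with plain Python: a generator produces one result per iteration instead of materializing the whole list up front. The `fake_stream` function below is purely illustrative, not part of the library.

```python
def fake_stream(n):
    """Yield results lazily, one per iteration, like stream=True."""
    for i in range(n):
        yield f"result-{i}"  # produced on demand, not stored up front


gen = fake_stream(3)
print(next(gen))  # -> result-0
```

This is why streaming results should be iterated exactly once; a second pass over the same generator yields nothing.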

Supported Classes

The model supports all YOLO11s classes, with specific filtering for:
  • apple
  • orange
  • bottle
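A class filter along these lines could be applied after inference. This is a hedged sketch with pure NumPy: the arrays mimic `boxes.cls` and `boxes.conf` pulled off a Results object with `.cpu().numpy()`, and the class IDs follow the standard COCO mapping (apple=47, orange=49, bottle=39) that YOLO11s uses.

```python
import numpy as np

# Keep only detections whose class name is in the supported set
# and whose confidence clears the documented 0.55 threshold.
SUPPORTED = {"apple", "orange", "bottle"}
names = {0: "person", 39: "bottle", 47: "apple", 49: "orange"}

cls_ids = np.array([0, 47, 39])        # mimics boxes.cls
confs = np.array([0.91, 0.78, 0.62])   # mimics boxes.conf

keep = [
    i for i in range(len(cls_ids))
    if names[int(cls_ids[i])] in SUPPORTED and confs[i] >= 0.55
]
print([names[int(cls_ids[i])] for i in keep])  # -> ['apple', 'bottle']
```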

Architecture

DetectionModel
    |
    └── ModelLoader (loads YOLO11s NCNN model)
            |
            └── YOLO (Ultralytics)

Error Handling

Errors during inference are propagated to the caller (ImageProcessor), which handles them gracefully by returning None.