Overview

The ImageProcessor class provides an image processing pipeline built on YOLO object detection: it runs detection, filters the results by confidence, and can draw and save an annotated image.
Source: arm_system/perception/vision/image_processing.py:8

Class Definition

ImageProcessor

class ImageProcessor:
    def __init__(self, confidence_threshold: float = 0.45)
Initializes the image processor with a detection model and a confidence threshold.
confidence_threshold
float
default:"0.45"
Minimum confidence score for object detection (0.0 to 1.0)

Attributes

detection
DetectionModelInterface
YOLO detection model instance
conf_threshold
float
Confidence threshold for filtering detections

Methods

read_image_path

def read_image_path(self, path: str, draw_results: bool = True, save_drawn_img: bool = True)
Reads an image from a file path, runs object detection on it, and optionally draws and saves the results.
path
str
required
File path to image
draw_results
bool
default:"True"
Whether to draw bounding boxes and labels on image
save_drawn_img
bool
default:"True"
Whether to save image with drawn detections
Returns:
processed_img
np.ndarray
Processed image, annotated when draw_results is True
best_detection
dict or None
Detection with the highest confidence, or None if nothing was detected
Detection Dictionary Structure:
{
    'class': str,          # Object class name
    'confidence': float,   # Confidence score (0.0-1.0)
    'box': list,          # Bounding box [x1, y1, x2, y2]
    'class_id': int       # Class ID from model
}
Source: arm_system/perception/vision/image_processing.py:13
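
The dictionary layout above can be written down as a type, which helps callers handle the None case explicitly. This is a minimal sketch and not part of ImageProcessor: the names Detection and describe are hypothetical, and the element type of box (float) is an assumption, since the documentation only says it is a list.

```python
from typing import List, Optional, TypedDict

# Functional syntax is required because "class" is a reserved word in Python.
Detection = TypedDict(
    "Detection",
    {
        "class": str,         # Object class name
        "confidence": float,  # Confidence score (0.0-1.0)
        "box": List[float],   # [x1, y1, x2, y2]; float elements are an assumption
        "class_id": int,      # Class ID from model
    },
)


def describe(detection: Optional[Detection]) -> str:
    """Format a detection for logging, covering the no-detection (None) case."""
    if detection is None:
        return "no detection"
    x1, y1, x2, y2 = detection["box"]
    return (
        f"{detection['class']} ({detection['confidence']:.2f}) "
        f"{x2 - x1:.0f}x{y2 - y1:.0f} px"
    )
```

A helper like describe can be called directly on the second value returned by read_image_path or process_image.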

Example

from arm_system.perception.vision.image_processing import ImageProcessor

processor = ImageProcessor(confidence_threshold=0.5)

image, detection = processor.read_image_path(
    '/path/to/image.jpg',
    draw_results=True,
    save_drawn_img=True
)

if detection:
    print(f"Detected: {detection['class']}")
    print(f"Confidence: {detection['confidence']:.2f}")

process_image

def process_image(self, image: np.ndarray, confidence_threshold: float = 0.45)
Runs YOLO detection on an image and returns the image together with the best detection.
image
np.ndarray
required
Input image as NumPy array (BGR format)
confidence_threshold
float
default:"0.45"
Minimum confidence threshold for detections
Returns:
image
np.ndarray
Original input image
best_detection
dict or None
Detection with the highest confidence, or None if there are no valid detections
Detection Process:
  1. Runs YOLO inference on the image
  2. Filters detections by the confidence threshold
  3. Maps detected classes to the target classes (apple, orange, bottle); any other class is mapped to default
  4. Returns the detection with the highest confidence
Supported Classes:
  • apple - Mapped from YOLO apple class
  • orange - Mapped from YOLO orange class
  • bottle - Mapped from YOLO bottle class
  • default - All other detected objects
Source: arm_system/perception/vision/image_processing.py:24
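
The filter, map, and select steps listed above can be sketched in plain Python. This is an illustration under stated assumptions, not the code in image_processing.py: the function name select_best and the TARGET_CLASSES table are hypothetical, the raw detections are assumed to already be dictionaries in the documented format, and the threshold comparison is assumed to be inclusive.

```python
from typing import Dict, List, Optional

# Hypothetical lookup table: model label -> target class.
TARGET_CLASSES = {"apple": "apple", "orange": "orange", "bottle": "bottle"}


def select_best(raw: List[Dict], confidence_threshold: float = 0.45) -> Optional[Dict]:
    """Filter by confidence, map class names, and return the top detection."""
    # Step 2: drop detections below the threshold (inclusive comparison assumed).
    kept = [d for d in raw if d["confidence"] >= confidence_threshold]

    # Step 3: map each label to a target class; unknown labels become "default".
    mapped = [{**d, "class": TARGET_CLASSES.get(d["class"], "default")} for d in kept]

    # Step 4: return the detection with the highest confidence, or None.
    return max(mapped, key=lambda d: d["confidence"], default=None)
```

Note that in this sketch a high-confidence object outside the three target classes still wins and is reported as default; whether the real implementation behaves the same way is not stated in the documentation.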

Example

import cv2
from arm_system.perception.vision.image_processing import ImageProcessor

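# Load image (cv2.imread returns None if the file cannot be read)
image = cv2.imread('object.jpg')
if image is None:
    raise FileNotFoundError('object.jpg')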

# Process
processor = ImageProcessor()
processed_img, detection = processor.process_image(image, confidence_threshold=0.6)

if detection and detection['confidence'] > 0:
    print(f"Found {detection['class']} with {detection['confidence']:.2%} confidence")
else:
    print("No objects detected")

Visualization

Drawing Detections

The ImageProcessor can draw bounding boxes and labels on detected objects:
def _draw_detection(self, image: np.ndarray, detection: dict)
Drawing Style:
  • Box color: Green (0, 255, 0) in BGR
  • Box thickness: 2 pixels
  • Label: {class_name} {confidence:.2f}
  • Font: HERSHEY_SIMPLEX, size 0.7
Source: arm_system/perception/vision/image_processing.py:74
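
The drawing style listed above maps onto two OpenCV calls, cv2.rectangle and cv2.putText. The following is a sketch of how such a helper could look, not the body of _draw_detection: the function names, the label position (just above the box), the text thickness, and the choice to draw on a copy are all assumptions.

```python
BOX_COLOR = (0, 255, 0)  # Green in BGR
BOX_THICKNESS = 2        # Pixels
FONT_SCALE = 0.7


def format_label(detection: dict) -> str:
    """Build the label text in the documented '{class_name} {confidence:.2f}' form."""
    return f"{detection['class']} {detection['confidence']:.2f}"


def draw_detection(image, detection: dict):
    """Return a copy of a BGR NumPy image with the box and label drawn on it."""
    import cv2  # Imported lazily so format_label is usable without OpenCV.

    x1, y1, x2, y2 = (int(v) for v in detection["box"])
    annotated = image.copy()
    cv2.rectangle(annotated, (x1, y1), (x2, y2), BOX_COLOR, BOX_THICKNESS)
    # Assumed placement: 10 px above the box, clamped so the text stays visible.
    cv2.putText(
        annotated,
        format_label(detection),
        (x1, max(y1 - 10, 20)),
        cv2.FONT_HERSHEY_SIMPLEX,
        FONT_SCALE,
        BOX_COLOR,
        BOX_THICKNESS,
    )
    return annotated
```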

Saving Annotated Images

Annotated images are saved with the suffix _detected:
def _save_drawn_image(self, image: np.ndarray, original_path: str)
Output: original_name_detected.jpg
Source: arm_system/perception/vision/image_processing.py:89
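
The output name can be derived from the input path with pathlib. This sketch assumes the file is written next to the original and keeps the original extension; the documentation only shows a .jpg example, so whether other extensions are preserved is not confirmed. The function name detected_path is hypothetical.

```python
from pathlib import Path


def detected_path(original_path: str) -> Path:
    """Insert the _detected suffix between the file stem and its extension."""
    p = Path(original_path)
    return p.with_name(f"{p.stem}_detected{p.suffix}")
```

For example, an input of original_name.jpg yields original_name_detected.jpg in the same directory.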

Complete Example

from arm_system.perception.vision.image_processing import ImageProcessor
import cv2

# Initialize processor
processor = ImageProcessor(confidence_threshold=0.5)

# Process from file path
image, detection = processor.read_image_path(
    'captured_image.png',
    draw_results=True,
    save_drawn_img=True
)

if detection:
    print("\nDetection Results:")
    print(f"  Class: {detection['class']}")
    print(f"  Confidence: {detection['confidence']:.2%}")
    print(f"  Bounding Box: {detection['box']}")
    print(f"  Class ID: {detection['class_id']}")
else:
    print("No objects detected")

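# Process from an already loaded image
raw_image = cv2.imread('another_image.jpg')
if raw_image is None:  # cv2.imread returns None if the file cannot be read
    raise FileNotFoundError('another_image.jpg')
processed, result = processor.process_image(raw_image, confidence_threshold=0.6)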

Integration with CommunicationManager

# In CommunicationManager
self.object_detect_model = ImageProcessor(confidence_threshold=0.45)

# During object detection
image, yolo_result = self.object_detect_model.read_image_path(
    img_path, 
    draw_results=True, 
    save_drawn_img=True
)

Error Handling

The processor handles detection errors without interrupting the caller:
  • Returns the original image and None when detection fails
  • Logs errors for debugging
  • Continues processing even if individual detections fail
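
The return-and-log behaviour described above follows a common pattern, sketched here in isolation. This is an assumption about the general shape, not the actual implementation: safe_detect and the detect callable are hypothetical names, and the real code may catch narrower exception types.

```python
import logging
from typing import Any, Callable, Optional, Tuple

logger = logging.getLogger(__name__)


def safe_detect(
    image: Any, detect: Callable[[Any], Optional[dict]]
) -> Tuple[Any, Optional[dict]]:
    """Run detect(image); on any error, log it and return (image, None)."""
    try:
        return image, detect(image)
    except Exception:
        # logger.exception records the traceback for later debugging.
        logger.exception("Detection failed")
        return image, None
```

Because failures and empty results both come back as None, callers that need to tell them apart have to check the log output.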
