Overview
The yolo-pi.py script is the main application that performs real-time object detection using a YOLO model on video input from a camera. It processes frames, identifies objects, and publishes detection results via MQTT.
Configuration Variables
Model Configuration
Path to the trained YOLO model file in HDF5 format.
Path to the anchor boxes configuration file. Contains comma-separated anchor box dimensions.
Path to the classes file. Each line contains one class name.
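The anchors and classes files described above can be parsed with a few lines of standard-library Python. This is a sketch of the file formats, not the script's actual helpers; the function names are illustrative:

```python
def load_anchors(path):
    # Anchors file: comma-separated floats, grouped into (width, height) pairs
    with open(path) as f:
        vals = [float(x) for x in f.read().split(",")]
    return [(vals[i], vals[i + 1]) for i in range(0, len(vals), 2)]

def load_classes(path):
    # Classes file: one class name per line
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]
```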
MQTT Configuration
MQTT broker server address. Must be set via the MQTT environment variable.
Main Function
recognize_image()
Performs object detection on a single frame and returns the annotated image.
Parameters
Input image frame in BGR format (OpenCV format).
Active TensorFlow session for running the model.
Tensor output for bounding box coordinates from yolo_eval().
Tensor output for confidence scores from yolo_eval().
Tensor output for predicted class indices from yolo_eval().
Whether the model expects fixed-size input. If True, images are resized to model_image_size. If False, images are resized to the nearest multiple of 32.
Returns
Annotated PIL Image with bounding boxes, labels, and confidence scores drawn on detected objects.
Image Processing Steps
- Color Conversion: Converts BGR (OpenCV) to RGB (PIL)
- Resizing:
  - Fixed size: Resizes to model_image_size using bicubic interpolation
  - Dynamic size: Resizes to the nearest multiple of 32
- Normalization: Divides pixel values by 255.0
- Batch Dimension: Expands dimensions to create batch of 1
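The four steps above can be sketched with NumPy alone. The bicubic resampling itself (done with PIL in the script) is left as a comment, and the helper name and return shape are illustrative, not the script's actual API:

```python
import numpy as np

def preprocess(frame_bgr, model_image_size=None):
    # 1. Color conversion: BGR (OpenCV) -> RGB by reversing the channel axis
    rgb = frame_bgr[..., ::-1]
    # 2. Resizing target: fixed model size, or nearest multiple of 32
    #    (the resampling itself would use PIL's Image.resize with BICUBIC)
    if model_image_size is None:
        h, w = rgb.shape[:2]
        target = (w - w % 32, h - h % 32)
    else:
        target = model_image_size
    # 3. Normalization: scale pixel values into [0, 1]
    data = rgb.astype(np.float32) / 255.0
    # 4. Batch dimension: shape becomes (1, H, W, 3)
    return np.expand_dims(data, 0), target
```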
Detection Output
The function processes detected objects and:
- Draws bounding boxes with class-specific colors
- Adds labels with class name and confidence score
- Publishes JSON data to the MQTT topic 'yolo'
MQTT Payload Format
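The exact payload schema is not reproduced here. As an illustration only, a detection record carrying the class name, confidence score, and box coordinates could be serialized like this (the field names and coordinate order are assumptions, not the script's actual schema):

```python
import json

# Hypothetical detection values; real ones come from the model's outputs
detection = {
    "class": "person",          # predicted class name
    "score": 0.87,              # confidence score
    "box": [52, 34, 210, 300],  # box corner coordinates (assumed order)
}
payload = json.dumps(detection)
# client.publish('yolo', payload)   # with a connected paho-mqtt client
```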
Initialization Sequence
1. MQTT Client Setup
The MQTT client is created with the client ID "yolo-pi".
2. Load Model and Configuration
3. Initialize YOLO Outputs
Minimum confidence score for detections to be considered valid.
Intersection over Union threshold for non-maximum suppression.
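The two thresholds work together: detections below the confidence score threshold are discarded, and non-maximum suppression drops any box whose IoU with an already-kept, higher-scoring box exceeds the IoU threshold. A minimal pure-Python sketch, assuming a hypothetical `(x1, y1, x2, y2)` box format (the actual filtering happens inside yolo_eval()):

```python
def iou(a, b):
    # Intersection over Union of two (x1, y1, x2, y2) boxes
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, score_threshold=0.3, iou_threshold=0.5):
    # Visit boxes in descending score order; keep a box only if it does not
    # overlap a previously kept box beyond the IoU threshold
    order = sorted((i for i, s in enumerate(scores) if s >= score_threshold),
                   key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_threshold for j in keep):
            keep.append(i)
    return keep
```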
4. Generate Class Colors
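Per-class colors can be produced from evenly spaced HSV hues, converted to RGB with the standard library's colorsys module. This is a sketch of the technique, with an illustrative function name:

```python
import colorsys

def class_colors(num_classes):
    # Evenly spaced hues, full saturation and value, scaled to 0-255 RGB
    hsv = [(i / num_classes, 1.0, 1.0) for i in range(num_classes)]
    return [tuple(int(c * 255) for c in colorsys.hsv_to_rgb(*t)) for t in hsv]
```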
Generates consistent colors for each class using HSV color space.
Video Capture Loop
Key Points
- Captures from camera device 0 (default camera)
- Processes each frame through recognize_image()
- Press 'q' to quit the application
- Properly releases resources on exit
Usage Example
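A typical invocation, assuming the broker hostname below (any reachable MQTT broker address works):

```shell
# The script reads the broker address from the MQTT environment variable
export MQTT=broker.example.com   # assumed broker hostname

python yolo-pi.py                # press 'q' in the video window to quit
```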
Dependencies
- opencv-python: Video capture and image processing
- keras: Model loading and inference
- tensorflow: Backend for Keras
- PIL: Image manipulation and drawing
- numpy: Array operations
- paho-mqtt: MQTT client

