Overview
YOLO-Pi provides real-time object detection on video streams from USB cameras. This guide explains how to configure the model paths, run inference, and understand the recognition pipeline.
Recognition Pipeline
The YOLO-Pi inference pipeline consists of several stages:
Video Capture
Capture frames from a USB camera using OpenCV: vc = cv2.VideoCapture(0)
rval, frame = vc.read()
Source: src/yolo-pi.py:162-163
Image Preprocessing
Convert and resize the image for model input: cv2_im = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image = Image.fromarray(cv2_im)
image_data = np.array(resized_image, dtype='float32')
image_data /= 255.  # Normalize to [0, 1]
image_data = np.expand_dims(image_data, 0)  # Add batch dimension
Source: src/yolo-pi.py:34-49
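The normalize-and-batch step can be exercised on its own with NumPy. The to_model_input helper below is illustrative (not a function in yolo-pi.py); the 416x416 shape matches Tiny YOLO VOC's fixed input size:

```python
import numpy as np

def to_model_input(frame_rgb):
    """Turn an already-resized RGB frame into the float32 batch the
    YOLO graph expects: values in [0, 1], shape (1, H, W, 3)."""
    image_data = np.asarray(frame_rgb, dtype='float32')
    image_data /= 255.                    # normalize to [0, 1]
    return np.expand_dims(image_data, 0)  # add batch dimension

# Tiny YOLO VOC uses a fixed 416x416 input
batch = to_model_input(np.zeros((416, 416, 3), dtype=np.uint8))
print(batch.shape)  # (1, 416, 416, 3)
```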
Model Inference
Run the YOLO model to detect objects: out_boxes, out_scores, out_classes = sess.run(
    [boxes, scores, classes],
    feed_dict={
        yolo_model.input: image_data,
        input_image_shape: [image.size[1], image.size[0]],
        K.learning_phase(): 0
    })
Source: src/yolo-pi.py:50-56
Post-processing
Filter detections, draw bounding boxes, and publish results via MQTT.
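Post-processing ends with an MQTT publish. One way to serialize detections into a payload is sketched below; the detections_to_payload helper and its field names are illustrative, not the exact message format yolo-pi.py emits:

```python
import json

def detections_to_payload(class_names, out_classes, out_scores, out_boxes):
    """Serialize detections into a JSON string suitable for an MQTT publish.
    Field names here are illustrative, not yolo-pi.py's actual schema."""
    detections = []
    for c, score, box in zip(out_classes, out_scores, out_boxes):
        top, left, bottom, right = box
        detections.append({
            'class': class_names[c],
            'score': round(float(score), 2),
            'box': [float(top), float(left), float(bottom), float(right)],
        })
    return json.dumps({'detections': detections})

payload = detections_to_payload(
    ['person', 'dog'], [0], [0.87], [(10.0, 20.0, 110.0, 220.0)])
# A paho-mqtt client would then publish it, e.g.:
# client.publish('yolo/detections', payload)
```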
Configuring Model Paths
The yolo-pi.py script uses hardcoded paths for model configuration. Edit these lines to switch between models:
Tiny YOLO VOC (Default)
model_path = 'model_data/tiny-yolo-voc.h5'
anchors_path = 'model_data/tiny-yolo-voc_anchors.txt'
classes_path = 'model_data/pascal_classes.txt'
Source: src/yolo-pi.py:108-110
Full YOLO with COCO
For the full YOLO model, uncomment these lines (and comment out the Tiny YOLO VOC lines above):
model_path = 'model_data/yolo.h5'
anchors_path = 'model_data/yolo_anchors.txt'
classes_path = 'model_data/coco_classes.txt'
Source: src/yolo-pi.py:111-113
Ensure all three files (model, anchors, classes) match the same YOLO configuration. Mismatched files will cause assertion errors.
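To confirm a pair of anchor and class files parse consistently before loading the model, the same parsing logic can be run standalone. The anchor values below are the published tiny-yolo-voc anchors; the three-line class file is only a stand-in:

```python
import os
import tempfile

import numpy as np

# Miniature stand-ins for the real files (anchor values are the
# published tiny-yolo-voc anchors; the class list is just a sample)
tmp = tempfile.mkdtemp()
anchors_path = os.path.join(tmp, 'anchors.txt')
classes_path = os.path.join(tmp, 'classes.txt')
with open(anchors_path, 'w') as f:
    f.write('1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52')
with open(classes_path, 'w') as f:
    f.write('aeroplane\nbicycle\nbird\n')

# Same parsing logic yolo-pi.py uses at startup
with open(anchors_path) as f:
    anchors = np.array([float(x) for x in f.readline().split(',')]).reshape(-1, 2)
with open(classes_path) as f:
    class_names = [c.strip() for c in f.readlines() if c.strip()]

print(anchors.shape, class_names)  # (5, 2) ['aeroplane', 'bicycle', 'bird']
```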
The recognize_image() Function
The core detection logic is implemented in the recognize_image() function:
def recognize_image(image, sess, boxes, scores, classes, is_fixed_size):
    # Convert BGR to RGB
    cv2_im = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = Image.fromarray(cv2_im)
    # Resize image based on model requirements
    if is_fixed_size:
        resized_image = image.resize(
            tuple(reversed(model_image_size)), Image.BICUBIC)
        image_data = np.array(resized_image, dtype='float32')
    else:
        new_image_size = (image.width - (image.width % 32),
                          image.height - (image.height % 32))
        resized_image = image.resize(new_image_size, Image.BICUBIC)
        image_data = np.array(resized_image, dtype='float32')
    # Normalize pixel values
    image_data /= 255.
    image_data = np.expand_dims(image_data, 0)
    # Run inference
    out_boxes, out_scores, out_classes = sess.run(
        [boxes, scores, classes],
        feed_dict={
            yolo_model.input: image_data,
            input_image_shape: [image.size[1], image.size[0]],
            K.learning_phase(): 0
        })
    # Process detections...
    return image
Source: src/yolo-pi.py:33-103
Function Parameters
image: OpenCV frame from video capture (BGR format)
sess: Keras backend session
boxes: Tensor for bounding box coordinates
scores: Tensor for confidence scores
classes: Tensor for class predictions
is_fixed_size: Boolean indicating if the model requires fixed-size input
Model Initialization
Before running inference, the model and associated data must be loaded:
# Load class names
with open(classes_path) as f:
    class_names = f.readlines()
class_names = [c.strip() for c in class_names]
# Load anchors
with open(anchors_path) as f:
    anchors = f.readline()
anchors = [float(x) for x in anchors.split(',')]
anchors = np.array(anchors).reshape(-1, 2)
# Load Keras model
yolo_model = load_model(model_path)
num_classes = len(class_names)
num_anchors = len(anchors)
Source: src/yolo-pi.py:114-125
Model Validation
The script validates that the model architecture matches the anchor and class configurations:
model_output_channels = yolo_model.layers[-1].output_shape[-1]
assert model_output_channels == num_anchors * (num_classes + 5), \
    'Mismatch between model and given anchor and class sizes. ' \
    'Specify matching anchors and classes with --anchors_path and ' \
    '--classes_path flags.'
Source: src/yolo-pi.py:127-131
Each anchor predicts (num_classes + 5) values: 4 for bounding box coordinates, 1 for confidence, and num_classes for class probabilities.
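For Tiny YOLO VOC, the arithmetic works out as follows:

```python
# Tiny YOLO VOC: 5 anchors, 20 Pascal VOC classes
num_anchors = 5
num_classes = 20
expected_channels = num_anchors * (num_classes + 5)
print(expected_channels)  # 125, which must match the model's last-layer depth
```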
Drawing Bounding Boxes
Detected objects are visualized with labeled bounding boxes:
for i, c in reversed(list(enumerate(out_classes))):
    predicted_class = class_names[c]
    box = out_boxes[i]
    score = out_scores[i]
    label = '{} {:.2f}'.format(predicted_class, score)
    draw = ImageDraw.Draw(image)
    # Get box coordinates
    top, left, bottom, right = box
    top = max(0, np.floor(top + 0.5).astype('int32'))
    left = max(0, np.floor(left + 0.5).astype('int32'))
    bottom = min(image.size[1], np.floor(bottom + 0.5).astype('int32'))
    right = min(image.size[0], np.floor(right + 0.5).astype('int32'))
    # Draw rectangle with class-specific color
    for i in range(thickness):
        draw.rectangle(
            [left + i, top + i, right - i, bottom - i],
            outline=colors[c])
    # Draw label background and text
    draw.rectangle(
        [tuple(text_origin), tuple(text_origin + label_size)],
        fill=colors[c])
    draw.text(text_origin, label, fill=(0, 0, 0), font=font)
Source: src/yolo-pi.py:66-96
Color Generation
Each class is assigned a unique color using HSV color space:
hsv_tuples = [(x / len(class_names), 1., 1.)
              for x in range(len(class_names))]
colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
colors = list(
    map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)),
        colors))
random.seed(10101)  # Fixed seed for consistent colors across runs
random.shuffle(colors)
Source: src/yolo-pi.py:139-147
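The palette generation can be previewed standalone for a small, illustrative class list (the names below are placeholders, not the Pascal VOC classes):

```python
import colorsys
import random

class_names = ['person', 'dog', 'cat', 'car']  # illustrative class list
hsv_tuples = [(x / len(class_names), 1., 1.) for x in range(len(class_names))]
# Evenly spaced hues converted to 8-bit RGB tuples
colors = [tuple(int(v * 255) for v in colorsys.hsv_to_rgb(*hsv))
          for hsv in hsv_tuples]
random.seed(10101)   # same fixed seed as yolo-pi.py, so runs are repeatable
random.shuffle(colors)
print(colors)
```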
Score Threshold
Control detection sensitivity by adjusting the confidence threshold:
boxes, scores, classes = yolo_eval(
    yolo_outputs,
    input_image_shape,
    score_threshold=.3,  # Only keep detections with confidence > 0.3
    iou_threshold=.5)    # NMS threshold for overlapping boxes
Source: src/yolo-pi.py:154-158
Tuning Guidelines:
Lower threshold (0.1-0.3): More detections, higher false positive rate
Medium threshold (0.3-0.5): Balanced precision and recall (default: 0.3)
Higher threshold (0.5-0.9): Fewer detections, higher precision
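The threshold is a simple mask over the detection scores. With toy values:

```python
import numpy as np

# Toy detections: confidence scores and their class indices
scores = np.array([0.92, 0.41, 0.28, 0.07])
classes = np.array([0, 2, 2, 1])

keep = scores > 0.3          # score_threshold = 0.3
print(scores[keep], classes[keep])  # [0.92 0.41] [0 2]
```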
IOU Threshold
The Intersection over Union (IOU) threshold controls Non-Maximum Suppression (NMS):
iou_threshold=.5  # Suppress boxes with IOU > 0.5
Lower IOU: More aggressive suppression, fewer overlapping boxes
Higher IOU: Less suppression, may keep multiple boxes for the same object
The yolo_eval() function is defined in src/yad2k/models/keras_yolo.py:323-349 and handles box filtering and non-maximum suppression.
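The same greedy suppression can be sketched in plain NumPy (the iou and nms helpers below are illustrative, not functions from yolo-pi.py):

```python
import numpy as np

def iou(a, b):
    """IOU of two boxes given as (top, left, bottom, right)."""
    top, left = max(a[0], b[0]), max(a[1], b[1])
    bottom, right = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, bottom - top) * max(0.0, right - left)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop any remaining
    box that overlaps it above iou_threshold, repeat."""
    order = np.argsort(scores)[::-1]
    keep = []
    while len(order):
        best, order = order[0], order[1:]
        keep.append(int(best))
        order = np.array([i for i in order
                          if iou(boxes[best], boxes[i]) <= iou_threshold])
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # [0, 2]: box 1 overlaps box 0 above the threshold
```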
Running the Application
Starting Inference
Run the yolo-pi.py script to start real-time detection.
Main Detection Loop
while True:
    if frame is not None:
        pil_image = recognize_image(frame, sess, boxes, scores, classes, is_fixed_size)
        open_cv_image = np.array(pil_image)
        # Optionally display: cv2.imshow("preview", open_cv_image)
    rval, frame = vc.read()
    i = cv2.waitKey(1)
    if i & 0xFF == ord('q'):  # Press 'q' to quit
        sess.close()
        break
vc.release()
cv2.destroyAllWindows()
Source: src/yolo-pi.py:168-184
The display window (cv2.imshow) is commented out by default. Uncomment line 174 of src/yolo-pi.py to visualize detections locally.
Environment Variables
YOLO-Pi requires the MQTT server environment variable:
export MQTT=mqtt.example.com
python yolo-pi.py
See the MQTT Integration guide for details.
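A small guard makes a missing variable fail fast with a clear message. The mqtt_host function is a hypothetical helper, assuming the script reads the broker hostname from os.environ:

```python
import os

def mqtt_host(env=os.environ):
    """Read the MQTT broker hostname, failing fast if it is unset."""
    try:
        return env['MQTT']
    except KeyError:
        raise RuntimeError('MQTT environment variable not set; '
                           'run: export MQTT=<broker hostname>') from None

print(mqtt_host({'MQTT': 'mqtt.example.com'}))  # mqtt.example.com
```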
Performance
Raspberry Pi 3 (Tiny YOLO VOC)
~0.5-1 FPS
Requires swap space for compilation
Best with a USB camera at 640x480
MacBook Pro (Tiny YOLO VOC)
~0.5 FPS (1 frame per 2 seconds)
Suitable for development and testing
Can handle higher resolutions
The full YOLO model is significantly slower than Tiny YOLO. For real-time applications on Raspberry Pi, use Tiny YOLO VOC.
Troubleshooting
Camera Not Found
If the video capture fails:
if rval == False:
    print("Can't read video capture. Exiting.")
    sys.exit(1)
Solutions:
Verify the camera is connected: ls /dev/video*
Try a different camera index: cv2.VideoCapture(1)
Check camera permissions
Model Loading Errors
Ensure all three files match:
# Check files exist
ls model_data/tiny-yolo-voc.h5
ls model_data/tiny-yolo-voc_anchors.txt
ls model_data/pascal_classes.txt
Memory Issues
For Raspberry Pi deployments:
Use Tiny YOLO instead of full YOLO
Set up swap space (see setup guide)
Consider reducing input image resolution
Next Steps
Model Conversion Learn how to convert different YOLO models
MQTT Integration Set up MQTT messaging for detection events