The `detect.py` script runs inference on test images and exports bounding box detections to a CSV file for downstream THz beam steering applications.
Prerequisites
Train the Model
You need a trained model checkpoint. Follow the Training Guide to train YOLO26s on the drone dataset. The detection script expects the best checkpoint at `runs/drone_detect/weights/best.pt`.
Prepare Test Images
Test images should be in the `data/images/test/` directory. If you followed the Dataset Setup guide, this is already configured.

Running Detection
Basic Usage
Run the script from the repository root with `python detect.py`. It will:
- Load the trained model from `runs/drone_detect/weights/best.pt`
- Run inference on all images in `data/images/test/`
- Save detections to `output/detections.csv`
- Save annotated images to `output/annotated/`
Expected Output
Configuration
The detection configuration is defined at the top of `detect.py`:
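As a sketch, the configuration constants presumably look like the following; the values come from the parameter table below, but the exact variable layout in `detect.py` is an assumption:

```python
# Detection configuration (values from the Configuration Parameters table;
# the exact layout in detect.py is assumed).
MODEL = "runs/drone_detect/weights/best.pt"  # trained checkpoint
IMAGE_DIR = "data/images/test"               # directory of test images
OUTPUT_DIR = "output"                        # results directory (CSV + annotated images)
CONF = 0.4                                   # confidence threshold (0-1)
IMGSZ = 960                                  # input image size; must match training
```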
Configuration Parameters
| Parameter | Default | Description |
|---|---|---|
| `MODEL` | `runs/drone_detect/weights/best.pt` | Path to trained model checkpoint |
| `IMAGE_DIR` | `data/images/test` | Directory containing test images |
| `OUTPUT_DIR` | `output` | Directory for results (CSV + annotated images) |
| `CONF` | `0.4` | Confidence threshold (0-1). Only detections above this score are kept |
| `IMGSZ` | `960` | Input image size (must match training) |
Inference Parameters
The prediction call uses settings optimized for A100 GPUs.

Parameter Reference
| Parameter | Value | Description |
|---|---|---|
| `source` | `data/images/test` | Input image directory |
| `conf` | `0.4` | Confidence threshold for filtering detections |
| `imgsz` | `960` | Image size for inference (height) |
| `save` | `True` | Save annotated images with bounding boxes drawn |
| `project` | `output` | Project directory for saving results |
| `name` | `annotated` | Subdirectory name for annotated images |
| `exist_ok` | `True` | Overwrite existing output directory |
| `half` | `True` | Use FP16 (half precision) for 2× faster inference on GPU |
| `batch` | `16` | Process 16 images per batch (adjust based on GPU memory) |
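Taken together, the parameters above correspond to a predict call along these lines. This is a sketch assuming the Ultralytics YOLO API; the exact call in `detect.py` may differ:

```python
# Inference settings from the parameter table above, collected as keyword arguments.
predict_args = dict(
    source="data/images/test",  # input image directory
    conf=0.4,                   # confidence threshold
    imgsz=960,                  # must match the training image size
    save=True,                  # save annotated images
    project="output",           # output root directory
    name="annotated",           # subdirectory for annotated images
    exist_ok=True,              # overwrite an existing output directory
    half=True,                  # FP16 inference on supported GPUs
    batch=16,                   # images per batch
)

# With ultralytics installed and a trained checkpoint available:
# from ultralytics import YOLO
# model = YOLO("runs/drone_detect/weights/best.pt")
# results = model.predict(**predict_args)
```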
FP16 Inference:
`half=True` uses half-precision floating point (FP16), which is roughly 2× faster on modern GPUs with minimal accuracy loss. It requires a GPU with FP16 support (Pascal architecture or newer).

Output Format
CSV Structure
Detections are saved to `output/detections.csv` with the following columns:
| Column | Type | Description | Example |
|---|---|---|---|
| `image` | string | Source image filename | `image_BS1_1234_17_56_02.jpg` |
| `x_center` | float | Bounding box center X coordinate (pixels) | 519.23 |
| `y_center` | float | Bounding box center Y coordinate (pixels) | 329.19 |
| `width` | float | Bounding box width (pixels) | 45.0 |
| `height` | float | Bounding box height (pixels) | 41.04 |
| `confidence` | float | Detection confidence score (0-1) | 0.9234 |
| `class` | string | Object class name | `drone` |
Bounding Box Format
The output uses center-based coordinates (x_center, y_center, width, height) in pixels. This matches the YOLO internal format.

Converting to corner coordinates
If you need top-left corner coordinates (x1, y1, x2, y2), use:
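A minimal helper for the conversion (the function name is illustrative):

```python
def xywh_to_xyxy(x_center, y_center, width, height):
    """Convert a center-based box to top-left/bottom-right corner coordinates."""
    x1 = x_center - width / 2   # left edge
    y1 = y_center - height / 2  # top edge
    x2 = x_center + width / 2   # right edge
    y2 = y_center + height / 2  # bottom edge
    return x1, y1, x2, y2
```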
Annotated Images
Annotated images with bounding boxes drawn are saved to `output/annotated/`:
- Green bounding boxes around detected drones
- Confidence scores above each box
- Class label (“drone”)
Processing Detections
The script extracts bounding box coordinates from YOLO results and writes them to CSV.

Accessing Detection Results
Each result object contains:

| Attribute | Description |
|---|---|
| `r.path` | Source image path |
| `r.boxes` | Tensor of detected bounding boxes (`None` if no detections) |
| `r.boxes.xywh` | Bounding boxes in (x_center, y_center, width, height) format |
| `r.boxes.xyxy` | Bounding boxes in (x1, y1, x2, y2) format |
| `r.boxes.conf` | Confidence scores |
| `r.boxes.cls` | Class indices |
| `r.names` | Dictionary mapping class indices to names |
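Putting these attributes together, the export loop might look like this (a sketch, not the verbatim script; the helper name is illustrative):

```python
import csv
from pathlib import Path

def write_detections_csv(results, csv_path):
    """Write one CSV row per detected bounding box across all YOLO result objects."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["image", "x_center", "y_center",
                         "width", "height", "confidence", "class"])
        for r in results:
            if r.boxes is None:          # no detections in this image
                continue
            for xywh, conf, cls in zip(r.boxes.xywh, r.boxes.conf, r.boxes.cls):
                xc, yc, w, h = (float(v) for v in xywh)
                writer.writerow([Path(r.path).name, xc, yc, w, h,
                                 float(conf), r.names[int(cls)]])
```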
GPU Optimizations
Like the training script, detection applies A100-specific optimizations.

Adjusting Confidence Threshold
The confidence threshold (CONF = 0.4) controls the precision/recall tradeoff:
How to choose the right threshold
- Lower threshold (e.g., 0.2): More detections, higher recall, but more false positives
- Higher threshold (e.g., 0.6): Fewer detections, higher precision, but may miss some drones
- Default (0.4): Balanced setting that works well for most scenarios
To change it, edit the `CONF` variable at the top of `detect.py` (line 12).

Batch Size Tuning
The default batch size is 16 images. Adjust based on GPU memory:

| GPU VRAM | Recommended Batch Size |
|---|---|
| 4 GB | 4-8 |
| 8 GB | 8-16 |
| 12+ GB | 16-32 |
| 40 GB (A100) | 32-64 |
Set the `batch` argument in the predict call accordingly.
Using Detections for Beam Steering
The CSV output is designed for THz beam steering applications. Each detection provides:
- Spatial coordinates: (x_center, y_center) for pointing the beam
- Drone size: (width, height) for estimating distance or filtering by target size
- Confidence: For filtering low-quality detections
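As a hypothetical downstream sketch, a controller could load the CSV, discard low-confidence rows, and keep the highest-confidence detection per image as the pointing target (the function name and threshold are illustrative, not part of this project):

```python
import csv

def best_targets(csv_path, min_conf=0.4):
    """Return {image: (x_center, y_center)} for the highest-confidence detection per image."""
    best = {}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            conf = float(row["confidence"])
            if conf < min_conf:
                continue                      # filter low-quality detections
            img = row["image"]
            if img not in best or conf > best[img][0]:
                best[img] = (conf, float(row["x_center"]), float(row["y_center"]))
    return {img: (x, y) for img, (_, x, y) in best.items()}
```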
Troubleshooting
Model file not found

Problem: `FileNotFoundError: runs/drone_detect/weights/best.pt`

Solution: Train the model first using `python train.py`. The detection script requires a trained checkpoint.

No detections in CSV
Possible Causes:
- Confidence threshold too high (try lowering `CONF` from 0.4 to 0.2)
- Model not trained properly (check validation mAP in the training output)
- Test images don't contain drones
- Wrong model checkpoint (verify you're using `best.pt`, not untrained weights)
CUDA out of memory during inference

Solutions:
- Reduce `batch` from 16 to 8 or 4
- Disable `half=True` (slower but uses less memory)
- Process images one at a time with `batch=1`
Slow inference on CPU

Problem: Inference takes several seconds per image.

Solution: Ensure you have a CUDA-capable GPU and a PyTorch build with CUDA support installed. Check with:
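A quick check (assumes PyTorch is installed; the original snippet here was lost, so this is a reconstruction):

```python
# Verify that PyTorch can see a CUDA GPU.
import torch

print(torch.__version__)
print(torch.cuda.is_available())         # True means GPU inference is possible
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "NVIDIA A100-SXM4-40GB"
```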
Next Steps
- Integrate detections with your THz beam steering controller
- Experiment with different confidence thresholds
- Run detection on live video streams using Ultralytics’ video inference mode