The detect.py script runs inference on test images and exports bounding box detections to a CSV file for downstream THz beam steering applications.

Prerequisites

1. Train the Model

You need a trained model checkpoint. Follow the Training Guide to train YOLO26s on the drone dataset. The detection script expects the best checkpoint at:
runs/drone_detect/weights/best.pt
2. Prepare Test Images

Test images should be in the data/images/test/ directory. If you followed the Dataset Setup guide, this is already configured.

Running Detection

Basic Usage

python detect.py
The script will:
  1. Load the trained model from runs/drone_detect/weights/best.pt
  2. Run inference on all images in data/images/test/
  3. Save detections to output/detections.csv
  4. Save annotated images to output/annotated/
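Before launching, you can sanity-check that these inputs exist. The helper below is an illustrative sketch (not part of detect.py) that uses only the paths described above:

```python
from pathlib import Path

def check_prereqs(script_dir: Path) -> list[str]:
    """Return a list of human-readable problems; empty means ready to run."""
    problems = []
    model = script_dir / "runs" / "drone_detect" / "weights" / "best.pt"
    image_dir = script_dir / "data" / "images" / "test"
    if not model.is_file():
        problems.append(f"missing checkpoint: {model}")
    if not image_dir.is_dir():
        problems.append(f"missing test image directory: {image_dir}")
    elif not any(image_dir.glob("*.jpg")):
        problems.append(f"no .jpg images found in {image_dir}")
    return problems
```

An empty list means both the checkpoint and the test images are in place.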

Expected Output

1247 detections saved to output/detections.csv

Configuration

The detection configuration is defined at the top of detect.py:
from pathlib import Path
from ultralytics import YOLO

SCRIPT_DIR = Path(__file__).resolve().parent
MODEL = str(SCRIPT_DIR / "runs" / "drone_detect" / "weights" / "best.pt")
IMAGE_DIR = SCRIPT_DIR / "data" / "images" / "test"
OUTPUT_DIR = SCRIPT_DIR / "output"
CONF = 0.4
IMGSZ = 960

Configuration Parameters

| Parameter | Default | Description |
|---|---|---|
| MODEL | runs/drone_detect/weights/best.pt | Path to trained model checkpoint |
| IMAGE_DIR | data/images/test | Directory containing test images |
| OUTPUT_DIR | output | Directory for results (CSV + annotated images) |
| CONF | 0.4 | Confidence threshold (0-1); only detections above this score are kept |
| IMGSZ | 960 | Input image size (must match training) |
To use a different model checkpoint (e.g., last.pt or a specific epoch), modify the MODEL variable:
MODEL = str(SCRIPT_DIR / "runs" / "drone_detect" / "weights" / "last.pt")
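If training saved several checkpoints, a small helper can list what is available before you pick one. This is an illustrative sketch (the function is ours, and it assumes checkpoints are .pt files in the weights directory):

```python
from pathlib import Path

def list_checkpoints(weights_dir: Path) -> list[Path]:
    """Return all .pt checkpoints in weights_dir, newest first."""
    return sorted(weights_dir.glob("*.pt"),
                  key=lambda p: p.stat().st_mtime, reverse=True)
```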

Inference Parameters

The prediction call uses optimized settings for A100 GPUs:
results = model.predict(
    source=str(IMAGE_DIR),
    conf=CONF,
    imgsz=IMGSZ,
    save=True,
    project=str(OUTPUT_DIR),
    name="annotated",
    exist_ok=True,
    half=True,
    batch=16,
)

Parameter Reference

| Parameter | Value | Description |
|---|---|---|
| source | data/images/test | Input image directory |
| conf | 0.4 | Confidence threshold for filtering detections |
| imgsz | 960 | Input image size for inference |
| save | True | Save annotated images with bounding boxes drawn |
| project | output | Project directory for saving results |
| name | annotated | Subdirectory name for annotated images |
| exist_ok | True | Overwrite existing output directory |
| half | True | Use FP16 (half precision) for 2× faster inference on GPU |
| batch | 16 | Process 16 images per batch (adjust based on GPU memory) |
FP16 Inference: half=True uses half-precision floating point (FP16) which is 2× faster on modern GPUs with minimal accuracy loss. Requires a GPU with FP16 support (Pascal architecture or newer).

Output Format

CSV Structure

Detections are saved to output/detections.csv with the following columns:
| Column | Type | Description | Example |
|---|---|---|---|
| image | string | Source image filename | image_BS1_1234_17_56_02.jpg |
| x_center | float | Bounding box center X coordinate (pixels) | 519.23 |
| y_center | float | Bounding box center Y coordinate (pixels) | 329.19 |
| width | float | Bounding box width (pixels) | 45.0 |
| height | float | Bounding box height (pixels) | 41.04 |
| confidence | float | Detection confidence score (0-1) | 0.9234 |
| class | string | Object class name | drone |
Example rows:
image,x_center,y_center,width,height,confidence,class
image_BS1_9998_17_56_02.jpg,519.23,329.19,45.0,41.04,0.9234,drone
image_BS1_9992_17_56_01.jpg,487.56,318.72,48.12,43.87,0.8876,drone
image_BS1_9967_17_55_57.jpg,502.89,335.41,44.23,40.15,0.9512,drone

Bounding Box Format

The output uses center-based coordinates (x_center, y_center, width, height) in pixels. This matches the YOLO internal format.
If you need top-left corner coordinates (x1, y1, x2, y2), use:
x1 = x_center - width / 2
y1 = y_center - height / 2
x2 = x_center + width / 2
y2 = y_center + height / 2
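The same conversion as a small standalone function, handy when post-processing the CSV (the function name is ours):

```python
def xywh_to_xyxy(x_center: float, y_center: float,
                 width: float, height: float) -> tuple[float, float, float, float]:
    """Convert center-based box coordinates to top-left / bottom-right corners."""
    x1 = x_center - width / 2
    y1 = y_center - height / 2
    x2 = x_center + width / 2
    y2 = y_center + height / 2
    return x1, y1, x2, y2
```

For the first example row above, `xywh_to_xyxy(519.23, 329.19, 45.0, 41.04)` yields corners near (496.73, 308.67) and (541.73, 349.71).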

Annotated Images

Annotated images with bounding boxes drawn are saved to output/annotated/:
output/
├── detections.csv
└── annotated/
    ├── image_BS1_9998_17_56_02.jpg
    ├── image_BS1_9992_17_56_01.jpg
    └── ...
Each annotated image shows:
  • Green bounding boxes around detected drones
  • Confidence scores above each box
  • Class label (“drone”)

Processing Detections

The script extracts bounding box coordinates from YOLO results and writes them to CSV:
import csv

with open(csv_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["image", "x_center", "y_center", "width", "height", "confidence", "class"])

    results = model.predict(
        source=str(IMAGE_DIR), conf=CONF, imgsz=IMGSZ,
        save=True, project=str(OUTPUT_DIR), name="annotated",
        exist_ok=True, half=True, batch=16,
    )
    
    for r in results:
        name = Path(r.path).name
        if r.boxes is not None and len(r.boxes):
            for box in r.boxes:
                cx, cy, w, h = box.xywh[0].tolist()
                writer.writerow([
                    name,
                    round(cx, 2),
                    round(cy, 2),
                    round(w, 2),
                    round(h, 2),
                    round(box.conf.item(), 4),
                    r.names[int(box.cls.item())]
                ])

Accessing Detection Results

Each result object contains:
| Attribute | Description |
|---|---|
| r.path | Source image path |
| r.boxes | Detected bounding boxes (None if no detections) |
| r.boxes.xywh | Bounding boxes in (x_center, y_center, width, height) format |
| r.boxes.xyxy | Bounding boxes in (x1, y1, x2, y2) format |
| r.boxes.conf | Confidence scores |
| r.boxes.cls | Class indices |
| r.names | Dictionary mapping class indices to names |

GPU Optimizations

Like the training script, detection applies A100-specific optimizations:
import torch

# A100: maximize GPU throughput
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True
torch.backends.cudnn.benchmark = True
These flags enable TensorFloat-32 and auto-tuned convolutions for faster inference.

Adjusting Confidence Threshold

The confidence threshold (CONF = 0.4) controls the precision/recall tradeoff:
  • Lower threshold (e.g., 0.2): More detections, higher recall, but more false positives
  • Higher threshold (e.g., 0.6): Fewer detections, higher precision, but may miss some drones
  • Default (0.4): Balanced setting that works well for most scenarios
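Because the CSV stores the confidence of every kept detection, you can also explore stricter thresholds after the fact without re-running inference. A minimal sketch using only the standard library (the sample rows are illustrative):

```python
import csv
import io

def count_above(csv_text: str, threshold: float) -> int:
    """Count rows in a detections.csv whose confidence exceeds threshold."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(1 for row in reader if float(row["confidence"]) > threshold)

sample = """\
image,x_center,y_center,width,height,confidence,class
a.jpg,519.23,329.19,45.0,41.04,0.9234,drone
b.jpg,487.56,318.72,48.12,43.87,0.8876,drone
c.jpg,502.89,335.41,44.23,40.15,0.4512,drone
"""
```

Note that this only tightens the threshold relative to what was saved; to see detections below the original CONF you must re-run inference with a lower value.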
To experiment with different thresholds, modify the CONF variable in detect.py:12:
CONF = 0.6  # More conservative detections

Batch Size Tuning

The default batch size is 16 images. Adjust based on GPU memory:
| GPU VRAM | Recommended Batch Size |
|---|---|
| 4 GB | 4-8 |
| 8 GB | 8-16 |
| 12+ GB | 16-32 |
| 40 GB (A100) | 32-64 |
Modify batch in the predict call:
results = model.predict(
    source=str(IMAGE_DIR),
    conf=CONF,
    imgsz=IMGSZ,
    batch=32,  # Increase for larger GPUs
    half=True,
)
If you get CUDA out of memory errors during inference, reduce the batch size.
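The table above can be folded into a small rule-of-thumb helper; this is only a sketch (the function and cutoffs are ours, taken from the conservative end of each row):

```python
def suggest_batch_size(vram_gb: float) -> int:
    """Rough batch-size suggestion from available GPU VRAM in GB."""
    if vram_gb >= 40:
        return 32
    if vram_gb >= 12:
        return 16
    if vram_gb >= 8:
        return 8
    return 4
```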

Using Detections for Beam Steering

The CSV output is designed for THz beam steering applications. Each detection provides:
  1. Spatial coordinates: (x_center, y_center) for pointing the beam
  2. Drone size: (width, height) for estimating distance or filtering by target size
  3. Confidence: For filtering low-quality detections
Example workflow:
import pandas as pd

# Load detections
df = pd.read_csv("output/detections.csv")

# Filter high-confidence detections
df = df[df['confidence'] > 0.7]

# Get beam steering coordinates for each detection
for _, row in df.iterrows():
    x, y = row['x_center'], row['y_center']
    # Send (x, y) to beam steering controller
    steer_beam(x, y)
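As a sketch of how box size could feed a range estimate, a simple pinhole-camera model relates pixel width to distance. Everything here is an assumption for illustration: the drone wingspan, the focal length in pixels, and the function itself are hypothetical, not part of detect.py:

```python
def estimate_distance_m(box_width_px: float,
                        target_width_m: float = 0.35,    # assumed drone wingspan (m)
                        focal_length_px: float = 1200.0  # assumed focal length (px)
                        ) -> float:
    """Pinhole model: distance = real width * focal length / pixel width."""
    return target_width_m * focal_length_px / box_width_px
```

Smaller bounding boxes yield larger distance estimates, which can be useful for filtering targets by range before steering.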

Troubleshooting

Problem: FileNotFoundError: runs/drone_detect/weights/best.pt
Solution: Train the model first using python train.py. The detection script requires a trained checkpoint.
Problem: detections.csv is empty or contains very few detections
Possible Causes:
  • Confidence threshold too high (try lowering CONF from 0.4 to 0.2)
  • Model not trained properly (check validation mAP in training output)
  • Test images don’t contain drones
  • Wrong model checkpoint (verify you’re using best.pt, not untrained weights)
Problem: CUDA out of memory during inference
Solutions:
  • Reduce batch from 16 to 8 or 4
  • Disable half=True (slower but uses less memory)
  • Process images one at a time with batch=1
Problem: Inference takes several seconds per image
Solution: Ensure you have a CUDA-capable GPU and PyTorch with CUDA support installed. Check with:
import torch
print(torch.cuda.is_available())  # Should print True

Next Steps

  • Integrate detections with your THz beam steering controller
  • Experiment with different confidence thresholds
  • Run detection on live video streams using Ultralytics’ video inference mode
