BeamFinder fine-tunes YOLO26s on the DeepSense Scenario 23 drone dataset to detect drones in THz beam steering applications. This guide covers the complete training workflow from setup to evaluation.

Prerequisites

1. Install Dependencies

Install the required Python packages:
pip install -r requirements.txt
Requirements:
  • Python 3.10+
  • ultralytics >= 8.4.0
  • matplotlib >= 3.7.0
  • PyTorch (pre-installed on Lightning.ai A100)
2. Prepare the Dataset

Ensure your dataset follows the YOLO directory structure described in the Dataset Setup guide. The training script expects data.yaml to be present in the project root.
3. Download Pretrained Weights

The script automatically downloads YOLO26s pretrained weights (yolo26s.pt) from Ultralytics on first run. No manual download required.

Training Script Overview

The train.py script fine-tunes YOLO26s on the drone dataset with hyperparameters optimized for A100 GPUs.

Basic Usage

python train.py
Training runs for 100 epochs with automatic validation and test evaluation at the end.

GPU Optimizations

The script includes several A100-specific optimizations for maximum throughput:
import torch
from ultralytics import YOLO

# A100: maximize GPU throughput
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True
torch.backends.cudnn.benchmark = True  # auto-tune convolutions for fixed imgsz=960

model = YOLO("yolo26s.pt")
  • allow_tf32: Enables TensorFloat-32 on Ampere GPUs (A100) for faster matrix operations with minimal accuracy loss
  • cudnn.benchmark: Auto-tunes cuDNN convolution algorithms for your specific input size (960×540). Since image size is fixed, this provides consistent speedup
  • compile=True: Uses torch.compile() for 10-30% faster training on A100 + PyTorch 2.x

Training Configuration

Core Hyperparameters

The training configuration is optimized for 11,387 annotated drone images:
model.train(
    data="data.yaml",
    epochs=100,
    imgsz=960,
    batch=0.90,
    patience=20,
    cache="ram",
    workers=8,
    cos_lr=True,
    deterministic=False,
    compile=True,
    project="runs",
    name="drone_detect",
    exist_ok=True,
    rect=True,
    save_period=10,
    # Data augmentation
    degrees=15.0,
    flipud=0.5,
    scale=0.9,
    translate=0.2,
)

Parameter Reference

| Parameter | Value | Description |
| --- | --- | --- |
| data | "data.yaml" | Dataset configuration file |
| epochs | 100 | Number of training epochs |
| imgsz | 960 | Input image size (long side). Frames train at 960×540, preserving the 16:9 aspect ratio |
| batch | 0.90 | Auto-calculate a batch size that uses 90% of available GPU memory |
| patience | 20 | Early stopping: halt if validation does not improve for 20 epochs |
| cache | "ram" | Cache the dataset in RAM for faster training (requires ~4GB system memory) |
| workers | 8 | Number of dataloader workers (set to 0 on Windows due to multiprocessing issues) |
| cos_lr | True | Use a cosine learning-rate schedule |
| deterministic | False | Allow non-deterministic operations for speed |
| compile | True | Enable torch.compile() for A100 acceleration |
| rect | True | Rectangular training: preserves the 16:9 aspect ratio and avoids 44% padding waste |
| save_period | 10 | Save a checkpoint every 10 epochs |
Memory Requirements: cache="ram" requires about 4GB of system memory for the 650MB dataset. If your machine has less than 16GB RAM, change to cache="disk" in train.py:22.
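The RAM-vs-disk decision above can also be automated. A minimal sketch (Linux-only, reading /proc/meminfo; the function name and the 4GB threshold are illustrative, not part of train.py):

```python
def choose_cache_mode(meminfo_path="/proc/meminfo", threshold_gb=4.0):
    """Return "ram" if enough memory is free to cache the dataset, else "disk"."""
    available_kb = 0
    with open(meminfo_path) as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                available_kb = int(line.split()[1])  # /proc/meminfo reports kB
                break
    available_gb = available_kb / (1024 ** 2)
    return "ram" if available_gb >= threshold_gb else "disk"
```

The returned string can be passed straight to the cache argument of model.train().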

Data Augmentation

The training applies augmentations to improve generalization:
| Augmentation | Value | Effect |
| --- | --- | --- |
| degrees | 15.0 | Random rotation ±15 degrees |
| flipud | 0.5 | Vertical flip (50% probability) |
| scale | 0.9 | Random scale gain ±90% (zoom factors roughly 0.1–1.9×) |
| translate | 0.2 | Random translation ±20% of image size |
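Ultralytics treats degrees, scale, and translate as ± gains around the identity transform. A quick sketch of the ranges each value samples from (helper names are illustrative):

```python
def scale_bounds(gain):
    # scale=g samples a zoom factor uniformly from [1 - g, 1 + g]
    return (1.0 - gain, 1.0 + gain)

def rotation_bounds(degrees):
    # degrees=d samples a rotation uniformly from [-d, +d] degrees
    return (-degrees, degrees)

def translate_bounds(frac):
    # translate=t shifts the image by up to ±t of its size on each axis
    return (-frac, frac)
```

For example, scale_bounds(0.9) gives zoom factors between roughly 0.1× and 1.9×.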

Training Output

Results are saved to runs/drone_detect/ with the following structure:
runs/drone_detect/
├── weights/
│   ├── best.pt          # Best checkpoint (highest mAP@50-95)
│   ├── last.pt          # Latest checkpoint
│   ├── epoch10.pt       # Checkpoint at epoch 10
│   ├── epoch20.pt       # Checkpoint at epoch 20
│   └── ...
├── results.csv          # Training metrics per epoch
├── results.png          # Loss and mAP curves
├── confusion_matrix.png # Validation confusion matrix
└── val_batch*_pred.jpg  # Validation predictions
The best.pt checkpoint is automatically selected based on validation mAP@50-95. Use this for inference.
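To see which epoch produced best.pt, you can scan results.csv for the highest validation mAP@50-95. A hedged sketch (the exact column header, assumed here to be metrics/mAP50-95(B), can vary across Ultralytics versions, and some versions pad columns with whitespace):

```python
import csv

def best_epoch(csv_path, metric="metrics/mAP50-95(B)"):
    """Return (epoch, value) for the row with the highest validation mAP50-95."""
    with open(csv_path, newline="") as f:
        # strip padding from headers and values (older Ultralytics aligns columns)
        rows = [{k.strip(): v.strip() for k, v in row.items()}
                for row in csv.DictReader(f)]
    best = max(rows, key=lambda r: float(r[metric]))
    return int(float(best["epoch"])), float(best[metric])
```

Usage: best_epoch("runs/drone_detect/results.csv") returns something like (87, 0.6891).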

Evaluation

After training completes, the script automatically runs validation and test evaluation:
# Validation set evaluation
metrics = model.val(imgsz=960, half=True)
print(f"Val  — mAP50: {metrics.box.map50:.4f}  mAP50-95: {metrics.box.map:.4f}")

# Test set evaluation
test_metrics = model.val(split="test", imgsz=960, half=True)
print(f"Test — mAP50: {test_metrics.box.map50:.4f}  mAP50-95: {test_metrics.box.map:.4f}")
Example output:
Val  — mAP50: 0.9234  mAP50-95: 0.6891
Test — mAP50: 0.9187  mAP50-95: 0.6823

Metrics Explained

  • mAP@50: Mean Average Precision at IoU threshold 0.5. A detection counts as correct if the bounding box overlaps the ground truth by at least 50%
  • mAP@50-95: Average of mAP across IoU thresholds from 0.5 to 0.95 in steps of 0.05. This is stricter and penalizes loose bounding boxes
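The IoU test behind both metrics is simple to state in code. A minimal sketch for axis-aligned boxes given as (x1, y1, x2, y2); the function name is illustrative:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# mAP@50-95 averages AP over these ten IoU thresholds
THRESHOLDS = [0.5 + 0.05 * i for i in range(10)]
```

A detection counts toward mAP@50 when iou(pred, truth) >= 0.5; mAP@50-95 repeats that check at each of the ten thresholds and averages.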

Advanced: Multi-Model Comparison Study

The study.py script trains all five YOLO26 variants (nano, small, medium, large, xlarge) and compares their performance:
python study.py
This script:
  • Trains all 5 models for 100 epochs each
  • Measures training time, accuracy, inference speed, and peak GPU memory
  • Generates comparison charts
  • Supports crash recovery (skips already-completed models on restart)
MODELS = ["yolo26n.pt", "yolo26s.pt", "yolo26m.pt", "yolo26l.pt", "yolo26x.pt"]

TRAIN_ARGS = dict(
    data="data.yaml", epochs=100, imgsz=960, batch=0.90,
    patience=20, cache="ram", workers=8, cos_lr=True,
    deterministic=False, rect=True, save_period=10,
    compile=True,
    degrees=15.0, flipud=0.5, scale=0.9, translate=0.2,
)
Results are saved to:
  • runs/study/results_summary.json - JSON with all metrics
  • runs/study/comparison_charts.png - Bar charts comparing models
  • runs/study/efficiency_plots.png - Scatter plots (accuracy vs size/speed/memory)
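The crash-recovery behavior could be implemented by checking which models already have an entry in the summary file before training. A hedged sketch, assuming results_summary.json maps model names to their metrics (an assumption about the file's layout, not a documented format):

```python
import json
import os

def models_to_run(all_models, summary_path="runs/study/results_summary.json"):
    """Skip models that already have an entry in the summary file."""
    done = set()
    if os.path.exists(summary_path):
        with open(summary_path) as f:
            done = set(json.load(f))  # keys are model names, by assumption
    return [m for m in all_models if m not in done]
```

On a fresh run the summary file is absent and every model trains; after a crash, only the unfinished models remain in the list.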

Troubleshooting

Problem: RuntimeError when workers > 0 on Windows
Solution: Set workers=0 in train.py:23. The cache="ram" setting compensates for single-threaded data loading. See the Known Issues page for more details.
Problem: CUDA out of memory error
Solutions:
  • Reduce batch from 0.90 to 0.70 or lower
  • Reduce imgsz from 960 to 640
  • Disable cache="ram" (slower but uses no system memory)
  • Use a smaller model variant (e.g., yolo26n.pt instead of yolo26s.pt)
Problem: System freezes or swapping during training
Solution: Change cache="ram" to cache="disk" in train.py:22. This uses disk I/O instead of caching the 650MB dataset in memory.
Expected Behavior: YOLO26s is pretrained on COCO (80 classes: person, car, bird, etc.) without a drone class. This is why fine-tuning is required. See the Known Issues page for more details.

Next Steps

After training completes:
  1. Run inference on test images using the Detection Guide
  2. Inspect checkpoints in runs/drone_detect/weights/
  3. Analyze results using the charts in runs/drone_detect/
