
Overview

BeamFinder uses three main configuration points:
  1. Training configuration - parameters for model fine-tuning in train.py
  2. Detection configuration - parameters for inference in detect.py
  3. Dataset configuration - data paths and class definitions in data.yaml

Training Configuration

The training pipeline (train.py) uses Ultralytics YOLO with optimized parameters for A100 GPU training.

Model Selection

model (string, default: "yolo26s.pt")
  Base YOLO26 model to fine-tune. Available variants: yolo26n.pt, yolo26s.pt, yolo26m.pt, yolo26l.pt, yolo26x.pt.

Training Arguments

All training arguments are passed to model.train() in train.py:16-36:
data (string, default: "data.yaml")
  Path to the dataset configuration file. Defines train/val/test splits and class names.
epochs (integer, default: 100)
  Maximum number of training epochs. Early stopping via patience may end training sooner.
imgsz (integer, default: 960)
  Input image size (square). All images are resized to 960×960 pixels during training and inference.
batch (float, default: 0.90)
  Batch size as a fraction of GPU memory (0.90 = 90% utilization). Automatically scales based on available VRAM.
patience (integer, default: 20)
  Early stopping patience. Training stops if validation mAP doesn't improve for 20 consecutive epochs.
cache (string, default: "ram")
  Dataset caching strategy. Options: "ram" (cache in memory), "disk" (cache on SSD), or False (no caching).
workers (integer, default: 8)
  Number of data loader workers for parallel image preprocessing.
cos_lr (boolean, default: true)
  Use a cosine learning rate scheduler instead of linear decay.
deterministic (boolean, default: false)
  Enable deterministic mode for reproducible results. Set to false for faster training with cudnn.benchmark.
compile (boolean, default: true)
  Enable torch.compile() for 10-30% faster training on A100 + PyTorch 2.x.
rect (boolean, default: true)
  Rectangular training: batches images by aspect ratio to reduce padding and increase efficiency.
save_period (integer, default: 10)
  Save a checkpoint every N epochs (in addition to best.pt and last.pt).

Output Paths

project (string, default: "runs")
  Root directory for training outputs. Resolved to an absolute path via SCRIPT_DIR / "runs".
name (string, default: "drone_detect")
  Experiment name. Results are saved to {project}/{name}/.
exist_ok (boolean, default: true)
  Allow overwriting an existing experiment directory.
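Taken together, the training and output arguments above can be sketched as a single model.train() call. This is an illustrative reconstruction, not the literal contents of train.py; the variable name train_args is an assumption:

```python
# Sketch of the training invocation, assuming the argument values documented
# above; the actual call lives in train.py:16-36 and may differ in detail.
train_args = dict(
    data="data.yaml",      # dataset configuration (paths + class names)
    epochs=100,            # maximum epochs; patience may stop training earlier
    imgsz=960,             # square input resolution
    batch=0.90,            # fraction of GPU memory to target for auto-batch
    patience=20,           # early-stopping window in epochs
    cache="ram",           # cache decoded images in memory
    workers=8,             # data-loader worker processes
    cos_lr=True,           # cosine learning-rate schedule
    deterministic=False,   # allow cudnn.benchmark for speed
    compile=True,          # torch.compile() on PyTorch 2.x
    rect=True,             # rectangular batching by aspect ratio
    save_period=10,        # periodic checkpoints alongside best.pt / last.pt
    project="runs",        # output root
    name="drone_detect",   # experiment name -> runs/drone_detect/
    exist_ok=True,         # reuse an existing experiment directory
)

# from ultralytics import YOLO
# model = YOLO("yolo26s.pt")
# results = model.train(**train_args)
```

The Ultralytics calls are commented out so the sketch stands alone; uncomment them in an environment with ultralytics installed.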

Data Augmentation

BeamFinder applies geometric augmentations during training:
degrees (float, default: 15.0)
  Random rotation range in degrees (±15°).
flipud (float, default: 0.5)
  Probability of vertical flip (50%).
scale (float, default: 0.9)
  Random scale gain. Ultralytics samples a resize factor in [1 - scale, 1 + scale], so 0.9 allows scaling from 10% to 190% of original size.
translate (float, default: 0.2)
  Random translation as a fraction of image size (±20%).
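As a quick sanity check on the scale semantics (assuming Ultralytics' gain behavior, where a gain g yields a resize factor in [1 - g, 1 + g]), the augmentation kwargs and their effective scale range can be written out:

```python
# Augmentation kwargs as documented above; `scale` is a gain, so the
# sampled resize factor falls in [1 - scale, 1 + scale].
aug_args = dict(degrees=15.0, flipud=0.5, scale=0.9, translate=0.2)

scale_lo = 1.0 - aug_args["scale"]  # smallest resize factor
scale_hi = 1.0 + aug_args["scale"]  # largest resize factor
print(f"scale factor range: {scale_lo:.1f}x to {scale_hi:.1f}x")
```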

GPU Optimizations

The training script enables TF32 and cudnn benchmark for A100 GPUs (train.py:10-12):
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True
torch.backends.cudnn.benchmark = True
These optimizations provide ~30% speedup on A100 but may reduce numerical precision slightly.

Validation Arguments

After training completes, BeamFinder runs validation and test evaluation (train.py:38-42):
imgsz (integer, default: 960)
  Validation image size (must match training).
half (boolean, default: true)
  Use FP16 precision for faster validation.
split (string, default: "val")
  Dataset split to evaluate. Options: "val" or "test".
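The post-training evaluation can be sketched as two model.val() calls sharing the same arguments. The variable names are illustrative; the real calls are at train.py:38-42:

```python
# Shared validation arguments, as documented above.
val_args = dict(imgsz=960, half=True)

# from ultralytics import YOLO
# model = YOLO("runs/drone_detect/weights/best.pt")
# val_metrics = model.val(split="val", **val_args)
# test_metrics = model.val(split="test", **val_args)
```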

Detection Configuration

The detection script (detect.py) loads a trained model and runs inference on test images.

Model & Paths

Configuration variables defined at detect.py:8-14:
MODEL (string, default: "runs/drone_detect/weights/best.pt")
  Path to the trained model checkpoint. Points to the best weights from training.
IMAGE_DIR (Path, default: "data/images/test")
  Directory containing images to process.
OUTPUT_DIR (Path, default: "output")
  Directory for detection results (CSV + annotated images).
CONF (float, default: 0.4)
  Confidence threshold for detections. Only boxes with confidence ≥ 0.4 are saved.
IMGSZ (integer, default: 960)
  Inference image size (must match training).

Prediction Arguments

Passed to model.predict() at detect.py:34-38:
source (string, required)
  Path to an image directory or a single image file.
conf (float, default: 0.4)
  Minimum confidence threshold (0-1). Lower values increase recall but add false positives.
imgsz (integer, default: 960)
  Input image size for inference.
save (boolean, default: true)
  Save annotated images with bounding boxes drawn.
project (string, default: "output")
  Root directory for saving annotated images.
name (string, default: "annotated")
  Subdirectory name for annotated images.
exist_ok (boolean, default: true)
  Overwrite an existing output directory.
half (boolean, default: true)
  Use FP16 precision for faster inference.
batch (integer, default: 16)
  Batch size for inference. Processes 16 images at a time.
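The prediction arguments above can be sketched as a single model.predict() call. This is an illustrative reconstruction of detect.py:34-38, not its literal contents:

```python
from pathlib import Path

# Sketch of the inference call with the documented arguments; variable
# names here are illustrative.
CONF = 0.4
predict_args = dict(
    source=str(Path("data") / "images" / "test"),  # directory of test images
    conf=CONF,         # drop boxes below 40% confidence
    imgsz=960,         # must match training resolution
    save=True,         # write annotated images
    project="output",  # output root
    name="annotated",  # output/annotated/
    exist_ok=True,     # reuse an existing output directory
    half=True,         # FP16 inference
    batch=16,          # images per forward pass
)

# from ultralytics import YOLO
# model = YOLO("runs/drone_detect/weights/best.pt")
# results = model.predict(**predict_args)
```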

CSV Output

Detection results are written to output/detections.csv with 7 columns (see Output Format for details).

Dataset Configuration

The data.yaml file defines dataset paths and class labels for YOLO training.

Structure

path: data
train: images/train
val: images/validation
test: images/test

nc: 1
names: ["drone"]

Fields

path (string, default: "data")
  Root directory for the dataset. All other paths are relative to this.
train (string, default: "images/train")
  Training images directory (7,970 images).
val (string, default: "images/validation")
  Validation images directory (1,708 images).
test (string, default: "images/test")
  Test images directory (1,709 images).
nc (integer, default: 1)
  Number of classes. BeamFinder detects a single class (drone).
names (array, default: ["drone"])
  Class names list. Must have length equal to nc.
Each image in images/train, images/validation, and images/test must have a corresponding label file in labels/train, labels/validation, and labels/test with the same filename but .txt extension.
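The image-to-label pairing rule can be expressed as a small path transformation. A minimal sketch, assuming the directory layout above (the example filename frame_0001.jpg is hypothetical):

```python
from pathlib import Path

def label_path_for(image_path: str) -> Path:
    """Map data/images/<split>/<name>.<ext> to data/labels/<split>/<name>.txt."""
    parts = list(Path(image_path).parts)
    # Swap the "images" directory component for "labels", keep the split.
    parts[parts.index("images")] = "labels"
    return Path(*parts).with_suffix(".txt")

label = label_path_for("data/images/train/frame_0001.jpg")
```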

Label Format

YOLO format: one bounding box per line
<class_id> <x_center> <y_center> <width> <height>
All coordinates are normalized to [0, 1] relative to image dimensions. Example:
0 0.5234 0.6891 0.1245 0.0934
  • class_id: 0 (drone is the only class)
  • x_center, y_center: center of bounding box
  • width, height: box dimensions
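Converting a normalized label line back to pixel coordinates follows directly from the definitions above. A minimal sketch, assuming a 960×960 image (the helper name yolo_to_pixel_box is illustrative):

```python
def yolo_to_pixel_box(line: str, img_w: int, img_h: int):
    """Convert one normalized YOLO label line to (class_id, x1, y1, x2, y2) in pixels."""
    class_id, xc, yc, w, h = line.split()
    xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
    # Centers and sizes are fractions of image dimensions; corners follow.
    x1 = (xc - w / 2) * img_w
    y1 = (yc - h / 2) * img_h
    x2 = (xc + w / 2) * img_w
    y2 = (yc + h / 2) * img_h
    return int(class_id), x1, y1, x2, y2

# The example label from above, on a hypothetical 960x960 image:
box = yolo_to_pixel_box("0 0.5234 0.6891 0.1245 0.0934", 960, 960)
```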

Environment Setup

PyTorch Backend Configuration

Both train.py and detect.py configure PyTorch for optimal A100 performance:
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True
torch.backends.cudnn.benchmark = True
cudnn.benchmark=True is optimized for fixed input sizes. If you change imgsz, restart training to re-tune convolution algorithms.

Hardware Requirements

  • GPU: NVIDIA A100 40GB recommended (training uses ~36GB VRAM with batch=0.90)
  • CPU: 8+ cores for workers=8 data loading
  • RAM: 32GB+ for cache="ram" (caches full dataset in memory)
  • Storage: 10GB+ for dataset + model checkpoints

Python Dependencies

From requirements.txt:
ultralytics
The Ultralytics package includes PyTorch, CUDA libraries, and all YOLO dependencies.
