Overview
BeamFinder uses three main configuration points:

- Training configuration - parameters for model fine-tuning in train.py
- Detection configuration - parameters for inference in detect.py
- Dataset configuration - data paths and class definitions in data.yaml
Training Configuration
The training pipeline (train.py) uses Ultralytics YOLO with optimized parameters for A100 GPU training.
Model Selection
Base YOLO26 model to fine-tune. Available variants:
yolo26n.pt, yolo26s.pt, yolo26m.pt, yolo26l.pt, yolo26x.pt

Training Arguments

All training arguments are passed to model.train() in train.py:16-36:
Path to dataset configuration file. Defines train/val/test splits and class names.
Number of training epochs. The model trains for up to 100 epochs with early stopping via patience.

Input image size (square). All images are resized to 960×960 pixels during training and inference.
Batch size as fraction of GPU memory (0.90 = 90% utilization). Automatically scales based on available VRAM.
Early stopping patience. Training stops if validation mAP doesn’t improve for 20 consecutive epochs.
Dataset caching strategy. Options: "ram" (cache in memory), "disk" (cache on SSD), or False (no caching).

Number of data loader workers for parallel image preprocessing.
Use cosine learning rate scheduler instead of linear decay.
Enable deterministic mode for reproducible results. Set to false for faster training with cudnn.benchmark.

Enable torch.compile() for 10-30% faster training on A100 + PyTorch 2.x.

Rectangular training - batches images by aspect ratio to reduce padding and increase efficiency.
Save checkpoint every N epochs (in addition to best.pt and last.pt).
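Taken together, the arguments above can be sketched as a single model.train() call. This is an illustrative sketch, not the verbatim contents of train.py: argument names follow the Ultralytics API, the save_period interval is a placeholder, and the heavy training call is wrapped in an uncalled main() since it requires a GPU. Augmentation and output-path arguments are described in the sections below.

```python
# Sketch of the training entry point; values mirror the descriptions above.
TRAIN_ARGS = dict(
    data="data.yaml",    # dataset configuration (splits + class names)
    epochs=100,          # total epochs, subject to early stopping
    imgsz=960,           # square input resolution
    batch=0.90,          # fraction of GPU memory (batch size auto-scaled)
    patience=20,         # stop if val mAP stalls for 20 epochs
    cache="ram",         # "ram" | "disk" | False
    workers=8,           # data loader workers
    cos_lr=True,         # cosine learning-rate schedule
    deterministic=True,  # False trades reproducibility for speed
    compile=True,        # torch.compile() on PyTorch 2.x
    rect=True,           # rectangular batching by aspect ratio
    save_period=10,      # hypothetical checkpoint interval (every N epochs)
)

def main():
    # Requires the ultralytics package and a CUDA GPU; not executed here.
    from ultralytics import YOLO
    model = YOLO("yolo26m.pt")  # any of the n/s/m/l/x variants
    model.train(**TRAIN_ARGS)
```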
Output Paths
Root directory for training outputs. Resolved to an absolute path via SCRIPT_DIR / "runs".

Experiment name. Results are saved to {project}/{name}/.

Allow overwriting an existing experiment directory.
Data Augmentation
BeamFinder applies geometric augmentations during training:

Random rotation range in degrees (±15°).
Probability of vertical flip (50%).
Random scale range. 0.9 allows scaling from 90% to 110% of original size.
Random translation as fraction of image size (±20%).
GPU Optimizations
The training script enables TF32 and cudnn benchmark for A100 GPUs (train.py:10-12):
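A typical version of that setup looks like the following (a sketch, not the verbatim contents of train.py):

```python
import torch

# Allow TF32 tensor cores for float32 matmuls and cuDNN convolutions (Ampere+)
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True
# Let cuDNN benchmark and cache the fastest kernels for fixed input sizes
torch.backends.cudnn.benchmark = True
```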
These optimizations provide ~30% speedup on A100 but may reduce numerical precision slightly.
Validation Arguments
After training completes, BeamFinder runs validation and test evaluation (train.py:38-42):
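A minimal sketch of that evaluation step, assuming the Ultralytics model.val() API and a hypothetical checkpoint path:

```python
# Sketch of post-training evaluation; argument names follow model.val().
VAL_ARGS = dict(
    data="data.yaml",
    imgsz=960,    # must match the training resolution
    half=True,    # FP16 validation
    split="val",  # evaluate the validation split; use "test" for the test set
)

def main():
    # Requires the ultralytics package; not executed here.
    from ultralytics import YOLO
    model = YOLO("runs/exp/weights/best.pt")  # hypothetical path to best weights
    metrics = model.val(**VAL_ARGS)
```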
Validation image size (must match training).
Use FP16 precision for faster validation.
Dataset split to evaluate. Options: "val" or "test".

Detection Configuration
The detection script (detect.py) loads a trained model and runs inference on test images.
Model & Paths
Configuration variables are defined at detect.py:8-14:
Path to trained model checkpoint. Points to best weights from training.
Directory containing images to process.
Directory for detection results (CSV + annotated images).
Confidence threshold for detections. Only boxes with confidence ≥ 0.4 are saved.
Inference image size (must match training).
Prediction Arguments
Passed to model.predict() at detect.py:34-38:
Path to image directory or single image file.
Minimum confidence threshold (0-1). Lower values increase recall but add false positives.
Input image size for inference.
Save annotated images with bounding boxes drawn.
Root directory for saving annotated images.
Subdirectory name for annotated images.
Overwrite existing output directory.
Use FP16 precision for faster inference.
Batch size for inference. Process 16 images at a time.
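The prediction arguments above can be sketched as one model.predict() call. Argument names follow the Ultralytics API; the source directory, output name, and checkpoint path are placeholders, not the actual values in detect.py:

```python
# Sketch of the inference call; paths and the run name are placeholders.
PREDICT_ARGS = dict(
    source="test_images/",  # directory of images (or a single file)
    conf=0.4,               # minimum confidence for a detection
    imgsz=960,              # must match the training resolution
    save=True,              # write annotated images
    project="output",       # root directory for annotated images
    name="annotated",       # hypothetical subdirectory name
    exist_ok=True,          # overwrite a previous run
    half=True,              # FP16 inference
    batch=16,               # images per inference batch
)

def main():
    # Requires the ultralytics package; not executed here.
    from ultralytics import YOLO
    model = YOLO("runs/exp/weights/best.pt")  # hypothetical best checkpoint
    results = model.predict(**PREDICT_ARGS)
```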
CSV Output
Detection results are written to output/detections.csv with 7 columns (see Output Format for details).
Dataset Configuration
The data.yaml file defines dataset paths and class labels for YOLO training.
Structure
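A representative data.yaml, with a placeholder dataset root (the real path will differ):

```yaml
path: /path/to/dataset   # root directory; all paths below are relative to it
train: images/train      # 7,970 images
val: images/validation   # 1,708 images
test: images/test        # 1,709 images
nc: 1                    # number of classes
names:
  - drone
```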
Fields
Root directory for dataset. All other paths are relative to this.
Training images directory (7,970 images).
Validation images directory (1,708 images).
Test images directory (1,709 images).
Number of classes. BeamFinder detects a single class (drone).
Class names list. Must have a length equal to nc.

Each image in images/train, images/validation, and images/test must have a corresponding label file in labels/train, labels/validation, and labels/test with the same filename but a .txt extension.

Label Format

YOLO format: one bounding box per line:
- class_id: 0 (drone is the only class)
- x_center, y_center: center of the bounding box (normalized to 0-1)
- width, height: box dimensions (normalized to 0-1)
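For illustration, a label file containing a single drone might hold one line like this (the numbers are made up):

```
0 0.512 0.430 0.118 0.086
```

The first field is class_id; the remaining four are x_center, y_center, width, and height as fractions of the image dimensions.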
Environment Setup
PyTorch Backend Configuration
Both train.py and detect.py configure PyTorch for optimal A100 performance:
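One way to express that shared setup is a small helper, shown here as a sketch (the helper name is hypothetical; the two scripts may simply repeat these lines inline):

```python
import torch

def configure_backend() -> None:
    """Hypothetical helper mirroring the backend setup in train.py and detect.py."""
    # TF32 tensor cores for float32 matmuls and cuDNN convolutions (Ampere+)
    torch.backends.cuda.matmul.allow_tf32 = True
    torch.backends.cudnn.allow_tf32 = True
    # Autotune convolution kernels for the fixed 960x960 input size
    torch.backends.cudnn.benchmark = True

configure_backend()
```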
Hardware Requirements
- GPU: NVIDIA A100 40GB recommended (training uses ~36GB VRAM with batch=0.90)
- CPU: 8+ cores for workers=8 data loading
- RAM: 32GB+ for cache="ram" (caches the full dataset in memory)
- Storage: 10GB+ for dataset + model checkpoints
Python Dependencies
From requirements.txt: