This guide covers the full BeamFinder installation process, including environment setup, dataset preparation, and model training.

System Requirements

Hardware

Recommended

  • GPU: NVIDIA A100 (40GB+ VRAM)
  • RAM: 16GB+ (for cache="ram")
  • Storage: 2GB for dataset + models
  • OS: Linux (Ubuntu 22.04)

Minimum

  • GPU: NVIDIA RTX 3050 (4GB VRAM)
  • RAM: 8GB (use cache="disk")
  • Storage: 2GB
  • OS: Windows/Linux/macOS

Software

  • Python: 3.10 or higher
  • CUDA: 11.8+ (for GPU acceleration)
  • PyTorch: 2.x (for compile=True support)
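
Before installing anything, you can sanity-check the environment with a small stdlib script. This is a convenience sketch, not part of BeamFinder; the package list and check_env helper are ours, so adjust them as needed:

```python
import importlib.util
import sys

def check_env(packages=("torch", "ultralytics", "matplotlib"), min_python=(3, 10)):
    """Return (python_ok, {package: installed}) without importing heavy packages."""
    python_ok = sys.version_info >= min_python
    installed = {p: importlib.util.find_spec(p) is not None for p in packages}
    return python_ok, installed

python_ok, installed = check_env()
print("Python >= 3.10:", python_ok)
for pkg, present in installed.items():
    print(f"{pkg}: {'installed' if present else 'MISSING'}")
```

CUDA availability is checked separately (e.g. `torch.cuda.is_available()`) once PyTorch is installed.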

Installation

1. Clone or Download BeamFinder

Get the source code:
git clone https://github.com/yourusername/beamfinder.git
cd beamfinder

2. Install Python Dependencies

Install required packages:
pip install -r requirements.txt
The requirements.txt contains:
requirements.txt
ultralytics>=8.4.0
matplotlib>=3.7.0
On Lightning.ai A100 instances, CUDA and PyTorch come pre-installed. On other systems, install PyTorch first:
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118

3. Download Pretrained Weights

Download the YOLO26s pretrained model:
wget https://github.com/ultralytics/assets/releases/download/v8.4.0/yolo26s.pt
Place yolo26s.pt in the project root directory.
The COCO-pretrained model does not include a “drone” class. You must fine-tune on drone data before running detection.

Dataset Preparation

BeamFinder expects the DeepSense Scenario 23 dataset (or any dataset in YOLO format).

1. Download Dataset

Obtain the DeepSense Scenario 23 drone dataset, which includes:
  • 11,387 images (960×540 resolution)
  • 11,387 YOLO-format annotation files (.txt)
The dataset comes from 51 different capture sessions with varying conditions.

2. Organize Directory Structure

Create the following directory layout:
mkdir -p data/images/train
mkdir -p data/images/validation
mkdir -p data/images/test
mkdir -p data/labels/train
mkdir -p data/labels/validation
mkdir -p data/labels/test
Expected structure:
BeamFinder/
├── data/
│   ├── images/
│   │   ├── train/         # 7,970 images (70%)
│   │   ├── validation/    # 1,708 images (15%)
│   │   └── test/          # 1,709 images (15%)
│   └── labels/            # Matching .txt files
│       ├── train/
│       ├── validation/
│       └── test/

3. Split Dataset

Distribute images and labels into train/validation/test splits (70/15/15):
import random
import shutil
from pathlib import Path

random.seed(0)  # optional: fix the seed for a reproducible split

images = list(Path("raw_data").glob("*.jpg"))
random.shuffle(images)

n = len(images)
splits = {
    "train": images[: int(0.70 * n)],
    "validation": images[int(0.70 * n) : int(0.85 * n)],
    "test": images[int(0.85 * n) :],
}

for split, imgs in splits.items():
    for img in imgs:
        shutil.copy(img, f"data/images/{split}/")
        shutil.copy(img.with_suffix(".txt"), f"data/labels/{split}/")
Shuffling before splitting ensures each set has a diverse mix of the 51 capture sessions.
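
After splitting, it is worth verifying that every image has a matching label file. A minimal stdlib check (the check_pairs helper is ours; the data/ layout is the one created above):

```python
from pathlib import Path

def check_pairs(root="data", splits=("train", "validation", "test")):
    """For each split, count images and list image stems that lack a .txt label."""
    report = {}
    for split in splits:
        imgs = {p.stem for p in Path(root, "images", split).glob("*.jpg")}
        lbls = {p.stem for p in Path(root, "labels", split).glob("*.txt")}
        report[split] = (len(imgs), sorted(imgs - lbls))
    return report

for split, (n, missing) in check_pairs().items():
    print(f"{split}: {n} images, {len(missing)} missing labels")
```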

4. Create data.yaml

Create a data.yaml configuration file in the project root:
data.yaml
path: data
train: images/train
val: images/validation
test: images/test

nc: 1
names: ["drone"]
This tells YOLO where to find images and defines the single class “drone”.

YOLO Annotation Format

Each .txt file contains one line per bounding box:
<class_id> <x_center> <y_center> <width> <height>
  • class_id: 0 (for “drone”)
  • x_center, y_center: Normalized center coordinates (0-1)
  • width, height: Normalized box dimensions (0-1)
Example (image_001.txt):
0 0.5104 0.4861 0.1354 0.1574
0 0.7234 0.3125 0.0987 0.1203
This represents 2 drones in image_001.jpg.
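
To see what the normalized values mean in pixels, here is a small illustrative converter (the yolo_to_pixels helper is ours; 960×540 matches the dataset resolution):

```python
def yolo_to_pixels(line, img_w=960, img_h=540):
    """Convert one YOLO label line to (class_id, (x1, y1, x2, y2)) in pixels."""
    cls, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    return int(cls), (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)

cls_id, box = yolo_to_pixels("0 0.5104 0.4861 0.1354 0.1574")
print(cls_id, [round(v) for v in box])  # 0 [425, 220, 555, 305]
```

So the first line of the example above is a roughly 130×85 px box centered slightly left of the image midpoint.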

Training

1. Configure Training Script

The train.py script contains optimized hyperparameters for A100 GPUs:
train.py
from pathlib import Path
import torch
from ultralytics import YOLO

SCRIPT_DIR = Path(__file__).resolve().parent

if __name__ == "__main__":
    # A100 optimizations
    torch.backends.cuda.matmul.allow_tf32 = True
    torch.backends.cudnn.allow_tf32 = True
    torch.backends.cudnn.benchmark = True

    model = YOLO("yolo26s.pt")

    model.train(
        data="data.yaml",
        epochs=100,
        imgsz=960,
        batch=0.90,              # 90% GPU memory utilization
        patience=20,             # Early stopping patience
        cache="ram",             # Cache dataset in RAM
        workers=8,               # Data loading workers
        cos_lr=True,             # Cosine learning rate scheduler
        deterministic=False,
        compile=True,            # torch.compile (10-30% speedup)
        project=str(SCRIPT_DIR / "runs"),
        name="drone_detect",
        exist_ok=True,
        rect=True,               # Rectangular training (16:9)
        save_period=10,          # Save checkpoint every 10 epochs
        # Data augmentation
        degrees=15.0,            # Random rotation ±15°
        flipud=0.5,              # Vertical flip 50% probability
        scale=0.9,               # Random scale (±90% gain)
        translate=0.2,           # Translate ±20%
    )

    # Validation metrics
    metrics = model.val(imgsz=960, half=True)
    print(f"Val  — mAP50: {metrics.box.map50:.4f}  mAP50-95: {metrics.box.map:.4f}")

    # Test metrics
    test_metrics = model.val(split="test", imgsz=960, half=True)
    print(f"Test — mAP50: {test_metrics.box.map50:.4f}  mAP50-95: {test_metrics.box.map:.4f}")

2. Start Training

Run the training script:
python train.py
Training will:
  1. Load YOLO26s pretrained weights
  2. Fine-tune on drone dataset for up to 100 epochs
  3. Save checkpoints every 10 epochs
  4. Stop early if validation mAP doesn’t improve for 20 epochs
  5. Run final validation on val and test sets

3. Monitor Progress

Ultralytics automatically logs training metrics. Watch for:
Epoch   GPU_mem   box_loss   cls_loss   dfl_loss   mAP50   mAP50-95
1/100    12.4G     1.234      0.456      1.123     0.612    0.423
...
45/100   12.4G     0.234      0.056      0.323     0.945    0.782
Training results are saved to runs/drone_detect/:
runs/drone_detect/
├── weights/
│   ├── best.pt       # Best model (highest mAP)
│   └── last.pt       # Last epoch
├── results.png       # Training curves
├── confusion_matrix.png
└── args.yaml         # Training arguments
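
Ultralytics also writes a results.csv of per-epoch metrics in the run directory. A small helper to find the best epoch after the fact (the column name metrics/mAP50(B) is an assumption based on recent Ultralytics versions; verify it against your file's header):

```python
import csv

def best_epoch(csv_path, metric="metrics/mAP50(B)"):
    """Return (epoch, value) for the row with the highest metric in results.csv."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    # Some Ultralytics versions pad cells with spaces, hence the strip().
    best = max(rows, key=lambda r: float(r[metric].strip()))
    return int(best["epoch"].strip()), float(best[metric].strip())
```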

Hyperparameter Explanation

The dataset images are 960×540. Using imgsz=960 with rect=True preserves the aspect ratio and avoids excessive padding. With square inputs (rect=False), ~44% of pixels would be wasted on black padding.
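
The padding figure is simple to derive: letterboxing a 960×540 frame into a 960×960 square leaves a 960×420 black border.

```python
# Fraction of a square 960x960 input that would be padding for a 960x540 image
img_w, img_h = 960, 540
square = max(img_w, img_h)
padding_fraction = 1 - (img_w * img_h) / (square * square)
print(f"{padding_fraction:.1%}")  # 43.8%
```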
Setting batch to a fraction (0.90 = 90%) tells Ultralytics to automatically select the largest batch size that fits in GPU memory. On A100 this is typically batch 16-32, on RTX 3050 it’s batch 2-4.
Caching the dataset in RAM eliminates disk I/O bottlenecks during training. The 11,387 images (~650MB on disk) require ~4GB RAM when cached. If you have less than 16GB system RAM, use cache="disk" instead.
compile=True enables torch.compile(), which optimizes the model graph for 10-30% faster training on PyTorch 2.x with an A100. Requires PyTorch ≥2.0.
rect=True preserves the image aspect ratio during training instead of forcing square inputs, which is critical for 16:9 images to avoid wasting compute on padding.
  • degrees=15.0: Random rotation ±15°
  • flipud=0.5: Vertical flip with 50% probability
  • scale=0.9: Random scaling (±90% gain)
  • translate=0.2: Random translation ±20%
These augmentations improve generalization to new drone positions and orientations.

Windows-Specific Configuration

On Windows, Python multiprocessing requires special handling. Change these settings in train.py:
workers=0  # Must be 0 on Windows (single-threaded data loading)
Training will be slower, but cache="ram" compensates by eliminating disk I/O.

Validation & Testing

After training completes, the script automatically runs validation:
# Validation set
metrics = model.val(imgsz=960, half=True)
print(f"Val  — mAP50: {metrics.box.map50:.4f}  mAP50-95: {metrics.box.map:.4f}")

# Test set
test_metrics = model.val(split="test", imgsz=960, half=True)
print(f"Test — mAP50: {test_metrics.box.map50:.4f}  mAP50-95: {test_metrics.box.map:.4f}")
Key metrics:
  • mAP50: Mean average precision at IoU=0.5 (standard COCO metric)
  • mAP50-95: Mean average precision averaged over IoU thresholds 0.5-0.95 (stricter)
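
For intuition, the IoU underlying both metrics is just intersection area over union area. A minimal reference implementation (illustrative, not the Ultralytics internals):

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```

A prediction counts as a true positive for mAP50 when its IoU with a ground-truth box is at least 0.5.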

Troubleshooting

CUDA out of memory

Reduce the batch size:
batch=0.70  # Use 70% of GPU memory instead of 90%
# or
batch=8     # Fixed batch size

Multiprocessing errors on Windows

Set workers=0 in train.py and ensure the training code is inside an if __name__ == "__main__": block.

Low mAP or poor detections

  • Verify annotations are correct (YOLO format, normalized coordinates)
  • Check class IDs (should be 0 for the single "drone" class)
  • Ensure train/val/test splits are representative
  • Try training for more epochs or reducing patience

Slow training

  • Enable cache="ram" (if you have enough RAM)
  • Use compile=True (requires PyTorch 2.x)
  • Increase workers (Linux/macOS only)
  • Use a GPU instead of CPU

High RAM usage

Change cache="ram" to cache="disk" in train.py. This reduces RAM usage but slows training due to disk I/O.

Next Steps

Once training completes, you’re ready to run detection:

Run Detection

Use your trained model to detect drones in test images

Configuration Reference

Explore all training and detection parameters

Output Format

Learn about the CSV output structure and coordinate system

Known Issues

Common problems and their solutions

Quick Reference

Full training workflow:
# 1. Install dependencies
pip install -r requirements.txt

# 2. Download pretrained weights
wget https://github.com/ultralytics/assets/releases/download/v8.4.0/yolo26s.pt

# 3. Organize dataset into data/images/{train,validation,test}

# 4. Create data.yaml

# 5. Train
python train.py

# Output: runs/drone_detect/weights/best.pt
Expected timeline (A100):
  • Epoch time: ~2-3 minutes
  • 100 epochs: ~3-5 hours
  • Early stopping typically triggers around epoch 40-60
