System Requirements
Hardware
Recommended
- GPU: NVIDIA A100 (40GB+ VRAM)
- RAM: 16GB+ (for cache="ram")
- Storage: 2GB for dataset + models
- OS: Linux (Ubuntu 22.04)
Minimum
- GPU: NVIDIA RTX 3050 (4GB VRAM)
- RAM: 8GB (use cache="disk")
- Storage: 2GB
- OS: Windows/Linux/macOS
Software
- Python: 3.10 or higher
- CUDA: 11.8+ (for GPU acceleration)
- PyTorch: 2.x (for compile=True support)
Installation
Install Python Dependencies
Install the required packages listed in requirements.txt.
On Lightning.ai A100 instances, CUDA and PyTorch come pre-installed. On other systems, install PyTorch first:
Dataset Preparation
BeamFinder expects the DeepSense Scenario 23 dataset (or any dataset in YOLO format).
Download Dataset
Obtain the DeepSense Scenario 23 drone dataset, which includes:
- 11,387 images (960×540 resolution)
- 11,387 YOLO-format annotation files (.txt)
Split Dataset
Distribute images and labels into train/validation/test splits (70/15/15):
Shuffling before splitting ensures each set has a diverse mix of the 51 capture sessions.
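The split step can be sketched as below; the directory layout (images/ and labels/ subfolders) and file naming are assumptions, since the guide does not fix a structure:

```python
import random
import shutil
from pathlib import Path

def split_dataset(src_images, src_labels, dest, seed=0):
    """Shuffle image/label pairs and copy them into 70/15/15 splits."""
    images = sorted(Path(src_images).glob("*.jpg"))
    random.Random(seed).shuffle(images)  # mix the 51 capture sessions

    n = len(images)
    n_train, n_val = int(n * 0.70), int(n * 0.15)
    splits = {
        "train": images[:n_train],
        "val": images[n_train:n_train + n_val],
        "test": images[n_train + n_val:],
    }
    for name, files in splits.items():
        for sub in ("images", "labels"):
            (Path(dest) / sub / name).mkdir(parents=True, exist_ok=True)
        for img in files:
            # Each image keeps its same-stem YOLO annotation file.
            shutil.copy(img, Path(dest) / "images" / name / img.name)
            label = Path(src_labels) / (img.stem + ".txt")
            shutil.copy(label, Path(dest) / "labels" / name / label.name)
    return {k: len(v) for k, v in splits.items()}
```

A fixed seed makes the split reproducible across machines.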
YOLO Annotation Format
Each .txt file contains one line per bounding box, in the form class_id x_center y_center width height:
- class_id: 0 (for “drone”)
- x_center, y_center: Normalized center coordinates (0-1)
- width, height: Normalized box dimensions (0-1)
For example, image_001.txt holds the annotations for image_001.jpg.
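Given the fields above, a line can be decoded back to pixel coordinates like this (the sample annotation values are illustrative, not taken from the dataset):

```python
def yolo_to_pixels(line, img_w=960, img_h=540):
    """Convert one YOLO annotation line to a pixel-space (x1, y1, x2, y2) box."""
    class_id, xc, yc, w, h = line.split()
    xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
    x1 = (xc - w / 2) * img_w  # normalized center/size -> top-left corner
    y1 = (yc - h / 2) * img_h
    x2 = (xc + w / 2) * img_w
    y2 = (yc + h / 2) * img_h
    return int(class_id), (x1, y1, x2, y2)

# Illustrative line: a drone centered in the frame, 10% wide and 10% tall.
cls, box = yolo_to_pixels("0 0.5 0.5 0.1 0.1")
# box is approximately (432.0, 243.0, 528.0, 297.0) at 960x540
```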
Training
Configure Training Script
The train.py script contains optimized hyperparameters for A100 GPUs.
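The script itself is not reproduced here, so the following is a sketch assembled from the hyperparameters this guide documents; the dataset config path, the surrounding structure, and the exact argument set are assumptions, not the project's actual train.py:

```python
# Hyperparameters documented in this guide; the surrounding structure is an
# assumed sketch of train.py, not the project's actual script.
TRAIN_ARGS = dict(
    data="dataset.yaml",  # assumed dataset config path
    epochs=100,           # train for up to 100 epochs
    patience=20,          # early-stop if val mAP stalls for 20 epochs
    save_period=10,       # checkpoint every 10 epochs
    imgsz=960,            # match the 960x540 source resolution
    rect=True,            # rectangular batches, minimal padding
    batch=0.90,           # auto-pick the largest batch fitting ~90% of VRAM
    cache="ram",          # cache images in RAM; use "disk" below 16GB RAM
    compile=True,         # torch.compile() graph optimization (PyTorch >= 2.0)
    degrees=15.0,         # random rotation +/-15 degrees
    flipud=0.5,           # vertical flip with 50% probability
    scale=0.9,            # random scaling
    translate=0.2,        # random translation +/-20%
)

def main():
    # Deferred import so the config above can be inspected without
    # ultralytics installed.
    from ultralytics import YOLO
    model = YOLO("yolo26s.pt")  # pretrained weights named by this guide
    model.train(**TRAIN_ARGS)
```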
Start Training
Run the training script with python train.py. Training will:
- Load YOLO26s pretrained weights
- Fine-tune on drone dataset for up to 100 epochs
- Save checkpoints every 10 epochs
- Stop early if validation mAP doesn’t improve for 20 epochs
- Run final validation on val and test sets
Hyperparameter Explanation
imgsz=960 - Why not 640?
The dataset images are 960×540. Using imgsz=960 with rect=True preserves the aspect ratio and avoids excessive padding. With square inputs (rect=False), ~44% of pixels would be wasted on black padding.
batch=0.90 - Automatic batch sizing
Setting batch to a fraction (0.90 = 90% of GPU memory) tells Ultralytics to automatically select the largest batch size that fits in GPU memory. On an A100 this is typically batch 16-32; on an RTX 3050 it is batch 2-4.
cache='ram' - Speed vs memory
Caching the dataset in RAM eliminates disk I/O bottlenecks during training. The 11,387 images (~650MB on disk) require ~4GB of RAM when cached. If you have less than 16GB of system RAM, use cache="disk" instead.
compile=True - PyTorch 2.x optimization
Enables torch.compile(), which optimizes the model graph for 10-30% faster training on PyTorch 2.x with an A100. Requires PyTorch ≥2.0.
rect=True - Rectangular training
Preserves image aspect ratio during training instead of forcing square inputs. Critical for 16:9 images to avoid wasting compute on padding.
Data augmentation parameters
- degrees=15.0: Random rotation ±15°
- flipud=0.5: Vertical flip with 50% probability
- scale=0.9: Random scale to 90% of original size
- translate=0.2: Random translation ±20%
Windows-Specific Configuration
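On Windows, DataLoader worker processes are spawned rather than forked, so two settings matter: workers=0 and a guarded entry point. A sketch, with TRAIN_ARGS standing in for the settings in train.py:

```python
import platform

# Stand-in for the settings in train.py; only the Windows overrides matter here.
TRAIN_ARGS = dict(imgsz=960, rect=True, batch=0.90, cache="disk")

if platform.system() == "Windows":
    # Windows spawns (rather than forks) DataLoader workers; workers=0 keeps
    # data loading in the main process and avoids multiprocessing crashes.
    TRAIN_ARGS["workers"] = 0

def main():
    from ultralytics import YOLO  # deferred import, standard Ultralytics API
    YOLO("yolo26s.pt").train(**TRAIN_ARGS)

# On Windows, call main() only from under an `if __name__ == "__main__":`
# guard: spawned child processes re-import this module, and an unguarded
# training call would recurse.
```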
Validation & Testing
After training completes, the script automatically runs validation and reports:
- mAP50: Mean average precision at IoU=0.5 (standard COCO metric)
- mAP50-95: Mean average precision averaged over IoU thresholds 0.5-0.95 (stricter)
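The IoU threshold behind both metrics is a simple overlap ratio; a minimal sketch for axis-aligned boxes:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# A detection counts as correct for mAP50 if IoU >= 0.5 with a ground-truth
# box; mAP50-95 averages precision over thresholds 0.5, 0.55, ..., 0.95.
```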
Troubleshooting
CUDA out of memory
Reduce batch size or enable FP16:
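For example, on a 4GB card the fractional batch setting can still overshoot; pinning a small fixed batch and keeping mixed precision on is the usual fix. The argument names follow the Ultralytics API, and the specific values are assumptions:

```python
# Assumed low-memory overrides for train.py, not the project's own values.
LOW_MEMORY_OVERRIDES = dict(
    batch=4,       # fixed small batch instead of batch=0.90 auto-sizing
    amp=True,      # mixed-precision (FP16) training halves activation memory
    cache="disk",  # keep images off the RAM that CUDA pins for transfers
)
```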
Windows multiprocessing error
Set workers=0 in train.py and ensure the training code is inside an if __name__ == "__main__": block.
Low mAP scores
- Verify annotations are correct (YOLO format, normalized coordinates)
- Check class IDs (should be 0 for single-class)
- Ensure train/val/test splits are representative
- Try training for more epochs, or increase patience so early stopping triggers later
Training is very slow
- Enable cache="ram" (if you have enough RAM)
- Use compile=True (requires PyTorch 2.x)
- Increase workers (Linux/macOS only)
- Use a GPU instead of CPU
RAM usage is too high
Change cache="ram" to cache="disk" in train.py. This reduces RAM usage but slows down training due to disk I/O.
Next Steps
Once training completes, you’re ready to run detection:
Run Detection
Use your trained model to detect drones in test images
Configuration Reference
Explore all training and detection parameters
Output Format
Learn about the CSV output structure and coordinate system
Known Issues
Common problems and their solutions
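The Run Detection step above can be sketched as below; the checkpoint path, the predict call layout, and the CSV columns are all assumptions (the guide's actual schema is documented under Output Format):

```python
import csv

def detections_to_csv(results, out_path):
    """Write (image, x1, y1, x2, y2, confidence) rows -- an assumed CSV
    layout, not necessarily BeamFinder's actual output format."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["image", "x1", "y1", "x2", "y2", "conf"])
        for name, boxes in results:
            for x1, y1, x2, y2, conf in boxes:
                writer.writerow([name, x1, y1, x2, y2, conf])

def main():
    from ultralytics import YOLO  # deferred import, standard Ultralytics API
    model = YOLO("runs/detect/train/weights/best.pt")  # assumed checkpoint path
    preds = model.predict("images/test", imgsz=960)
    # boxes.data columns are x1, y1, x2, y2, conf, cls; keep the first five.
    rows = [(p.path, p.boxes.data[:, :5].tolist()) for p in preds]
    detections_to_csv(rows, "detections.csv")
```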
Quick Reference
Full training command: python train.py
Expected timing on an A100:
- Epoch time: ~2-3 minutes
- 100 epochs: ~3-5 hours
- Early stopping typically triggers around epoch 40-60