Active Issues
1. COCO Pretrained Model Has No Drone Class
Status: Expected behavior, resolved by fine-tuning
Problem
YOLO26s comes pretrained on the COCO dataset, which includes 80 object classes (person, car, bird, bicycle, etc.) but does not include a "drone" class. Out-of-the-box inference with the pretrained model will not detect drones.
Why This Happens
The COCO dataset was created for general object detection and doesn't include specialized objects like drones. The pretrained weights have learned features for common objects but need fine-tuning for domain-specific detection.
Solution
This is why BeamFinder fine-tunes YOLO26s on the DeepSense Scenario 23 drone dataset. The training script (train.py) starts with pretrained COCO weights (yolo26s.pt) and adapts them to detect drones. After fine-tuning, the model's output layer is reconfigured for single-class detection (`nc: 1` in data.yaml:6).
Impact
- Before training: Model will not detect drones
- After training: Model achieves high mAP on drone detection
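For reference, here is a sketch of what the dataset config might look like. Only `nc: 1` at data.yaml:6 is confirmed by this page; the remaining keys assume the standard Ultralytics layout and are illustrative, not copied from the project:

```yaml
# data.yaml (illustrative sketch; only nc: 1 is confirmed by this page)
path: .              # dataset root (assumed)
train: images/train  # assumed standard Ultralytics layout
val: images/val
test: images/test
nc: 1                # single class (data.yaml:6)
names: ["drone"]     # assumed class name
```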
Run `python train.py` before using detect.py.
4. Windows Multiprocessing Doesn't Work
Status: Worked around with `workers=0`
Problem
On Windows, setting `workers > 0` in Ultralytics training causes a RuntimeError from Python's multiprocessing module.
Why This Happens
Windows doesn't support the `fork()` system call that Unix-based systems use for multiprocessing. Python's multiprocessing on Windows uses `spawn()` instead, which requires the main module to be importable.
Workaround
Both train.py and detect.py include the required fixes:
- Use `if __name__ == "__main__":` (already present in both scripts)
- Set `workers=0` to disable multiprocessing
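The guard matters because of how spawn-based multiprocessing re-imports the main module. A minimal standalone illustration of the pattern (not BeamFinder's actual code):

```python
import multiprocessing as mp

def double(x):
    return x * 2

# On Windows, workers are started with spawn(), which re-imports this
# module in each child process. Any code that creates workers must
# therefore sit behind the __main__ guard; without it, the re-import
# starts workers recursively and Python raises a RuntimeError.
if __name__ == "__main__":
    with mp.Pool(2) as pool:
        print(pool.map(double, [1, 2, 3]))  # [2, 4, 6]
```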
Impact
Training is slower on Windows due to single-threaded data loading. However, `cache="ram"` eliminates disk I/O as a bottleneck, which partially compensates for this.
Performance difference:
- Linux with `workers=8`: ~100% GPU utilization
- Windows with `workers=0`, `cache="ram"`: ~90-95% GPU utilization
On Linux/Mac, you can use `workers=8` for faster data loading (already set in train.py:23 for A100 training).
6. Aspect Ratio Mismatch
Status: Mitigated with `rect=True`
Problem
BeamFinder images are 960×540 pixels (16:9 aspect ratio), but YOLO defaults to square inputs (e.g., 640×640). Without rectangular training, about 44% of pixels would be black padding.
Why This Happens
Traditional YOLO implementations use square inputs for simplicity and efficiency. Images are letterboxed (padded with black bars) to fit the square, which wastes GPU compute on processing padding instead of actual image content.
Solution
BeamFinder enables rectangular training and inference with `rect=True` in both scripts. With `rect=True`, YOLO preserves the aspect ratio instead of padding to a square.
Impact
- Without `rect=True`: 44% of pixels are black padding, wasted compute
- With `rect=True`: full image utilization, ~20% faster training/inference
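The 44% figure is easy to verify with a quick back-of-the-envelope check (not project code):

```python
# A 960x540 (16:9) frame letterboxed into a square 640x640 input:
# the image is scaled to fit, and the remainder is black padding.
src_w, src_h = 960, 540
target = 640
scale = min(target / src_w, target / src_h)                # 2/3, limited by width
fit_w, fit_h = round(src_w * scale), round(src_h * scale)  # 640 x 360
pad_fraction = 1 - (fit_w * fit_h) / (target * target)
print(f"padding: {pad_fraction:.0%}")  # padding: 44%
```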
Rectangular training batches images with similar aspect ratios together. This is why you’ll see validation batches with different shapes during training.
Resolved Issues
2. Finding the Bounding Box Annotations
Status: Resolved
Dataset split: 70% train / 15% validation / 15% test.
Problem
Initially, it appeared that the DeepSense Scenario 23 dataset didn't include bounding box labels, only images.
Resolution
The 11,387 YOLO-format annotation files (.txt) were included in the original DeepSense download. They were paired with images and organized into the standard YOLO directory layout.
Impact
No impact on users. The dataset is properly structured for YOLO training.
Noted Limitations
3. Uneven Distribution Across Capture Sessions
Status: Noted, not an issue in practice
Description
The DeepSense dataset images come from 51 different capture sessions (subfolders) with very uneven counts:
- Some sessions: 1 image
- Other sessions: 1,000+ images
Mitigation
Images are shuffled before splitting into train/val/test sets, so each split has a reasonable mix of conditions.
Impact
This hasn't caused observable problems in practice; the model generalizes well across the test set despite the uneven session distribution. If you notice the model overfitting to specific backgrounds or lighting conditions, you can add more augmentation in train.py:32-35.
5. Memory Considerations
Status: Under control with proper configuration
Minimum system RAM: 8GB
Dataset Size
- On disk: ~650MB (11,387 images)
- In RAM: ~4GB when cached (`cache="ram"`)
- VRAM: depends on batch size and model variant
RAM Requirements
BeamFinder uses `cache="ram"` in train.py:22 to eliminate disk I/O as a bottleneck:
- 16GB recommended (4GB for dataset cache + 8GB for OS/apps + 4GB buffer)
- 8GB minimum (use `cache="disk"` instead, or disable caching with `cache=False`)
VRAM Requirements
BeamFinder was developed on two different GPUs:

| GPU | VRAM | Configuration | Batch Size |
|---|---|---|---|
| RTX 3050 | 4GB | `batch=0.85, amp=True, imgsz=960` | 2-4 |
| A100 | 40GB | `batch=0.90, amp=True, imgsz=960` | 32+ |
The `batch=0.85` parameter tells Ultralytics to automatically pick the largest batch size that fits in GPU memory.
What If You Don't Have Enough RAM?
Use disk caching instead: set `cache="disk"` in train.py.
What If You Don't Have Enough VRAM?
- Reduce the batch size
- Use a smaller model variant
- Reduce the image size
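As a sketch, these options correspond to Ultralytics `train()` arguments roughly as follows; the concrete values below are illustrative assumptions, not BeamFinder's settings:

```python
model.train(
    data="data.yaml",
    batch=8,      # fixed small batch instead of the 0.85 memory fraction
    imgsz=640,    # lower resolution than the native 960
)
# Or start from a smaller checkpoint, assuming the usual n/s/m naming:
# model = YOLO("yolo26n.pt")
```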
Development Notes
These issues were documented during development and serve as a reference for understanding design decisions in the codebase.
CUDA Optimizations
The training script includes A100-specific optimizations (train.py:9-12):
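The page doesn't reproduce those lines, but optimizations of this kind typically look like the following sketch (an assumption about what train.py:9-12 contains, not a copy of it):

```python
import torch

# TF32 trades a little precision for large matmul/conv speedups on
# Ampere GPUs such as the A100.
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True
# Let cuDNN autotune convolution algorithms for the fixed input shape.
torch.backends.cudnn.benchmark = True
```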
Augmentation Strategy
BeamFinder uses conservative augmentation (train.py:32-35):
- Drones can appear at any orientation (rotation helps)
- Images are captured from ground looking up (vertical flip simulates different perspectives)
- Drones appear at varying distances (scale helps)
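In Ultralytics, that strategy maps onto `train()` augmentation arguments along these lines; the values below are illustrative assumptions, and the real ones live in train.py:32-35:

```python
model.train(
    data="data.yaml",
    degrees=15.0,  # small rotations: drones appear at any orientation
    flipud=0.5,    # vertical flips: varies the ground-looking-up perspective
    scale=0.5,     # scale jitter: drones appear at varying distances
)
```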
Torch Compile
The A100 training configuration uses `compile=True` (train.py:26), which requires:
- PyTorch 2.0+
- CUDA GPU
- Linux (doesn’t work on Windows)
See the Troubleshooting page for solutions to common problems.