BeamFinder uses the DeepSense Scenario 23 dataset, which contains 11,387 annotated images of drones captured in various conditions. This guide explains the dataset structure, format, and preparation steps.

Dataset Overview

The DeepSense Scenario 23 dataset is specifically designed for drone detection in THz communication scenarios.

Dataset Statistics

| Metric | Value |
| --- | --- |
| Total images | 11,387 |
| Training set | 7,970 (70%) |
| Validation set | 1,708 (15%) |
| Test set | 1,709 (15%) |
| Image resolution | 960×540 pixels (16:9 aspect ratio) |
| Dataset size | ~650 MB |
| Annotation format | YOLO (normalized bounding boxes) |
| Classes | 1 (drone) |
| Capture sessions | 51 different conditions |
The dataset includes images from 51 capture sessions with uneven distribution (some sessions have 1 image, others 1000+). The train/val/test split is shuffled to ensure a reasonable mix of conditions across all splits.
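The published split counts follow directly from a 70/15/15 split of 11,387 images, using truncating index arithmetic:

```python
# Verify that the published split counts follow from a 70/15/15 split
total = 11387
train_end = int(0.70 * total)  # end index of the training slice
val_end = int(0.85 * total)    # end index of the validation slice

train_count = train_end              # 7970
val_count = val_end - train_end      # 1708
test_count = total - val_end         # 1709
print(train_count, val_count, test_count)  # 7970 1708 1709
```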

Directory Structure

The dataset must be organized in the standard YOLO format:
BeamFinder/
├── data.yaml              # Dataset configuration
├── data/
│   ├── images/
│   │   ├── train/         # 7,970 training images
│   │   │   ├── image_BS1_0001_16_58_46.jpg
│   │   │   ├── image_BS1_0002_16_58_46.jpg
│   │   │   └── ...
│   │   ├── validation/    # 1,708 validation images
│   │   │   ├── image_BS1_0005_16_58_47.jpg
│   │   │   └── ...
│   │   └── test/          # 1,709 test images
│   │       ├── image_BS1_0003_16_58_46.jpg
│   │       └── ...
│   └── labels/
│       ├── train/         # 7,970 label files
│       │   ├── image_BS1_0001_16_58_46.txt
│       │   ├── image_BS1_0002_16_58_46.txt
│       │   └── ...
│       ├── validation/    # 1,708 label files
│       │   ├── image_BS1_0005_16_58_47.txt
│       │   └── ...
│       └── test/          # 1,709 label files
│           ├── image_BS1_0003_16_58_46.txt
│           └── ...
Each image must have a corresponding .txt file with the same name in the labels directory. The pairing is critical for YOLO training.

Dataset Configuration File

The data.yaml file defines dataset paths and classes:
path: data
train: images/train
val: images/validation
test: images/test

nc: 1
names: ["drone"]

Configuration Fields

| Field | Description |
| --- | --- |
| path | Root directory for the dataset (relative to the project root) |
| train | Training images subdirectory (relative to path) |
| val | Validation images subdirectory |
| test | Test images subdirectory |
| nc | Number of classes (1 for single-class drone detection) |
| names | List of class names (index 0 = "drone") |
The path field is relative to the project root where train.py is located. Ultralytics automatically resolves the full paths during training.
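A quick sanity check that data.yaml declares every key Ultralytics needs can be done without PyYAML. This is a minimal sketch; the helper name `check_data_yaml` is ours, not part of BeamFinder:

```python
# Hypothetical helper: verify that data.yaml declares the expected top-level keys,
# using simple line parsing so no YAML library is required.
REQUIRED_KEYS = {"path", "train", "val", "test", "nc", "names"}

def check_data_yaml(text: str) -> set:
    """Return the set of required top-level keys missing from the YAML text."""
    present = {
        line.split(":", 1)[0].strip()
        for line in text.splitlines()
        if ":" in line and not line.lstrip().startswith("#")
    }
    return REQUIRED_KEYS - present

example = """path: data
train: images/train
val: images/validation
test: images/test
nc: 1
names: ["drone"]
"""
print(check_data_yaml(example))  # set() -- nothing missing
```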

Annotation Format

Each .txt label file contains bounding box annotations in YOLO format:
class_id x_center y_center width height
All coordinates are normalized (0-1 range relative to image dimensions).

Example Annotation

0 0.541511 0.609623 0.046874 0.075995
This represents:
  • Class: 0 (drone)
  • Center: (0.541511, 0.609623) → 54.15% from left, 60.96% from top
  • Size: 0.046874 × 0.075995 → 4.69% of image width, 7.60% of image height

Converting to Pixel Coordinates

For 960×540 images:
1. Denormalize coordinates:

image_width = 960
image_height = 540

x_center_px = 0.541511 * image_width   # 519.85
y_center_px = 0.609623 * image_height  # 329.19
width_px = 0.046874 * image_width      # 45.0
height_px = 0.075995 * image_height    # 41.04

2. Convert to corner coordinates (if needed):

x1 = x_center_px - width_px / 2   # 497.35
y1 = y_center_px - height_px / 2  # 308.67
x2 = x_center_px + width_px / 2   # 542.35
y2 = y_center_px + height_px / 2  # 349.71
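The two steps above can be combined into one small helper. This is a sketch; the function name `yolo_to_xyxy` is ours, not part of BeamFinder:

```python
def yolo_to_xyxy(x_c, y_c, w, h, img_w, img_h):
    """Convert a normalized YOLO box to pixel corner coordinates (x1, y1, x2, y2)."""
    x_center_px = x_c * img_w
    y_center_px = y_c * img_h
    width_px = w * img_w
    height_px = h * img_h
    return (
        x_center_px - width_px / 2,
        y_center_px - height_px / 2,
        x_center_px + width_px / 2,
        y_center_px + height_px / 2,
    )

# The example annotation from above, on a 960×540 image:
box = yolo_to_xyxy(0.541511, 0.609623, 0.046874, 0.075995, 960, 540)
print(box)  # approximately (497.35, 308.68, 542.35, 349.72)
```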

Multi-Object Images

If an image contains multiple drones, each gets its own line:
image_with_2_drones.txt
0 0.541511 0.609623 0.046874 0.075995
0 0.723456 0.412345 0.051234 0.082341
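Reading such a file back is a one-liner per row. A minimal sketch (the helper name `parse_label_lines` is ours, not part of BeamFinder):

```python
def parse_label_lines(text):
    """Parse YOLO label text into a list of (class_id, x_c, y_c, w, h) tuples."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        boxes.append((int(parts[0]), *map(float, parts[1:])))
    return boxes

# The two-drone example from above
labels = """0 0.541511 0.609623 0.046874 0.075995
0 0.723456 0.412345 0.051234 0.082341"""
print(len(parse_label_lines(labels)))  # 2
```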

Obtaining the Dataset

The DeepSense Scenario 23 dataset is publicly available:
1. Download from DeepSense

Visit the DeepSense 6G website and download the Scenario 23 dataset. The download includes both images and YOLO-format annotation files.
2. Extract the dataset

Extract the downloaded archive. You should find:
  • A directory of images (.jpg files)
  • A directory of labels (.txt files)
3. Organize into train/val/test splits

The raw dataset needs to be split into training, validation, and test sets. Use a 70/15/15 split with shuffling to ensure diverse conditions in each set.
from pathlib import Path
import shutil
import random

# Set seed for reproducibility
random.seed(42)

# Get all image files
images = list(Path("raw_images").glob("*.jpg"))
random.shuffle(images)

# Calculate split indices
total = len(images)
train_split = int(0.70 * total)
val_split = int(0.85 * total)

train_images = images[:train_split]
val_images = images[train_split:val_split]
test_images = images[val_split:]

# Create directories
for split in ["train", "validation", "test"]:
    Path(f"data/images/{split}").mkdir(parents=True, exist_ok=True)
    Path(f"data/labels/{split}").mkdir(parents=True, exist_ok=True)

# Copy files
def copy_split(image_list, split_name):
    for img_path in image_list:
        # Copy image
        shutil.copy(img_path, f"data/images/{split_name}/{img_path.name}")
        # Copy corresponding label; warn if missing so unpaired images are caught early
        label_path = Path("raw_labels") / f"{img_path.stem}.txt"
        if label_path.exists():
            shutil.copy(label_path, f"data/labels/{split_name}/{img_path.stem}.txt")
        else:
            print(f"WARNING: no label file for {img_path.name}")

copy_split(train_images, "train")
copy_split(val_images, "validation")
copy_split(test_images, "test")

print(f"Split complete: {len(train_images)} train, {len(val_images)} val, {len(test_images)} test")
4. Verify the structure

Ensure your directory matches the structure shown above, with matching image/label pairs in each split.

Verifying Dataset Integrity

1. Check file counts

Verify that each split has matching image and label counts:
# Count images
ls data/images/train/*.jpg | wc -l
ls data/images/validation/*.jpg | wc -l
ls data/images/test/*.jpg | wc -l

# Count labels
ls data/labels/train/*.txt | wc -l
ls data/labels/validation/*.txt | wc -l
ls data/labels/test/*.txt | wc -l
Expected: 7970 train, 1708 validation, 1709 test
2. Check image/label pairing

Ensure every image has a corresponding label file:
from pathlib import Path

for split in ["train", "validation", "test"]:
    images = set(p.stem for p in Path(f"data/images/{split}").glob("*.jpg"))
    labels = set(p.stem for p in Path(f"data/labels/{split}").glob("*.txt"))
    
    missing_labels = images - labels
    extra_labels = labels - images
    
    print(f"{split}: {len(images)} images, {len(labels)} labels")
    if missing_labels:
        print(f"  WARNING: {len(missing_labels)} images missing labels")
    if extra_labels:
        print(f"  WARNING: {len(extra_labels)} labels without images")
3. Validate annotation format

Check that annotations are properly formatted:
import random
from pathlib import Path

# Check a random sample
label_files = list(Path("data/labels/train").glob("*.txt"))
sample = random.sample(label_files, 10)

for label_file in sample:
    with open(label_file) as f:
        for line in f:
            parts = line.strip().split()
            assert len(parts) == 5, f"Invalid format in {label_file}: {line}"
            
            cls, x, y, w, h = map(float, parts)
            assert cls == 0, f"Invalid class {cls} (expected 0)"
            assert 0 <= x <= 1, f"x_center {x} out of range [0, 1]"
            assert 0 <= y <= 1, f"y_center {y} out of range [0, 1]"
            assert 0 <= w <= 1, f"width {w} out of range [0, 1]"
            assert 0 <= h <= 1, f"height {h} out of range [0, 1]"

print("Annotation format validation passed!")

Image Characteristics

Aspect Ratio

All images are 960×540 pixels (16:9 aspect ratio). This matches standard HD video format and is ideal for drone detection in wide outdoor scenes.
The training script uses rect=True to preserve the 16:9 aspect ratio during training. Without this, YOLO would pad images to square (960×960), wasting 44% of pixels on black padding.
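The "44%" figure is just the fraction of a 960×960 square not covered by a 960×540 image:

```python
# Fraction of pixels wasted when a 960×540 image is padded to a 960×960 square
img_w, img_h = 960, 540
square = img_w * img_w
wasted = (square - img_w * img_h) / square
print(f"{wasted:.1%}")  # 43.8%
```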

Capture Conditions

The dataset includes diverse conditions:
  • Different times of day (various lighting conditions)
  • Multiple drone positions and angles
  • Varying backgrounds (sky, buildings, trees)
  • Different drone sizes (near/far from camera)
This diversity helps the model generalize to real-world scenarios.

Memory Requirements

When training with cache="ram", the dataset is loaded into system memory:
| Resource | Required |
| --- | --- |
| Disk space | ~650 MB (images only) |
| System RAM (with cache="ram") | ~4 GB |
| GPU VRAM (training, batch=0.90) | ~3.5 GB (varies by model) |
If your system has less than 16GB RAM, change cache="ram" to cache="disk" in train.py:22 to avoid memory exhaustion.
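The rule of thumb above can be expressed as a tiny helper. This is a hypothetical sketch (the function `choose_cache` is ours, not part of BeamFinder or Ultralytics):

```python
# Hypothetical helper implementing the guidance above: use in-memory caching
# only when the machine has at least 16 GB of RAM, otherwise cache to disk.
def choose_cache(total_ram_gb: float, threshold_gb: float = 16.0) -> str:
    """Return the cache mode to pass to training: "ram" or "disk"."""
    return "ram" if total_ram_gb >= threshold_gb else "disk"

print(choose_cache(32))  # ram
print(choose_cache(8))   # disk
```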

Known Issues

Issue: Initially thought the dataset didn’t include bbox labels.
Resolution: The 11,387 YOLO-format .txt files are included in the original DeepSense download; they just need to be paired with images and organized into the YOLO directory structure. See the Known Issues page for more details.

Issue: The 51 capture sessions have very uneven counts (some 1 image, others 1000+).
Mitigation: Shuffling before splitting ensures the train/val/test sets have a reasonable mix of conditions. This hasn’t caused problems in practice. See the Known Issues page for more details.

Next Steps

Once your dataset is set up:
  1. Verify the directory structure matches the expected format
  2. Confirm data.yaml points to the correct paths
  3. Start training using the Training Guide
