
Dataset Structure

The training pipeline expects a specific directory structure as defined in `data.yaml`. The dataset is organized into three splits:

```yaml
train: /path/to/training/data/train/
val: /path/to/training/data/valid/
test: /path/to/training/data/test/

nc: 3
names: ['cardboard paper', 'metal', 'plastic']
```

Directory Layout

```
training/data/
├── train/
│   ├── images/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── ...
│   └── labels/
│       ├── image1.txt
│       ├── image2.txt
│       └── ...
├── valid/
│   ├── images/
│   └── labels/
└── test/
    ├── images/
    └── labels/
```
Each split (train/val/test) must contain both images/ and labels/ subdirectories. Image filenames must match their corresponding label files.
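As a quick sanity check, a small script can flag images that lack a matching label file. This is an illustrative sketch under the layout described above; the function name `unmatched_images` is ours, not part of the pipeline:

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}

def unmatched_images(split_dir):
    """Return image filename stems that have no corresponding label file.

    Assumes the split layout described above:
    <split>/images/*.jpg and <split>/labels/*.txt.
    """
    split = Path(split_dir)
    image_stems = {p.stem for p in (split / "images").iterdir()
                   if p.suffix.lower() in IMAGE_SUFFIXES}
    label_stems = {p.stem for p in (split / "labels").glob("*.txt")}
    return sorted(image_stems - label_stems)
```

Running this once per split before training catches missing annotations early, before the trainer silently skips them.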

Class Configuration

The model is trained to detect 3 classes of recyclable materials:
| Class ID | Class Name | Description |
|----------|------------|-------------|
| 0 | cardboard paper | Cardboard boxes, paper products |
| 1 | metal | Aluminum cans, tin containers |
| 2 | plastic | Plastic bottles, containers |
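When post-processing predictions, the same mapping can be mirrored in code. A minimal sketch (the names come from `data.yaml`; the `class_name` helper is hypothetical):

```python
# Class list in the same order as `names` in data.yaml (nc: 3).
CLASS_NAMES = ["cardboard paper", "metal", "plastic"]

def class_name(class_id):
    """Translate a YOLO class id (0, 1, or 2) into its readable name."""
    return CLASS_NAMES[class_id]
```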

Image Requirements

Supported Formats

  • Image Types: JPG, JPEG, PNG, BMP
  • Recommended Format: JPG (for storage efficiency)
  • Color Space: RGB

Image Specifications

1. Resolution: Images are resized to 640x640 during training. Higher-resolution source images (1920x1080 or above) are recommended for better quality.
2. Aspect Ratio: Any aspect ratio is supported. The training pipeline automatically handles letterboxing to maintain proportions.
3. Quality: Use clear, well-lit images. Avoid heavily compressed or blurry images that may impact model performance.
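For intuition, the letterbox step can be sketched as follows. This is an illustrative computation of the resize scale and per-side padding for a square target, not the pipeline's actual implementation:

```python
def letterbox_params(src_w, src_h, dst=640):
    """Compute the scale and per-side padding for letterboxing an image
    into a dst x dst square while preserving its aspect ratio.

    Returns (scale, pad_x, pad_y): the image is resized by `scale`, then
    padded by pad_x on left/right and pad_y on top/bottom.
    """
    scale = min(dst / src_w, dst / src_h)   # shrink to fit the longer side
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    pad_x = (dst - new_w) / 2
    pad_y = (dst - new_h) / 2
    return scale, pad_x, pad_y
```

For a 1920x1080 source, the image scales to 640x360 and gets 140 px of padding above and below.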

Best Practices

  • Diversity: Include varied backgrounds, lighting conditions, and angles
  • Occlusion: Include partially occluded objects to improve robustness
  • Scale Variation: Capture objects at different distances/sizes
  • Multiple Objects: Include scenes with multiple trash items
Avoid using images with extreme blur, heavy noise, or very low resolution (below 416x416) as they may negatively impact training.

Annotation Format

YOLO format uses normalized coordinates for segmentation masks. Each annotation file is a text file with the same name as its corresponding image.

YOLO Segmentation Format

```
class_id x1 y1 x2 y2 x3 y3 ... xn yn
```
  • class_id: Integer representing the class (0, 1, or 2)
  • x, y: Normalized polygon coordinates (0.0 to 1.0)
  • Each line represents one object instance

Example Annotation

training/data/train/labels/image1.txt
```
2 0.123 0.456 0.234 0.567 0.345 0.678 0.456 0.789 0.567 0.890
0 0.678 0.123 0.789 0.234 0.890 0.345 0.901 0.456
```
This example shows:
  • One plastic object (class 2) with a 5-point polygon
  • One cardboard paper object (class 0) with a 4-point polygon
Coordinates are normalized by dividing pixel values by image width (for x) and height (for y). All values must be between 0.0 and 1.0.
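The format is straightforward to parse and produce by hand. A minimal sketch (the helper names are ours) that reads one annotation line into point pairs and normalizes pixel coordinates:

```python
def parse_annotation_line(line):
    """Parse one YOLO segmentation line into (class_id, [(x, y), ...]).

    Coordinates are already normalized to [0, 1]; values after the
    leading class id are consumed as x, y pairs.
    """
    parts = line.split()
    class_id = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    if len(coords) % 2 != 0:
        raise ValueError("odd number of coordinate values")
    points = list(zip(coords[0::2], coords[1::2]))
    return class_id, points

def normalize_polygon(pixel_points, img_w, img_h):
    """Normalize pixel-space (x, y) points by image width and height."""
    return [(x / img_w, y / img_h) for x, y in pixel_points]
```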

Data Augmentation

YOLOv11 includes built-in augmentation strategies that are automatically applied during training:

Spatial Augmentations

  • Mosaic: Combines 4 images into one for multi-scale learning
  • Random Scaling: Varies object sizes (0.5x to 1.5x)
  • Random Rotation: Up to ±10 degrees
  • Random Flipping: Horizontal and vertical flips
  • Random Translation: Shifts images within bounds

Pixel-level Augmentations

  • HSV Augmentation: Varies hue, saturation, and value
  • Brightness/Contrast: Random adjustments
  • Noise Addition: Adds subtle Gaussian noise

Advanced Techniques

  • MixUp: Blends two images and their labels
  • CutOut: Randomly masks rectangular regions
  • Perspective Transforms: Simulates different camera angles
Augmentation is only applied to the training set. Validation and test sets use original images for accurate performance evaluation.
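To see why spatial augmentation must transform labels in lockstep with pixels, consider a horizontal flip: every normalized x-coordinate maps to 1 - x while y is unchanged. A toy sketch (the real pipeline applies this internally):

```python
def hflip_polygon(points):
    """Mirror a normalized polygon horizontally (x -> 1 - x).

    Illustrates label-aware augmentation: if the image is flipped but
    the polygon is not, the annotation no longer covers the object.
    """
    return [(1.0 - x, y) for x, y in points]
```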

Train/Val/Test Splits

| Split | Percentage | Purpose |
|-------|------------|---------|
| Train | 70-80% | Model learning and weight updates |
| Validation | 10-15% | Hyperparameter tuning and early stopping |
| Test | 10-15% | Final model evaluation |

Split Guidelines

1. Random Distribution: Randomly distribute images across splits to ensure balanced representation.
2. Class Balance: Maintain similar class distributions in each split to prevent bias.
3. No Overlap: Ensure no image appears in multiple splits to prevent data leakage.
4. Minimum Samples: Each split should have at least 100 images per class for reliable results.
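The guidelines above can be combined into a simple, reproducible split. This sketch assumes a plain random split keyed by filename stem (for strict class balance you would stratify by class instead); the helper name is ours:

```python
import random

def split_dataset(stems, train=0.8, val=0.1, seed=42):
    """Randomly assign image stems to train/valid/test splits.

    A fixed `seed` makes the split reproducible, and assigning each stem
    to exactly one bucket guarantees no overlap between splits.
    """
    rng = random.Random(seed)
    shuffled = list(stems)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * val)
    return {
        "train": shuffled[:n_train],
        "valid": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }
```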

Dataset Validation Checklist

Before starting training, verify:
  • All images have corresponding annotation files
  • Annotation files use correct class IDs (0, 1, 2)
  • Coordinates are normalized (0.0 to 1.0)
  • Directory structure matches data.yaml paths
  • No corrupt or unreadable images
  • Balanced class distribution across splits
  • Minimum 300-500 images per class for good results
YOLOv11 will report dataset statistics at the start of training, including class distribution and any issues with annotations.
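The per-file checklist items (valid class IDs, normalized coordinates) can be verified with a short script before handing the dataset to the trainer. An illustrative sketch, not part of the official tooling:

```python
def validate_label_file(path, num_classes=3):
    """Return a list of problems found in one YOLO label file.

    Checks two checklist items per line: the class id is in
    [0, num_classes) and every coordinate is normalized to [0, 1].
    """
    problems = []
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            class_id = int(parts[0])
            if not 0 <= class_id < num_classes:
                problems.append(f"line {lineno}: bad class id {class_id}")
            for value in map(float, parts[1:]):
                if not 0.0 <= value <= 1.0:
                    problems.append(
                        f"line {lineno}: coordinate {value} out of range")
                    break
    return problems
```

An empty return value means the file passed both checks.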

Next Steps

Once your dataset is prepared and validated, continue to Start Training: configure and run the training script with your prepared dataset.
