Categories
Animation
Cartoon, Naruto, Dragon Ball, One Piece, Bleach, Lego minifigure, Sonic the Hedgehog, The Walt Disney Company
Flat Content
Website, Chart, Map, Logo, Text, Typography, Screencast, Illustration, Poster
Gaming
Minecraft, Call of Duty, Grand Theft Auto V, World of Warcraft, League of Legends, Battlefield, FIFA 15, RuneScape
Natural Content
Animal, Pet, Fishing, Dog, Horse, Bird, Plant, Cat, Farm, Garden, Nature, Tree, Wildlife, Chicken
Category Mapping
Classes are encoded as integer labels for training.
Dataset Split
| Split | Proportion | Purpose |
|---|---|---|
| Train | 70% | Model training with augmentation |
| Validation | 20% | Hyperparameter tuning and early stopping |
| Test | 10% | Final evaluation and TTA |
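The proportions in the table can be illustrated with a short sketch. It assumes a plain shuffled split over sample indices; the function name split_indices, the seed, and the sample count are hypothetical, and the project may split differently (for example, stratified by category).

```python
import random


def split_indices(n_samples, train_frac=0.7, val_frac=0.2, seed=0):
    """Shuffle indices 0..n_samples-1 and cut them into train/val/test lists.

    Hypothetical helper: the test split receives whatever remains after the
    train and validation cuts (about 10% with the default fractions).
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility

    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Fixing the seed keeps the three subsets stable across runs, which matters because the test split is only meaningful if it never overlaps with the data used for training or tuning.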
Directory Structure
After preprocessing, the data directory follows a set layout. Its processed_data.pt file is a PyTorch-serialized dictionary containing the preprocessed video tensor stack, integer labels, filenames, and a per-subcategory category mapping.
Class Imbalance Handling
The dataset has unequal sample counts across categories. The training dataloader uses WeightedRandomSampler to draw videos from under-represented classes more often, so that each class is seen at a more balanced rate during training.
Class weights are computed per video in EnhancedPreExtractedFeaturesDataset._compute_class_weights() and clipped to the range [0.5, 10.0] to avoid extreme oversampling.
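A minimal sketch of clipped per-video weights follows. It is not the project's _compute_class_weights(): the inverse-frequency formula, the function name compute_sample_weights, and the toy label counts are assumptions for illustration. Only the clipping range [0.5, 10.0] comes from the text above.

```python
from collections import Counter


def compute_sample_weights(labels, min_w=0.5, max_w=10.0):
    """Return (per-sample weights, per-class weights) for integer labels.

    Assumed formula: weight_c = N / (K * count_c), where N is the number of
    samples and K the number of classes, then clipped to [min_w, max_w].
    """
    counts = Counter(labels)
    n_total = len(labels)
    n_classes = len(counts)

    class_weights = {}
    for cls, count in counts.items():
        raw = n_total / (n_classes * count)
        class_weights[cls] = min(max(raw, min_w), max_w)  # clip to range

    # Each video inherits the weight of its class.
    sample_weights = [class_weights[y] for y in labels]
    return sample_weights, class_weights


# Toy, heavily imbalanced label list (hypothetical counts).
labels = [0] * 90 + [1] * 8 + [2] * 2
sample_weights, class_weights = compute_sample_weights(labels)
print(class_weights)
```

In this toy case the majority class is raised to the lower bound and the rarest class is capped at the upper bound, which is the effect the clipping is meant to have. A per-sample list like sample_weights is the kind of input that torch.utils.data.WeightedRandomSampler takes as its weights argument.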
The FocalLoss used during training also accepts per-class alpha weights, providing a second layer of imbalance correction at the loss level. See Optimization for details.
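To make the role of the alpha weights concrete, here is a plain-Python sketch of the standard focal loss for a single sample, FL = -alpha_t * (1 - p_t)^gamma * log(p_t). It is not the project's FocalLoss class: the function name, the gamma value of 2.0, and the alpha values are assumptions chosen for illustration.

```python
import math


def focal_loss(probs, target, alpha, gamma=2.0):
    """Focal loss for one sample.

    probs  : predicted class probabilities (sum to 1)
    target : integer label of the true class
    alpha  : per-class weights (hypothetical values in this sketch)
    gamma  : focusing parameter; 0 reduces to alpha-weighted cross-entropy
    """
    p_t = probs[target]
    return -alpha[target] * (1.0 - p_t) ** gamma * math.log(p_t)


alpha = [0.5, 1.0, 2.0, 4.0]  # larger weight for rarer classes (assumed)

# Confident, correct prediction on a common class: loss is heavily damped.
easy = focal_loss([0.9, 0.05, 0.03, 0.02], target=0, alpha=alpha)

# Low-confidence prediction on a rare class: loss stays large.
hard = focal_loss([0.4, 0.3, 0.2, 0.1], target=3, alpha=alpha)

print(easy, hard)
```

The (1 - p_t)^gamma factor shrinks the contribution of well-classified samples, while alpha scales the loss per class, so the two terms address imbalance from different directions.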