Overview
The datasets module provides PyTorch Dataset implementations for loading and preprocessing the MVTec Anomaly Detection dataset. It includes the MVTecDataset class for efficient data loading with built-in image transformations and train/test split handling.
Classes
MVTecDataset
PyTorch Dataset implementation for the MVTec Anomaly Detection dataset.
Constructor Parameters
Path to the MVTec data folder containing the dataset classes.
Name of the MVTec class to load (e.g., “bottle”, “cable”, “capsule”). If None, the dataset iterates over all 15 available classes.
Square size that loaded images are initially resized to before cropping.
Square size that resized images are center-cropped to. This is the final output dimension.
Indicates whether to use the training, validation, or test split. Options:
- DatasetSplit.TRAIN: Training images (only “good” samples)
- DatasetSplit.VAL: Validation split
- DatasetSplit.TEST: Test images with anomalies and ground truth masks
Ratio for splitting the training data into train and validation sets. A value of 1.0 uses all data for training and leaves no validation samples.
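As a rough illustration of how the split option and the split ratio could interact, the sketch below slices a list of training image paths. The enum's string values, the helper `select_paths`, and the parameter name `train_val_split` are assumptions made for this example; the module's actual implementation may differ.

```python
from enum import Enum


class DatasetSplit(Enum):
    # Member names come from the options above; the string values are assumed.
    TRAIN = "train"
    VAL = "val"
    TEST = "test"


def select_paths(paths, split, train_val_split=1.0):
    """Hypothetical helper: return the slice of paths that belongs to a split."""
    if split is DatasetSplit.TEST:
        # Test data is not affected by the train/validation ratio.
        return list(paths)
    cut = int(len(paths) * train_val_split)
    if split is DatasetSplit.TRAIN:
        return list(paths[:cut])
    return list(paths[cut:])


paths = [f"{i:03d}.png" for i in range(10)]
train = select_paths(paths, DatasetSplit.TRAIN, 0.8)
val = select_paths(paths, DatasetSplit.VAL, 0.8)
```

With a ratio of 0.8 the first eight paths go to training and the last two to validation; with the ratio left at 1.0 the training slice contains every path and the validation slice is empty.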
Methods
__getitem__(idx)
Retrieves a single sample from the dataset.
Index of the sample to retrieve.
Dictionary containing:
- image (torch.Tensor): Preprocessed image tensor of shape (3, imagesize, imagesize)
- mask (torch.Tensor): Ground truth mask for the test split, zero tensor for training
- classname (str): MVTec class name
- anomaly (str): Anomaly type (e.g., “good”, “broken_large”, “scratch”)
- is_anomaly (int): Binary label (0 for “good”, 1 for anomalous)
- image_name (str): Relative path of the image
- image_path (str): Full path to the image file
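Because each sample is a dictionary, it may help to see how such samples batch under a standard DataLoader. The sketch below uses a hypothetical stand-in class, `FakeSamples`, that returns the keys listed above with dummy values; the mask shape (1, imagesize, imagesize) and the file paths are assumptions, not taken from the module.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class FakeSamples(Dataset):
    """Hypothetical stand-in that mimics the sample dictionary layout."""

    def __init__(self, imagesize=8):
        self.imagesize = imagesize
        self.items = [("bottle", "good"), ("bottle", "broken_large")]

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        classname, anomaly = self.items[idx]
        name = f"{classname}/test/{anomaly}/000.png"
        return {
            "image": torch.zeros(3, self.imagesize, self.imagesize),
            # Single-channel mask shape is an assumption for this sketch.
            "mask": torch.zeros(1, self.imagesize, self.imagesize),
            "classname": classname,
            "anomaly": anomaly,
            "is_anomaly": int(anomaly != "good"),
            "image_name": name,
            "image_path": "/data/mvtec/" + name,  # made-up location
        }


loader = DataLoader(FakeSamples(), batch_size=2, shuffle=False)
batch = next(iter(loader))
image_shape = tuple(batch["image"].shape)
labels = batch["is_anomaly"].tolist()
```

The default collate function stacks tensor fields along a new batch dimension, turns the integer labels into a tensor, and keeps string fields such as classname as plain Python lists.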
__len__()
Returns the total number of samples in the dataset.
Number of images in the dataset.
get_image_data()
Internal method that scans the dataset directory and organizes image and mask paths.
Tuple of (imgpaths_per_class, data_to_iterate):
- imgpaths_per_class (dict): Nested dictionary mapping classname → anomaly type → list of image paths
- data_to_iterate (list): Flattened list of tuples (classname, anomaly, image_path, mask_path)
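The two return values can be pictured with a small directory scan. The sketch below is not the module's code: `scan_split` is a hypothetical function, it assumes the standard MVTec layout (classname/split/anomaly for images, classname/ground_truth/anomaly for masks), and it pairs images with masks by sorted order, which is also an assumption.

```python
import os
import tempfile


def scan_split(source, classnames, split="test"):
    """Hypothetical stand-in for get_image_data(); illustration only."""
    imgpaths_per_class = {}
    data_to_iterate = []
    for classname in classnames:
        split_dir = os.path.join(source, classname, split)
        mask_root = os.path.join(source, classname, "ground_truth")
        imgpaths_per_class[classname] = {}
        for anomaly in sorted(os.listdir(split_dir)):
            anomaly_dir = os.path.join(split_dir, anomaly)
            images = sorted(
                os.path.join(anomaly_dir, f) for f in os.listdir(anomaly_dir)
            )
            imgpaths_per_class[classname][anomaly] = images
            if split == "test" and anomaly != "good":
                mask_dir = os.path.join(mask_root, anomaly)
                masks = sorted(
                    os.path.join(mask_dir, f) for f in os.listdir(mask_dir)
                )
            else:
                # Defect-free samples have no ground truth mask on disk.
                masks = [None] * len(images)
            for image_path, mask_path in zip(images, masks):
                data_to_iterate.append((classname, anomaly, image_path, mask_path))
    return imgpaths_per_class, data_to_iterate


def touch(path):
    # Create an empty placeholder file, including parent directories.
    os.makedirs(os.path.dirname(path), exist_ok=True)
    open(path, "w").close()


with tempfile.TemporaryDirectory() as root:
    touch(os.path.join(root, "bottle", "test", "good", "000.png"))
    touch(os.path.join(root, "bottle", "test", "broken_large", "000.png"))
    touch(os.path.join(root, "bottle", "ground_truth", "broken_large", "000_mask.png"))
    per_class, flat = scan_split(root, ["bottle"])
    anomalies = sorted(per_class["bottle"])
    mask_names = [os.path.basename(m) if m else None for _, _, _, m in flat]
```

The nested dictionary is convenient for per-class bookkeeping, while the flattened list gives a simple integer index for item lookup and length.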
Enums
DatasetSplit
Enum for specifying dataset split mode.
Constants
Available Classes
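The MVTec dataset includes 15 object categories: bottle, cable, capsule, carpet, grid, hazelnut, leather, metal_nut, pill, screw, tile, toothbrush, transistor, wood, zipper.
ImageNet Normalization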
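Images are normalized using the standard ImageNet statistics: mean (0.485, 0.456, 0.406) and standard deviation (0.229, 0.224, 0.225) for the R, G, and B channels.
Image Transformations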
The dataset applies the following preprocessing pipeline:
For Images:
- Resize to the specified resize dimension
- Center crop to imagesize
- Convert to tensor
- Normalize with ImageNet mean and std
For Masks:
- Resize to the specified resize dimension
- Center crop to imagesize
- Convert to tensor (no normalization)
