Download Datasets
Only high-resolution (HR) images are required. Download the following datasets:

- DIV2K: high-quality diverse images (DIV2K_train_HR.zip)
- Flickr2K: Flickr image collection (Flickr2K.tar)
- OST: outdoor scene images (OST_dataset.zip)
Extract all datasets to your datasets/ directory. Combine DIV2K and Flickr2K into a DF2K folder.

Preparation Steps
The dataset preparation involves three steps. Steps 1 and 2 are optional but recommended for optimal training performance.

Step 1: Generate Multi-scale Images (Optional)
For the DF2K dataset, use a multi-scale strategy to create ground-truth images at different scales. This improves the model's ability to handle various image sizes. This step can be omitted for a quick trial, but is recommended for full training.
How multi-scale generation works
The generate_multiscale_DF2K.py script:

- Downsamples HR images to 75%, 50%, and 33% of the original size
- Creates a version with shortest edge = 400 pixels
- Uses LANCZOS resampling for high quality
- Saves images with suffixes like T0, T1, T2, T3
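As a rough sketch of this strategy (not the actual generate_multiscale_DF2K.py script; it assumes Pillow, and the mapping of T0-T2 to the three downscale ratios and T3 to the shortest-edge version is an assumption):

```python
from PIL import Image

SCALES = [0.75, 0.5, 1 / 3]  # downscale ratios (assumed to map to suffixes T0, T1, T2)
SHORTEST_EDGE = 400          # extra version with shortest edge = 400 px (assumed suffix T3)

def multiscale_versions(img, basename):
    """Yield (filename, resized image) pairs for one HR image."""
    w, h = img.size
    for i, scale in enumerate(SCALES):
        size = (round(w * scale), round(h * scale))
        yield f"{basename}T{i}.png", img.resize(size, Image.LANCZOS)
    # Rescale so the shortest edge becomes exactly SHORTEST_EDGE pixels.
    r = SHORTEST_EDGE / min(w, h)
    yield f"{basename}T3.png", img.resize((round(w * r), round(h * r)), Image.LANCZOS)
```

For an 800x600 input, this yields 600x450, 400x300, 267x200, and 533x400 versions.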
Step 2: Crop to Sub-images (Optional)
Crop images into sub-images for faster IO and processing during training. This is especially helpful when training on slower storage. Skip this step if you have fast IO or limited disk space.
Parameters
- --input: input folder containing images to crop
- --output: output folder for cropped sub-images
- --crop_size: size of each cropped sub-image (e.g., 400 means 400x400 pixels)
- --step: step size for the overlapped sliding window; smaller values create more sub-images with more overlap
- --thresh_size: minimum size threshold; patches smaller than this are dropped
- --n_thread: number of parallel threads for processing
Understanding crop_size and step
- crop_size: The dimension of each sub-image patch (e.g., 400 means 400x400)
- step: How much the sliding window moves. A step of 200 with crop_size 400 means 50% overlap
- More overlap = more training data, but requires more disk space

For example, a 1200x1200 image with crop_size 400 and step 200:
- Creates 5x5 = 25 overlapping sub-images
- Each sub-image is 400x400 pixels
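The sliding-window arithmetic behind crop_size and step can be sketched as follows (a simplified model; the actual script's edge and threshold handling may differ):

```python
def window_positions(length, crop_size, step):
    """Top-left offsets of an overlapped sliding window along one axis."""
    positions = list(range(0, length - crop_size + 1, step))
    # If the last window stops short of the edge, add one flush with the edge
    # so border pixels are not lost.
    if positions and positions[-1] + crop_size < length:
        positions.append(length - crop_size)
    return positions

def crop_boxes(width, height, crop_size, step):
    """All (left, top, right, bottom) sub-image boxes for one image."""
    return [
        (x, y, x + crop_size, y + crop_size)
        for y in window_positions(height, crop_size, step)
        for x in window_positions(width, crop_size, step)
    ]
```

For a 1200x1200 image with crop_size 400 and step 200, crop_boxes returns the 25 overlapping 400x400 boxes.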
Step 3: Generate Meta Info File (Required)
Create a text file containing all image paths. This file tells the training script which images to use.

Parameters
- --input: list of input folders containing images. Multiple folders can be specified, separated by spaces.
- --root: corresponding root paths for each input folder; must have the same length as --input. Stored paths are relative to these roots.
- --meta_info: output path for the generated meta info text file
- An optional check reads each image back to verify it can be loaded without errors
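What meta-info generation does can be sketched with a small hypothetical helper (not the actual script); it also shows how several input folders merge into one file:

```python
from pathlib import Path

def write_meta_info(inputs, roots, meta_path):
    """Write one relative image path per line, merging several input folders."""
    assert len(inputs) == len(roots), "each input folder needs a matching root"
    lines = []
    for folder, root in zip(inputs, roots):
        for img in sorted(Path(folder).glob("*.png")):
            # Paths are stored relative to the corresponding root.
            lines.append(str(img.relative_to(root)))
    Path(meta_path).parent.mkdir(parents=True, exist_ok=True)
    Path(meta_path).write_text("\n".join(lines) + "\n")
    return lines
```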
Meta Info Format
The generated file contains relative paths, one per line.

Combining Multiple Datasets

You can merge multiple folders into a single meta info file.

Validation Dataset (Optional)
If you want to validate during training, prepare a separate validation dataset with paired low-quality and high-quality images.

Setting up validation
- Prepare validation image pairs:
  - Ground-truth (GT) folder: high-resolution images
  - Low-quality (LQ) folder: corresponding low-resolution images
- Uncomment the validation sections in your training config
- Configure the validation frequency and metrics
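One simple way to produce the LQ side of the pairs is plain bicubic downsampling of the GT images (a sketch assuming Pillow; realistic validation sets may use more complex degradations):

```python
from PIL import Image
from pathlib import Path

def make_lq_pairs(gt_folder, lq_folder, scale=4):
    """Create LQ images by bicubic-downsampling each GT image by `scale`."""
    lq_folder = Path(lq_folder)
    lq_folder.mkdir(parents=True, exist_ok=True)
    for gt_path in sorted(Path(gt_folder).glob("*.png")):
        gt = Image.open(gt_path)
        lq = gt.resize((gt.width // scale, gt.height // scale), Image.BICUBIC)
        # Keeping the same filename links each LQ image to its GT counterpart.
        lq.save(lq_folder / gt_path.name)
```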
Directory Structure
After preparation, your datasets/ directory should contain the combined DF2K images, the optional multi-scale and sub-image folders, and the generated meta info file.

Next Steps
Train Real-ESRNet: begin stage 1 training with your prepared dataset.