Overview
RealESRGANer is the primary helper class for upsampling images with Real-ESRGAN. It handles model loading, preprocessing, tiled inference for large images, and post-processing.
Class Definition
RealESRGANer
Parameters
scale: Upsampling scale factor used in the networks. Typically 2 or 4.
model_path: Path to the pretrained model. Can be:
- Local file path
- URL (will be downloaded automatically)
- List of two paths for Deep Network Interpolation (DNI)
model: The defined neural network model (e.g., RRDBNet or SRVGGNetCompact).
tile: Tile size for processing large images in chunks to avoid running out of GPU memory. Set to 0 to disable tiling.
tile_pad: Padding size for each tile to remove border artifacts during tiled inference.
pre_pad: Padding size added to input images before processing to avoid border artifacts.
half: Whether to use half precision (FP16) during inference for faster processing and lower memory usage.
device: The torch device to use. If None, automatically selects CUDA if available, otherwise CPU.
gpu_id: Specific GPU device ID to use (e.g., 0, 1, 2 for multi-GPU systems).
dni_weight: Weights for Deep Network Interpolation when using two models. Must sum to 1.0 (e.g., [0.5, 0.5]).
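The weighting used in Deep Network Interpolation can be illustrated with a small sketch. It assumes the two checkpoints share the same parameter names and blends them key by key. Plain floats stand in for tensors, and `interpolate_params` is a hypothetical helper written for this illustration, not a function of the library.

```python
import math


def interpolate_params(params_a, params_b, dni_weight):
    """Blend two parameter dicts key by key (hypothetical helper; floats stand in for tensors)."""
    if len(dni_weight) != 2:
        raise ValueError("dni_weight must hold exactly two weights")
    if not math.isclose(sum(dni_weight), 1.0):
        raise ValueError("dni_weight must sum to 1.0")
    if params_a.keys() != params_b.keys():
        raise ValueError("both models must have the same parameter names")
    w_a, w_b = dni_weight
    # Weighted sum of matching parameters from model A and model B.
    return {name: w_a * params_a[name] + w_b * params_b[name] for name in params_a}


blended = interpolate_params(
    {"conv.weight": 1.0, "conv.bias": 0.0},
    {"conv.weight": 3.0, "conv.bias": 2.0},
    dni_weight=[0.5, 0.5],
)
print(blended)  # {'conv.weight': 2.0, 'conv.bias': 1.0}
```

With weights of [0.5, 0.5] every parameter lands halfway between the two models; shifting the weights moves the result toward one model or the other.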
Methods
enhance()
Main method to upscale an input image.
Parameters
img: Input image as a numpy array (BGR channel order, as loaded by OpenCV). Supports:
- 8-bit images (0-255)
- 16-bit images (0-65535)
- Grayscale (2D array)
- Color (H×W×3)
- Color with alpha channel (H×W×4)
outscale: Final output scale. If different from the model’s scale, the output will be resized using Lanczos interpolation.
alpha_upsampler: Method to upscale the alpha channel for RGBA images. Options:
- 'realesrgan': Use the same model
- 'bicubic': Use OpenCV’s bicubic interpolation
Returns
output: Upsampled image as a numpy array with the same bit depth as the input (uint8 or uint16).
img_mode: Image mode: 'L' (grayscale), 'RGB', or 'RGBA'.
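The relationship between array shape, bit depth, and the returned mode can be sketched in plain Python. This is an illustration under assumptions, not the library's code: the helper names are hypothetical, the mode is inferred from the shape alone, and the value range is looked up from the dtype name.

```python
# Largest pixel value for each supported bit depth.
MAX_RANGE = {"uint8": 255, "uint16": 65535}


def infer_img_mode(shape):
    """Map an image array shape to 'L', 'RGB', or 'RGBA' (hypothetical helper)."""
    if len(shape) == 2:
        return "L"
    if len(shape) == 3 and shape[2] == 3:
        return "RGB"
    if len(shape) == 3 and shape[2] == 4:
        return "RGBA"
    raise ValueError(f"unsupported image shape: {shape}")


def to_unit_range(pixels, dtype_name):
    """Scale integer pixel values into [0, 1] before inference."""
    max_range = MAX_RANGE[dtype_name]
    return [p / max_range for p in pixels]


def from_unit_range(values, dtype_name):
    """Clamp to [0, 1] and scale back to the original bit depth."""
    max_range = MAX_RANGE[dtype_name]
    return [round(min(max(v, 0.0), 1.0) * max_range) for v in values]


print(infer_img_mode((480, 640, 4)))                # RGBA
print(from_unit_range([0.0, 1.0, 1.2], "uint16"))   # [0, 65535, 65535]
```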
pre_process()
Preprocesses the input image with padding to ensure divisibility.
Parameters
img: Input image as a numpy array (RGB format, normalized to [0, 1]).
This method converts the image to a PyTorch tensor, applies pre-padding and mod-padding, and stores the result in self.img.
post_process()
Removes padding added during preprocessing and returns the final output.
Returns
Processed output tensor with padding removed.
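The padding arithmetic behind pre_process() and post_process() can be sketched as follows. This is a minimal illustration that assumes padding is added on the bottom and right edges and that the divisor (`mod_scale` here) is given; the helper names are hypothetical and only sizes are computed, no tensors are involved.

```python
def mod_pad(size, mod_scale):
    """Extra pixels needed to make size a multiple of mod_scale."""
    return (mod_scale - size % mod_scale) % mod_scale


def padded_size(height, width, pre_pad, mod_scale):
    """Input size after pre-padding, then mod-padding (bottom/right padding assumed)."""
    h, w = height + pre_pad, width + pre_pad
    pad_h, pad_w = mod_pad(h, mod_scale), mod_pad(w, mod_scale)
    return h + pad_h, w + pad_w, pad_h, pad_w


def cropped_output_size(padded_h, padded_w, pad_h, pad_w, pre_pad, scale):
    """Output size once the upscaled padding has been cut away again."""
    out_h = (padded_h - pad_h - pre_pad) * scale
    out_w = (padded_w - pad_w - pre_pad) * scale
    return out_h, out_w


h, w, pad_h, pad_w = padded_size(101, 50, pre_pad=10, mod_scale=4)
print((h, w, pad_h, pad_w))                                   # (112, 60, 1, 0)
print(cropped_output_size(h, w, pad_h, pad_w, 10, scale=4))   # (404, 200)
```

The final size equals the original 101×50 input multiplied by the scale, which is the point of removing the padding after inference.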
Usage Examples
Basic Upsampling
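A minimal sketch of basic usage, assuming the `realesrgan`, `basicsr`, and `opencv-python` packages are installed and that a 4x RRDBNet checkpoint is available. The function name `upscale_file` and all file paths are placeholders chosen for illustration.

```python
import os


def upscale_file(input_path, output_path, model_path, outscale=4):
    """Upscale one image file with a 4x RRDBNet model (hypothetical wrapper)."""
    if not os.path.isfile(input_path):
        raise FileNotFoundError(input_path)

    # Heavy dependencies are imported lazily, only when an image is processed.
    import cv2
    from basicsr.archs.rrdbnet_arch import RRDBNet
    from realesrgan import RealESRGANer

    # The architecture must match the checkpoint; these are the usual 4x RRDBNet settings.
    model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64,
                    num_block=23, num_grow_ch=32, scale=4)
    upsampler = RealESRGANer(
        scale=4,
        model_path=model_path,
        model=model,
        tile=0,        # no tiling for small images
        half=False,    # set to True on a CUDA GPU for FP16 inference
    )

    img = cv2.imread(input_path, cv2.IMREAD_UNCHANGED)  # BGR (or BGRA) array
    output, img_mode = upsampler.enhance(img, outscale=outscale)
    cv2.imwrite(output_path, output)
    return img_mode
```

A call such as `upscale_file("input.jpg", "output.png", "weights/RealESRGAN_x4plus.pth")` would then write the upscaled image and return its mode; the file names here are examples only.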
Tiled Inference for Large Images
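For large images, `tile` and `tile_pad` are set on the constructor. The sketch below first shows, in plain Python, how an image could be split into padded tiles (an illustration of the idea, not the library's implementation), then passes the same settings to RealESRGANer. Function names, the tile size of 400, and all paths are assumptions for this example.

```python
import math
import os


def tile_boxes(height, width, tile, tile_pad):
    """List (core, padded) boxes as (y0, y1, x0, x1); padding is clipped at image borders."""
    if tile <= 0:  # tiling disabled: one box covering the whole image
        full = (0, height, 0, width)
        return [(full, full)]
    boxes = []
    for ty in range(math.ceil(height / tile)):
        for tx in range(math.ceil(width / tile)):
            y0, x0 = ty * tile, tx * tile
            y1, x1 = min(y0 + tile, height), min(x0 + tile, width)
            padded = (max(y0 - tile_pad, 0), min(y1 + tile_pad, height),
                      max(x0 - tile_pad, 0), min(x1 + tile_pad, width))
            boxes.append(((y0, y1, x0, x1), padded))
    return boxes


def upscale_large_image(input_path, output_path, model_path, tile=400, tile_pad=10):
    """Upscale a large image tile by tile (hypothetical wrapper)."""
    if not os.path.isfile(input_path):
        raise FileNotFoundError(input_path)
    import cv2
    from basicsr.archs.rrdbnet_arch import RRDBNet
    from realesrgan import RealESRGANer

    model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64,
                    num_block=23, num_grow_ch=32, scale=4)
    upsampler = RealESRGANer(scale=4, model_path=model_path, model=model,
                             tile=tile, tile_pad=tile_pad, half=False)
    img = cv2.imread(input_path, cv2.IMREAD_UNCHANGED)
    output, _ = upsampler.enhance(img, outscale=4)
    cv2.imwrite(output_path, output)


print(len(tile_boxes(1080, 1920, tile=400, tile_pad=10)))  # 15
```

A smaller tile size lowers peak memory use at the cost of more forward passes; the padding gives each tile some surrounding context so that seams are less visible.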
Deep Network Interpolation (DNI)
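A sketch of DNI usage, assuming two checkpoints trained for the same architecture and an already constructed `model` instance. `blend_weights` and `build_dni_upsampler` are hypothetical helpers; only the `model_path` list and `dni_weight` arguments come from the parameters described above.

```python
def blend_weights(alpha):
    """Return [alpha, 1 - alpha] so the two weights sum to 1.0."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha, 1.0 - alpha]


def build_dni_upsampler(model, path_a, path_b, alpha=0.5, scale=4):
    """Create an upsampler that interpolates between two checkpoints (hypothetical wrapper)."""
    # Imported lazily so the helper above works without the library installed.
    from realesrgan import RealESRGANer

    return RealESRGANer(
        scale=scale,
        model_path=[path_a, path_b],       # two checkpoints for the same architecture
        dni_weight=blend_weights(alpha),   # weight for path_a, then path_b
        model=model,
    )


print(blend_weights(0.25))  # [0.25, 0.75]
```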
RGBA Image with Alpha Channel
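A sketch for images with transparency, assuming an existing `upsampler` instance and OpenCV for file I/O. The helper names are hypothetical, and saving RGBA results as PNG is a choice made here so that the alpha channel survives, not a requirement stated by the class.

```python
import os

ALPHA_UPSAMPLERS = ("realesrgan", "bicubic")


def pick_output_path(path, img_mode):
    """Force a .png extension for RGBA results so the alpha channel is kept."""
    if img_mode == "RGBA":
        return os.path.splitext(path)[0] + ".png"
    return path


def upscale_rgba(upsampler, input_path, output_path, alpha_upsampler="realesrgan"):
    """Upscale an image that may carry an alpha channel (hypothetical wrapper)."""
    if alpha_upsampler not in ALPHA_UPSAMPLERS:
        raise ValueError(f"alpha_upsampler must be one of {ALPHA_UPSAMPLERS}")
    if not os.path.isfile(input_path):
        raise FileNotFoundError(input_path)
    import cv2

    # IMREAD_UNCHANGED keeps the fourth (alpha) channel instead of dropping it.
    img = cv2.imread(input_path, cv2.IMREAD_UNCHANGED)
    output, img_mode = upsampler.enhance(img, outscale=4,
                                         alpha_upsampler=alpha_upsampler)
    final_path = pick_output_path(output_path, img_mode)
    cv2.imwrite(final_path, output)
    return final_path


print(pick_output_path("out/logo.jpg", "RGBA"))  # out/logo.png
```

Passing `alpha_upsampler="bicubic"` skips the extra model pass on the alpha channel, which is faster but may give softer edges in the transparency mask.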
Source Reference
Implemented in realesrgan/utils.py:14