
Overview

RealESRGANer is the primary helper class for upsampling images with Real-ESRGAN. It handles model loading, preprocessing, tiled inference for large images, and post-processing.

Class Definition

RealESRGANer

from realesrgan import RealESRGANer

upsampler = RealESRGANer(
    scale=4,
    model_path='weights/RealESRGAN_x4plus.pth',
    model=model,
    tile=0,
    tile_pad=10,
    pre_pad=10,
    half=False,
    device=None,
    gpu_id=None,
    dni_weight=None
)

Parameters

scale
int
required
Upsampling scale factor of the network. Typically 2 or 4.
model_path
str | list[str]
required
Path to the pretrained model. Can be:
  • Local file path
  • URL (will be downloaded automatically)
  • List of two paths for Deep Network Interpolation (DNI)
model
nn.Module
required
The defined neural network model (e.g., RRDBNet or SRVGGNetCompact).
tile
int
default:"0"
Tile size for processing large images in chunks to avoid GPU memory issues. Set to 0 to disable tiling.
tile_pad
int
default:"10"
Padding size for each tile to remove border artifacts during tiled inference.
pre_pad
int
default:"10"
Padding size added to input images before processing to avoid border artifacts.
half
bool
default:"False"
Whether to use half precision (FP16) during inference for faster processing and lower memory usage.
device
torch.device
default:"None"
The torch device to use. If None, automatically selects CUDA if available, otherwise CPU.
gpu_id
int
default:"None"
Specific GPU device ID to use (e.g., 0, 1, 2 for multi-GPU systems).
dni_weight
list[float]
default:"None"
Weights for Deep Network Interpolation when using two models. Must sum to 1.0 (e.g., [0.5, 0.5]).
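
Conceptually, DNI builds the blended network by taking a per-parameter weighted average of the two models' state dicts. A minimal sketch with plain numpy arrays standing in for parameter tensors (all names here are hypothetical, not the library's code):

```python
import numpy as np

# Hypothetical stand-ins for two state dicts with matching keys/shapes.
net_a = {'conv.weight': np.full((2, 2), 1.0), 'conv.bias': np.zeros(2)}
net_b = {'conv.weight': np.full((2, 2), 3.0), 'conv.bias': np.ones(2)}

def dni(state_a, state_b, weights):
    """Deep Network Interpolation: per-parameter weighted average."""
    w0, w1 = weights
    return {k: w0 * state_a[k] + w1 * state_b[k] for k in state_a}

blended = dni(net_a, net_b, [0.5, 0.5])
print(blended['conv.weight'])  # every entry is 0.5*1.0 + 0.5*3.0 = 2.0
```

Shifting the weights (e.g., [0.8, 0.2]) moves the blended network toward the first model, which is how a continuous denoise-strength control is obtained from just two checkpoints.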

Methods

enhance()

Main method to upscale an input image.
output, img_mode = upsampler.enhance(
    img,
    outscale=None,
    alpha_upsampler='realesrgan'
)

Parameters

img
numpy.ndarray
required
Input image as a numpy array (BGR channel order, as returned by OpenCV). Supports:
  • 8-bit images (0-255)
  • 16-bit images (0-65535)
  • Grayscale (2D array)
  • 3-channel color (H×W×3)
  • 4-channel color with alpha (H×W×4)
outscale
float
default:"None"
Final output scale. If different from the model’s scale, the output will be resized using Lanczos interpolation.
alpha_upsampler
str
default:"'realesrgan'"
Method to upscale the alpha channel for RGBA images. Options:
  • 'realesrgan': Use the same model
  • 'bicubic': Use OpenCV’s bicubic interpolation

Returns

output
numpy.ndarray
Upsampled image as a numpy array with the same bit depth as input (uint8 or uint16).
img_mode
str
Image mode: ‘L’ (grayscale), ‘RGB’, or ‘RGBA’.
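
Because the output dtype mirrors the input's bit depth, downstream code can branch on it before saving or normalizing. A small illustration with hypothetical arrays standing in for enhance() results:

```python
import numpy as np

# Hypothetical outputs standing in for enhance() results.
out8 = np.zeros((4, 4, 3), dtype=np.uint8)    # from an 8-bit input
out16 = np.zeros((4, 4, 3), dtype=np.uint16)  # from a 16-bit input

def max_value(img):
    """Peak representable value for the image's bit depth."""
    return 65535 if img.dtype == np.uint16 else 255

print(max_value(out8), max_value(out16))  # 255 65535
```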

pre_process()

Preprocesses the input image, padding it so its spatial dimensions are divisible by the required modulus.
upsampler.pre_process(img)
img
numpy.ndarray
required
Input image as numpy array (RGB format, normalized to [0, 1]).
This method converts the image to a PyTorch tensor, applies pre-padding and mod-padding, and stores the result in self.img.
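
The mod-padding step can be illustrated with numpy: pad height and width up to the next multiple of the model's modulus using reflection. This is a sketch of the idea, not the library's exact code:

```python
import numpy as np

def mod_pad(img, mod):
    """Reflect-pad H and W up to the next multiple of `mod`."""
    h, w = img.shape[:2]
    pad_h = (mod - h % mod) % mod
    pad_w = (mod - w % mod) % mod
    return np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode='reflect')

img = np.zeros((30, 45, 3), dtype=np.float32)
padded = mod_pad(img, 4)
print(padded.shape)  # (32, 48, 3)
```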

post_process()

Removes padding added during preprocessing and returns the final output.
output_tensor = upsampler.post_process()

Returns

output
torch.Tensor
Processed output tensor with padding removed.
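
The crop can be sketched in numpy: padding amounts are multiplied by the upsampling factor before being removed, since the network output is at the upscaled resolution. Shapes below are illustrative, not the library's exact code:

```python
import numpy as np

scale, pre_pad = 4, 10
h, w = 64, 64  # original (unpadded) input size

# Model output for an input padded by `pre_pad` on the bottom/right edges
# (illustrative shape only).
out = np.zeros((3, (h + pre_pad) * scale, (w + pre_pad) * scale))

# Crop the padding off, scaled by the model's upsampling factor.
cropped = out[:, : h * scale, : w * scale]
print(cropped.shape)  # (3, 256, 256)
```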

Usage Examples

Basic Upsampling

import cv2
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

# Define the model
model = RRDBNet(
    num_in_ch=3,
    num_out_ch=3,
    num_feat=64,
    num_block=23,
    num_grow_ch=32,
    scale=4
)

# Initialize upsampler
upsampler = RealESRGANer(
    scale=4,
    model_path='weights/RealESRGAN_x4plus.pth',
    model=model,
    tile=0,
    tile_pad=10,
    pre_pad=0,
    half=False
)

# Read and upscale image
img = cv2.imread('input.jpg', cv2.IMREAD_UNCHANGED)
output, _ = upsampler.enhance(img, outscale=4)

# Save result
cv2.imwrite('output.png', output)

Tiled Inference for Large Images

# For large images, use tiling to avoid GPU memory issues
upsampler = RealESRGANer(
    scale=4,
    model_path='weights/RealESRGAN_x4plus.pth',
    model=model,
    tile=512,        # Process in 512×512 tiles
    tile_pad=10,
    pre_pad=0,
    half=True        # Use FP16 for faster inference
)

output, _ = upsampler.enhance(img, outscale=4)
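
Under the hood, tiled inference walks a grid of tiles, feeds each tile to the model with tile_pad extra context around it, and keeps only the valid interior, which suppresses seams between tiles. A sketch of the tile bookkeeping (pure Python, not the library's exact code):

```python
import math

def tile_boxes(h, w, tile, tile_pad):
    """Return (padded input crop, valid region) bounds for each tile.

    The padded crop is what the model sees; the valid region is what
    gets written into the output. Bounds are clamped at image borders.
    """
    boxes = []
    for ty in range(math.ceil(h / tile)):
        for tx in range(math.ceil(w / tile)):
            y0, x0 = ty * tile, tx * tile
            y1, x1 = min(y0 + tile, h), min(x0 + tile, w)
            py0, px0 = max(y0 - tile_pad, 0), max(x0 - tile_pad, 0)
            py1, px1 = min(y1 + tile_pad, h), min(x1 + tile_pad, w)
            boxes.append(((py0, py1, px0, px1), (y0, y1, x0, x1)))
    return boxes

boxes = tile_boxes(1080, 1920, 512, 10)
print(len(boxes))  # ceil(1080/512) * ceil(1920/512) = 3 * 4 = 12
```

Larger tiles mean fewer model invocations but more GPU memory per call; tile=512 with half=True is a common trade-off for consumer GPUs.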

Deep Network Interpolation (DNI)

# Blend two models for adjustable denoise strength
from realesrgan.archs.srvgg_arch import SRVGGNetCompact

# realesr-general-x4v3 uses the compact SRVGG architecture, not RRDBNet
model = SRVGGNetCompact(
    num_in_ch=3,
    num_out_ch=3,
    num_feat=64,
    num_conv=32,
    upscale=4,
    act_type='prelu'
)

upsampler = RealESRGANer(
    scale=4,
    model_path=[
        'weights/realesr-general-x4v3.pth',
        'weights/realesr-general-wdn-x4v3.pth'
    ],
    dni_weight=[0.5, 0.5],  # 50% each model
    model=model,
    tile=0,
    tile_pad=10,
    pre_pad=0,
    half=False
)

output, _ = upsampler.enhance(img)

RGBA Image with Alpha Channel

# Read RGBA image with transparency
img = cv2.imread('input.png', cv2.IMREAD_UNCHANGED)

# Upscale with Real-ESRGAN for alpha channel
output, img_mode = upsampler.enhance(
    img,
    outscale=4,
    alpha_upsampler='realesrgan'
)

print(f"Image mode: {img_mode}")  # 'RGBA'
cv2.imwrite('output.png', output)

Source Reference

Implemented in realesrgan/utils.py:14
