Real-ESRGAN provides specialized compact models for video upscaling. These models are designed to be lightweight and fast while maintaining good quality across video frames.

Available Video Models

- realesr-animevideov3: extra-small (XS) model optimized for anime videos
- realesr-general-x4v3: compact model for general video content

realesr-animevideov3

This model is optimized specifically for anime videos; its extra-small (XS) size keeps processing efficient across long frame sequences.

Model Specifications

| Property | Value |
|----------|-------|
| Scale | 1x, 2x, 3x, or 4x (variable) |
| Architecture | SRVGGNetCompact |
| Convolution Layers | 16 |
| Features | 64 |
| Size | XS (extra small) |
| Activation | PReLU |
| Download | realesr-animevideov3.pth |

Key Features

Temporal Consistency

Designed to maintain consistency across video frames

Lightweight

XS size enables fast processing of video sequences

Variable Scale

Supports 1x, 2x, 3x, and 4x upscaling

Low Memory

Consumes minimal GPU memory for longer videos

realesr-general-x4v3

Model Specifications

| Property | Value |
|----------|-------|
| Scale | 1x, 2x, 3x, or 4x (variable) |
| Architecture | SRVGGNetCompact |
| Convolution Layers | 32 |
| Features | 64 |
| Size | S (small) |
| Activation | PReLU |
| Download | realesr-general-x4v3.pth |

This model can be used for both general images and videos. It’s slightly larger than the anime video model but still very efficient.

Usage

PyTorch Inference

# Download the anime video model
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesr-animevideov3.pth -P weights

# Single GPU, single process
CUDA_VISIBLE_DEVICES=0 python inference_realesrgan_video.py \
  -i inputs/video/onepiece_demo.mp4 \
  -n realesr-animevideov3 \
  -s 2 \
  --suffix outx2

# Single GPU, multiple processes (better GPU utilization)
CUDA_VISIBLE_DEVICES=0 python inference_realesrgan_video.py \
  -i inputs/video/onepiece_demo.mp4 \
  -n realesr-animevideov3 \
  -s 2 \
  --suffix outx2 \
  --num_process_per_gpu 2

# Multi-GPU processing
CUDA_VISIBLE_DEVICES=0,1,2,3 python inference_realesrgan_video.py \
  -i inputs/video/onepiece_demo.mp4 \
  -n realesr-animevideov3 \
  -s 2 \
  --suffix outx2 \
  --num_process_per_gpu 2

Command Line Options

-i, --input (string)
Input video file path.

-n, --model_name (string)
Model name: realesr-animevideov3 or realesr-general-x4v3.

-s, --outscale (float)
Output scale: 1, 2, 3, or 4. Both video models support variable scaling.

--num_process_per_gpu (int)
Number of processes per GPU. Total processes = num_gpu × num_process_per_gpu. Helps with GPU utilization since video processing is often IO-bound.

--extract_frame_first (boolean)
Extract all frames before processing. Enable this if you encounter ffmpeg errors with multi-processing.
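As a quick sanity check of the process math, the multi-GPU invocation above launches num_gpu × num_process_per_gpu workers in total (the variable names here are illustrative, not script options):

```shell
# Worker count for CUDA_VISIBLE_DEVICES=0,1,2,3 with --num_process_per_gpu 2
num_gpu=4
num_process_per_gpu=2
total=$((num_gpu * num_process_per_gpu))
echo "$total workers share the frame queue"   # prints "8 workers share the frame queue"
```

Each worker holds its own copy of the model in GPU memory, so watch memory usage as you raise the per-GPU count.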

NCNN Executable (Manual Workflow)

For systems without Python or CUDA, use the NCNN portable executable with a manual frame extraction workflow.
1. Extract Frames from Video

Use ffmpeg to extract frames:
# Create output directory
mkdir tmp_frames

# Extract frames with high quality
ffmpeg -i onepiece_demo.mp4 -qscale:v 1 -qmin 1 -qmax 1 -vsync 0 tmp_frames/frame%08d.png
2. Download NCNN Executable

Download the portable executable for your platform from the Real-ESRGAN GitHub releases page.
3. Process Frames

# Create output directory
mkdir out_frames

# Process all frames
./realesrgan-ncnn-vulkan.exe -i tmp_frames -o out_frames -n realesr-animevideov3 -s 2 -f jpg
4. Get Original FPS

Check the original video’s FPS:
ffmpeg -i onepiece_demo.mp4
Look for the fps value in the output (e.g., “23.98 fps”).
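If you'd rather grab the number programmatically than read it off the console, a small pipeline can pull it out of ffmpeg's stream line. ffmpeg writes stream info to stderr, so redirect it; the sample line below stands in for real output:

```shell
# In practice: ffmpeg -i onepiece_demo.mp4 2>&1 | grep -oE '[0-9]+(\.[0-9]+)? fps'
line="  Stream #0:0: Video: h264, yuv420p(progressive), 1280x720, 23.98 fps, 23.98 tbr"
fps=$(echo "$line" | grep -oE '[0-9]+(\.[0-9]+)? fps' | awk '{print $1}')
echo "$fps"   # -> 23.98
```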
5. Merge Frames Back to Video

Merge enhanced frames into video:
# Video only
ffmpeg -r 23.98 -i out_frames/frame%08d.jpg \
  -c:v libx264 -r 23.98 -pix_fmt yuv420p output.mp4

# Video with audio from original
ffmpeg -r 23.98 -i out_frames/frame%08d.jpg -i onepiece_demo.mp4 \
  -map 0:v:0 -map 1:a:0 \
  -c:a copy -c:v libx264 \
  -r 23.98 -pix_fmt yuv420p \
  output_w_audio.mp4
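The five manual steps can be combined into one script. This is a sketch, not part of the repository: it assumes ffmpeg, ffprobe, and a realesrgan-ncnn-vulkan binary are on PATH, and the output naming is illustrative.

```shell
#!/bin/bash
# Hypothetical wrapper for the extract -> upscale -> merge workflow above.
upscale_video() {
  local input="$1" scale="${2:-2}"
  if [ -z "$input" ]; then
    echo "usage: upscale_video <input.mp4> [scale]" >&2
    return 1
  fi
  mkdir -p tmp_frames out_frames
  # 1. Extract frames at highest quality
  ffmpeg -i "$input" -qscale:v 1 -qmin 1 -qmax 1 -vsync 0 tmp_frames/frame%08d.png
  # 2-3. Upscale every frame with the NCNN executable
  realesrgan-ncnn-vulkan -i tmp_frames -o out_frames -n realesr-animevideov3 -s "$scale" -f jpg
  # 4. Read the source frame rate (r_frame_rate is a fraction like 24000/1001)
  local fps
  fps=$(ffprobe -v error -select_streams v:0 -show_entries stream=r_frame_rate \
          -of default=noprint_wrappers=1:nokey=1 "$input" | awk -F/ '{printf "%.2f", $1 / $2}')
  # 5. Merge frames back at the original rate, copying the original audio track
  ffmpeg -r "$fps" -i out_frames/frame%08d.jpg -i "$input" \
    -map 0:v:0 -map 1:a:0 -c:a copy -c:v libx264 \
    -r "$fps" -pix_fmt yuv420p "${input%.*}_x${scale}.mp4"
}

# usage: upscale_video onepiece_demo.mp4 2
```

Querying ffprobe for r_frame_rate avoids transcribing the fps by hand and keeps the merge step in sync with the source.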

FFmpeg Options Explained

-qscale:v 1    # Quality scale (1 = highest quality)
-qmin 1        # Minimum quality
-qmax 1        # Maximum quality
-vsync 0       # Disable frame sync (extract all frames)
-r 23.98       # Frame rate (match original)
-c:v libx264   # Video codec (H.264)
-pix_fmt yuv420p  # Pixel format (widely compatible)
-map 0:v:0     # Map first video stream
-map 1:a:0     # Map first audio stream from second input
-c:a copy      # Copy audio without re-encoding

Model Comparison

Anime Video vs General Video

| Feature | realesr-animevideov3 | realesr-general-x4v3 |
|---------|----------------------|----------------------|
| Size | XS | S |
| Conv Layers | 16 | 32 |
| Speed | Fastest | Fast |
| Best For | Anime videos | General videos |
| Memory | Minimal | Low |
| Quality | Good for anime | Good for general content |

Video Models vs Image Models

| Aspect | Video Models | Image Models |
|--------|--------------|--------------|
| Architecture | SRVGGNetCompact | RRDBNet |
| Size | XS/S | Large |
| Speed | Very fast | Slower |
| Memory | Low | Higher |
| Quality | Good | Best |
| Use Case | Video sequences | Single images |
Video models trade some quality for speed and efficiency, making them practical for processing thousands of video frames.

Performance Optimization

Multi-Processing for Better GPU Utilization

Video processing is often IO-bound (reading/writing frames), leaving GPUs underutilized. Multi-processing helps maximize GPU usage:
# Single process (GPU may be idle during IO)
python inference_realesrgan_video.py -i video.mp4 -n realesr-animevideov3 -s 2

# 2 processes per GPU (better utilization)
python inference_realesrgan_video.py -i video.mp4 -n realesr-animevideov3 -s 2 --num_process_per_gpu 2
Monitor GPU memory and adjust num_process_per_gpu accordingly.

Handling FFmpeg Errors

If you encounter ffmpeg errors with multi-processing:
python inference_realesrgan_video.py \
  -i video.mp4 \
  -n realesr-animevideov3 \
  -s 2 \
  --extract_frame_first
This extracts all frames before processing, avoiding concurrent ffmpeg access issues.

Best Practices

Choose Right Scale

Start with 2x for faster processing, use 4x only if needed:
-s 2  # 2x upscaling (recommended)
-s 4  # 4x upscaling (slower)

Optimize GPU Usage

Use multi-processing for better GPU utilization:
--num_process_per_gpu 2

Preserve Audio

Always include audio when merging frames back:
-map 0:v:0 -map 1:a:0 -c:a copy

Match FPS

Use original video’s FPS for smooth playback:
-r 23.98  # Match original

Example Workflows

Quick 2x Upscaling (Anime)

# Download model
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesr-animevideov3.pth -P weights

# Process video
CUDA_VISIBLE_DEVICES=0 python inference_realesrgan_video.py \
  -i input.mp4 \
  -n realesr-animevideov3 \
  -s 2 \
  --suffix _2x \
  --num_process_per_gpu 2

High Quality 4x Upscaling (General)

# Download model
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesr-general-x4v3.pth -P weights

# Process video
CUDA_VISIBLE_DEVICES=0 python inference_realesrgan_video.py \
  -i input.mp4 \
  -n realesr-general-x4v3 \
  -s 4 \
  --suffix _4x

Batch Processing Multiple Videos

# Process all MP4 files in a directory
for video in input_videos/*.mp4; do
  python inference_realesrgan_video.py \
    -i "$video" \
    -n realesr-animevideov3 \
    -s 2 \
    --suffix _enhanced
done
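For long batches it helps to make the loop resumable. The sketch below skips videos whose enhanced output already exists, so an interrupted run can be restarted; the results/ layout and the _enhanced naming are assumptions, not repository conventions.

```shell
# Hypothetical resume check: process a video only if its output is missing.
needs_processing() {
  local video="$1" out_dir="$2"
  local base
  base=$(basename "${video%.*}")
  [ ! -f "$out_dir/${base}_enhanced.mp4" ]
}

if [ -d input_videos ]; then
  for video in input_videos/*.mp4; do
    needs_processing "$video" results || continue
    python inference_realesrgan_video.py \
      -i "$video" -n realesr-animevideov3 -s 2 --suffix _enhanced
  done
fi
```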

Next Steps

Anime Models

Learn about anime image upscaling

General Models

Explore general image models
