
Overview

Real-ESRGAN provides the realesr-animevideov3 model specifically optimized for anime video super-resolution. This lightweight model (XS size) is designed to process video frames efficiently while maintaining temporal consistency.

Quick Start

1. Download Model

wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesr-animevideov3.pth -P weights

2. Run Video Inference

python inference_realesrgan_video.py -i input_video.mp4 -n realesr-animevideov3 -s 2

3. Check Output

The enhanced video will be saved as input_video_out.mp4 in the results folder.

Model Specifications

realesr-animevideov3

| Property | Value |
| --- | --- |
| Architecture | SRVGGNetCompact |
| Size | XS (~8MB) |
| Conv Layers | 16 |
| Upscale Factor | 4x (supports 1x, 2x, 3x, 4x output) |
| Best For | Anime videos, animation |
| Speed | Fast (optimized for video) |
Download URL:
https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesr-animevideov3.pth

Video Inference Script

The inference_realesrgan_video.py script is specifically designed for video processing with additional features:

Basic Usage

python inference_realesrgan_video.py -i input.mp4 -n realesr-animevideov3 -s 2

Multi-GPU and Multi-Processing

For faster processing, use multiple GPUs and processes:
# Improve GPU utilization with multiple processes
CUDA_VISIBLE_DEVICES=0 python inference_realesrgan_video.py \
  -i input.mp4 -n realesr-animevideov3 -s 2 \
  --num_process_per_gpu 2
The total number of processes = number of GPUs × num_process_per_gpu. Multi-processing improves GPU utilization, because video processing is often bottlenecked by I/O operations.
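The process math above can be sketched in a few lines. This is a minimal illustration, not code from the repository; `total_processes` is a hypothetical helper:

```python
# Sketch: total worker count used by multi-process video inference.
# Rule from the docs: processes = number of GPUs x num_process_per_gpu.
def total_processes(num_gpus: int, num_process_per_gpu: int) -> int:
    return num_gpus * num_process_per_gpu

print(total_processes(1, 2))  # CUDA_VISIBLE_DEVICES=0, --num_process_per_gpu 2 -> 2
print(total_processes(2, 2))  # CUDA_VISIBLE_DEVICES=0,1, --num_process_per_gpu 2 -> 4
```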

Command-Line Arguments

Video-Specific Options

-i, --input
string
required
Input video file, image, or folder of frames
-n, --model_name
string
default:"realesr-animevideov3"
Model to use. Options:
  • realesr-animevideov3 (recommended for anime videos)
  • RealESRGAN_x4plus_anime_6B
  • RealESRGAN_x4plus
  • Other image models
-o, --output
string
default:"results"
Output folder for the enhanced video
-s, --outscale
float
default:"4"
Final upsampling scale (1, 2, 3, or 4 recommended)
--suffix
string
default:"out"
Suffix for output video filename
--fps
float
default:"None"
FPS of output video. If not specified, uses the input video’s FPS.
--ffmpeg_bin
string
default:"ffmpeg"
Path to ffmpeg binary (use if ffmpeg is not in PATH)

Performance Options

--num_process_per_gpu
integer
default:"1"
Number of processes per GPU. Increase this to improve GPU utilization; the program is often I/O bound, so a single process rarely keeps the GPU fully busy.
--extract_frame_first
flag
Extract all frames before processing. Use this if you encounter ffmpeg errors during multi-processing.
-t, --tile
integer
default:"0"
Tile size for processing. Use if you encounter CUDA out of memory errors.
--tile_pad
integer
default:"10"
Tile padding size
--pre_pad
integer
default:"0"
Pre-padding size at each border
--fp32
flag
Use FP32 precision instead of FP16

Additional Options

--face_enhance
flag
Enable GFPGAN face enhancement
Face enhancement is automatically disabled for anime models. It’s designed for realistic faces only.
-dn, --denoise_strength
float
default:"0.5"
Denoise strength (only for realesr-general-x4v3 model)

Advanced Workflows

Method 1: Direct Video Processing

Process the video directly with automatic frame handling:
python inference_realesrgan_video.py -i anime.mp4 -n realesr-animevideov3 -s 2 --suffix outx2
This automatically:
  1. Extracts frames using ffmpeg
  2. Processes frames with Real-ESRGAN
  3. Merges frames back into video with audio

Method 2: Extract-Process-Merge Workflow

Manual control over each step:
1

Extract Frames

mkdir tmp_frames
ffmpeg -i input.mp4 -qscale:v 1 -qmin 1 -qmax 1 -vsync 0 tmp_frames/frame%08d.png
This extracts frames with high quality (qscale:v 1) to avoid compression artifacts.
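If you script this step, the same extraction command can be assembled and run from Python. This is a sketch that only builds the command shown above; `build_extract_cmd` is a hypothetical helper, not part of Real-ESRGAN, and it assumes ffmpeg is on your PATH:

```python
# Sketch: build the high-quality frame-extraction command from step 1.
def build_extract_cmd(video: str, out_dir: str = "tmp_frames") -> list[str]:
    return [
        "ffmpeg", "-i", video,
        "-qscale:v", "1", "-qmin", "1", "-qmax", "1",  # highest quality, avoids artifacts
        "-vsync", "0",                                  # emit every frame, no duplication
        f"{out_dir}/frame%08d.png",
    ]

# Usage (commented out so the sketch has no side effects):
# import subprocess
# subprocess.run(build_extract_cmd("input.mp4"), check=True)
```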
2

Process Frames

Use the standard image inference script:
python inference_realesrgan.py -n realesr-animevideov3 -i tmp_frames -o out_frames
Or use the video script with folder input:
python inference_realesrgan_video.py -i tmp_frames -n realesr-animevideov3 -s 2
3

Get Original FPS

ffmpeg -i input.mp4
Look for the fps value in the output (e.g., “23.98 fps”).
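ffprobe can report the frame rate directly as a fraction (e.g. `24000/1001`), which avoids reading it off the log. The snippet below is a sketch under that assumption; `parse_rate` is a hypothetical helper and the ffprobe invocation is commented out:

```python
from fractions import Fraction

# Sketch: convert ffprobe's fractional r_frame_rate into a float FPS.
def parse_rate(r_frame_rate: str) -> float:
    return float(Fraction(r_frame_rate))  # "24000/1001" -> 23.976...

# import subprocess
# rate = subprocess.run(
#     ["ffprobe", "-v", "error", "-select_streams", "v:0",
#      "-show_entries", "stream=r_frame_rate",
#      "-of", "default=noprint_wrappers=1:nokey=1", "input.mp4"],
#     capture_output=True, text=True, check=True).stdout.strip()

print(round(parse_rate("24000/1001"), 2))  # 23.98
```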
4

Merge Frames Back

# Without audio
ffmpeg -r 23.98 -i out_frames/frame%08d.png -c:v libx264 -r 23.98 -pix_fmt yuv420p output.mp4

# With audio from original
ffmpeg -r 23.98 -i out_frames/frame%08d.png -i input.mp4 \
  -map 0:v:0 -map 1:a:0 -c:a copy -c:v libx264 \
  -r 23.98 -pix_fmt yuv420p output_with_audio.mp4
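The merge-with-audio command is easy to get subtly wrong when retyped, so scripting it can help. This sketch only constructs the argument list from step 4; `build_merge_cmd` is a hypothetical helper, not part of the repository:

```python
# Sketch: build the merge command (enhanced frames + original audio).
def build_merge_cmd(frames_dir: str, source: str, fps: float, out: str) -> list[str]:
    return [
        "ffmpeg",
        "-r", str(fps), "-i", f"{frames_dir}/frame%08d.png",  # frames as video stream
        "-i", source,                                          # original file, audio source
        "-map", "0:v:0", "-map", "1:a:0",                     # video from 0, audio from 1
        "-c:a", "copy", "-c:v", "libx264",
        "-r", str(fps), "-pix_fmt", "yuv420p",
        out,
    ]

# subprocess.run(build_merge_cmd("out_frames", "input.mp4", 23.98, "output_with_audio.mp4"), check=True)
```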

Method 3: Using Extract Frame First

Use this if you encounter ffmpeg errors with multi-processing:
python inference_realesrgan_video.py \
  -i input.mp4 -n realesr-animevideov3 -s 2 \
  --extract_frame_first --num_process_per_gpu 2
This extracts all frames first, then processes them in parallel.

NCNN Executable for Videos

For users who prefer the portable NCNN executable:
1

Download NCNN Executable

2

Extract Frames

mkdir tmp_frames
ffmpeg -i input.mp4 -qscale:v 1 -qmin 1 -qmax 1 -vsync 0 tmp_frames/frame%08d.png
3

Process with NCNN

mkdir out_frames
./realesrgan-ncnn-vulkan.exe -i tmp_frames -o out_frames -n realesr-animevideov3 -s 2 -f jpg
4

Merge Frames

# Get FPS from original video
ffmpeg -i input.mp4

# Merge with audio
ffmpeg -r 23.98 -i out_frames/frame%08d.jpg -i input.mp4 \
  -map 0:v:0 -map 1:a:0 -c:a copy -c:v libx264 \
  -r 23.98 -pix_fmt yuv420p output.mp4

Performance Optimization

GPU Utilization

Improve GPU Usage: Video processing is often I/O bound. Use multiple processes to keep GPUs busy:
# Single GPU - use 2-4 processes
CUDA_VISIBLE_DEVICES=0 python inference_realesrgan_video.py \
  -i video.mp4 -n realesr-animevideov3 -s 2 --num_process_per_gpu 3

# Multi-GPU - use 2 processes per GPU
CUDA_VISIBLE_DEVICES=0,1 python inference_realesrgan_video.py \
  -i video.mp4 -n realesr-animevideov3 -s 2 --num_process_per_gpu 2
Monitor GPU usage with nvidia-smi to find the optimal number of processes.

Memory Management

High Resolution Videos: For 4K or higher resolution videos, the output size can be extremely large:
  • 1080p → 4K (4x): Very slow I/O
  • 4K → 8K+ (4x): Extremely slow, consider using smaller scale
Recommendation:
# For 4K input, use 2x instead of 4x
python inference_realesrgan_video.py -i 4k_video.mp4 -n realesr-animevideov3 -s 2

Using Tiling

For high-resolution videos that cause CUDA out of memory errors:
python inference_realesrgan_video.py \
  -i high_res_video.mp4 -n realesr-animevideov3 -s 2 \
  --tile 400 --tile_pad 10
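A rough way to reason about tile size: the number of tiles per frame grows with resolution and shrinks with `--tile`. This is a simplified estimate (the real tiler also overlaps tiles by `--tile_pad`); `tile_count` is a hypothetical helper:

```python
import math

# Sketch: approximate tiles per frame for a given --tile size.
def tile_count(width: int, height: int, tile: int) -> int:
    return math.ceil(width / tile) * math.ceil(height / tile)

print(tile_count(1920, 1080, 400))  # 5 x 3 = 15 tiles per 1080p frame
print(tile_count(3840, 2160, 400))  # 10 x 6 = 60 tiles per 4K frame
```

Smaller tiles mean lower peak memory but more per-frame overhead, so pick the largest tile size that fits in VRAM.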

Tips for Best Results

Choose the Right Scale
  • 480p → 960p: Use -s 2
  • 480p → 1080p: Use -s 2.25 or -s 2.5
  • 720p → 1080p: Use -s 1.5
  • 1080p → 4K: Use -s 2 (avoid 4x for performance)
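The scale recommendations above all follow from one ratio: target height divided by source height. A minimal sketch of that arithmetic, with `pick_outscale` as a hypothetical helper:

```python
# Sketch: derive the -s (outscale) value from source and target heights.
def pick_outscale(src_height: int, dst_height: int) -> float:
    return round(dst_height / src_height, 2)

print(pick_outscale(480, 960))    # 2.0  -> -s 2
print(pick_outscale(480, 1080))   # 2.25 -> -s 2.25
print(pick_outscale(720, 1080))   # 1.5  -> -s 1.5
```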
Input Video Quality: Better source quality means better results:
  • Use the highest quality source available
  • Avoid re-encoded or heavily compressed videos
  • If available, use Blu-ray rips over streaming captures
Output Format: For best quality output:
# High quality output with custom ffmpeg settings
# (modify the script or use manual merge with custom ffmpeg parameters)
ffmpeg -r 23.98 -i out_frames/frame%08d.png -i input.mp4 \
  -map 0:v:0 -map 1:a:0 -c:a copy \
  -c:v libx264 -crf 18 -preset slow \
  -r 23.98 -pix_fmt yuv420p output.mp4
  • Lower CRF = higher quality (18 is very high quality)
  • Slower preset = better compression

Troubleshooting

ffmpeg errors during multi-processing

Use the --extract_frame_first option:
python inference_realesrgan_video.py \
  -i video.mp4 -n realesr-animevideov3 -s 2 \
  --extract_frame_first --num_process_per_gpu 2
CUDA out of memory errors

Solutions:
  1. Use tiling:
    python inference_realesrgan_video.py -i video.mp4 -n realesr-animevideov3 -s 2 --tile 400
    
  2. Reduce processes per GPU:
    python inference_realesrgan_video.py -i video.mp4 -n realesr-animevideov3 -s 2 --num_process_per_gpu 1
    
  3. Use a smaller scale:
    python inference_realesrgan_video.py -i video.mp4 -n realesr-animevideov3 -s 2
    
Processing is very slow

For large videos (>1080p output):
  1. Use multi-processing:
    CUDA_VISIBLE_DEVICES=0 python inference_realesrgan_video.py \
      -i video.mp4 -n realesr-animevideov3 -s 2 --num_process_per_gpu 3
    
  2. Consider using multiple GPUs:
    CUDA_VISIBLE_DEVICES=0,1 python inference_realesrgan_video.py \
      -i video.mp4 -n realesr-animevideov3 -s 2 --num_process_per_gpu 2
    
  3. Use smaller scale or lower resolution input
No audio in the output video

The script should automatically copy audio. If it doesn't:
  1. Manually merge with audio:
    ffmpeg -i enhanced_video.mp4 -i original.mp4 \
      -c:v copy -c:a copy -map 0:v:0 -map 1:a:0 output.mp4
    
  2. Check if original video has audio:
    ffmpeg -i original.mp4
    

Model Comparison for Videos

| Model | Size | Speed | Quality | Best For |
| --- | --- | --- | --- | --- |
| realesr-animevideov3 | 8MB | Fast | Excellent | Anime videos (recommended) |
| RealESRGAN_x4plus_anime_6B | 17MB | Medium | Excellent | High-quality anime frames |
| RealESRGAN_x4plus | 64MB | Slow | Good | General video content |
For anime videos, realesr-animevideov3 is recommended due to its small size and optimization for video content.

Next Steps

Anime Images

Learn about the anime image model

NCNN Executable

Use portable executable for video frame processing

General Images

Explore models for non-anime content
