Frame integrates Real-ESRGAN for AI-powered video upscaling, allowing you to increase video resolution while preserving and enhancing quality using machine learning models.
Overview
AI upscaling uses the Real-ESRGAN neural network to intelligently upscale video frames. Unlike traditional scaling algorithms (bicubic, lanczos), AI upscaling:
- Reconstructs details rather than interpolating pixels
- Reduces compression artifacts
- Enhances edges and textures
- Produces sharper results, especially for anime and animation
Frame uses the realesr-animevideov3 models, which are optimized for video and animation content. These models run locally on your GPU using Vulkan (NVIDIA, AMD) or Metal (Apple Silicon).
Upscale Modes
Frame supports two upscaling modes, plus a disabled default, selected via `mlUpscale`:

```typescript
config.mlUpscale = 'none';      // Disabled (default)
config.mlUpscale = 'esrgan-2x'; // 2x upscale
config.mlUpscale = 'esrgan-4x'; // 4x upscale
```
2x Upscale (esrgan-2x)

- Doubles resolution in both dimensions (4x pixel count)
- Example: 1920x1080 → 3840x2160 (1080p → 4K)
- Model: realesr-animevideov3-x2
- Faster processing, lower VRAM usage

4x Upscale (esrgan-4x)

- Quadruples resolution in both dimensions (16x pixel count)
- Example: 960x540 → 3840x2160 (540p → 4K)
- Model: realesr-animevideov3-x4
- Slower processing, higher VRAM usage
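The resolution arithmetic above is easy to verify: each dimension is multiplied by the scale factor, so pixel count grows by the square of the scale. A minimal sketch (an illustrative helper, not part of Frame's codebase):

```rust
/// Output geometry for a given upscale factor. Each dimension is multiplied
/// by `scale`, so the pixel count grows by `scale * scale`.
/// (Illustrative helper, not Frame's actual code.)
fn output_resolution(width: u32, height: u32, scale: u32) -> (u32, u32) {
    (width * scale, height * scale)
}

fn main() {
    // esrgan-2x: 1080p → 4K (4x the pixels)
    assert_eq!(output_resolution(1920, 1080, 2), (3840, 2160));
    // esrgan-4x: 540p → 4K (16x the pixels)
    assert_eq!(output_resolution(960, 540, 4), (3840, 2160));
}
```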
Mode resolution (upscale.rs:136-145):
```rust
pub(crate) fn resolve_upscale_mode(
    mode: &str,
) -> Result<(&'static str, &'static str), ConversionError> {
    match mode {
        "esrgan-2x" => Ok(("2", "realesr-animevideov3-x2")),
        "esrgan-4x" => Ok(("4", "realesr-animevideov3-x4")),
        _ => Err(ConversionError::InvalidInput(
            format!("Invalid upscale mode: {}", mode),
        )),
    }
}
```
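Standalone, the mapping behaves as follows. This is an illustrative re-declaration using `Option` since `ConversionError` is internal to Frame; the real function returns a `Result` instead:

```rust
/// Illustrative re-declaration of the mode table above; Frame's real
/// function returns Result<_, ConversionError> rather than Option.
fn resolve_upscale_mode(mode: &str) -> Option<(&'static str, &'static str)> {
    match mode {
        "esrgan-2x" => Some(("2", "realesr-animevideov3-x2")),
        "esrgan-4x" => Some(("4", "realesr-animevideov3-x4")),
        _ => None, // unknown values (including 'none') are rejected here
    }
}

fn main() {
    assert_eq!(
        resolve_upscale_mode("esrgan-2x"),
        Some(("2", "realesr-animevideov3-x2"))
    );
    assert_eq!(resolve_upscale_mode("none"), None);
}
```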
How AI Upscaling Works
The AI upscaling process involves three stages:
1. Frame Extraction (Decode)

FFmpeg extracts individual frames from the source video as PNG images:
Implementation (upscale.rs:356-373):
```rust
// Apply video filters (crop, rotate, flip) before extraction
let video_filters = build_video_filters(&task.config, false);
if !video_filters.is_empty() {
    dec_args.push("-vf".to_string());
    dec_args.push(video_filters.join(","));
}

// Force constant framerate to prevent drift
dec_args.push("-r".to_string());
dec_args.push(fps.to_string());
dec_args.push("-vsync".to_string());
dec_args.push("cfr".to_string());

dec_args.push(
    input_frames_dir.join("frame_%08d.png").to_string_lossy().to_string(),
);
```
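Because extraction is forced to a constant framerate, the number of PNG frames is predictable from duration and fps, and each frame gets a zero-padded name matching the `frame_%08d.png` pattern. A sketch (both helpers are illustrative, not Frame's):

```rust
/// Estimate the frame count produced by CFR extraction (illustrative).
fn expected_frames(duration_secs: f64, fps: f64) -> u64 {
    (duration_secs * fps).round() as u64
}

/// Filename matching the frame_%08d.png pattern used above.
fn frame_filename(index: u64) -> String {
    format!("frame_{:08}.png", index)
}

fn main() {
    // 10 seconds at a forced 24 fps yields 240 frames
    assert_eq!(expected_frames(10.0, 24.0), 240);
    assert_eq!(frame_filename(1), "frame_00000001.png");
    assert_eq!(frame_filename(240), "frame_00000240.png");
}
```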
2. AI Upscaling (Real-ESRGAN)
The realesrgan-ncnn-vulkan sidecar processes frames in batches:
Implementation (upscale.rs:456-480):
```rust
let upscaler_args = vec![
    "-v".to_string(),
    "-i".to_string(),
    sanitize_external_tool_path(&input_frames_dir),
    "-o".to_string(),
    sanitize_external_tool_path(&output_frames_dir),
    "-s".to_string(),
    scale.to_string(),            // "2" or "4"
    "-f".to_string(),
    "png".to_string(),
    "-m".to_string(),
    sanitize_external_tool_path(&models_path),
    "-n".to_string(),
    model_name.to_string(),       // "realesr-animevideov3-x2" or "-x4"
    "-j".to_string(),
    compute_upscale_threads(...), // Dynamic thread configuration
    "-g".to_string(),
    "0".to_string(),              // GPU ID
    "-t".to_string(),
    "0".to_string(),              // Tile size (auto)
];
```
3. Re-encoding (Encode)
FFmpeg combines upscaled frames with original audio and re-encodes:
Implementation (upscale.rs:22-133):
```rust
pub(crate) fn build_upscale_encode_args(
    output_frames_dir: &Path,
    source_file_path: &str,
    output_path: &str,
    source_fps: f64,
    config: &ConversionConfig,
    pixel_format: Option<String>,
) -> Vec<String> {
    let mut enc_args = vec![
        "-framerate".to_string(),
        source_fps.to_string(),
        "-start_number".to_string(),
        "1".to_string(),
        "-i".to_string(),
        output_frames_dir.join("frame_%08d.png").to_string_lossy().to_string(),
    ];

    // Second input: original file for audio/metadata
    enc_args.push("-i".to_string());
    enc_args.push(source_file_path.to_string());

    // Map upscaled video from first input
    enc_args.push("-map".to_string());
    enc_args.push("0:v:0".to_string());

    // Map audio from original file
    enc_args.push("-map".to_string());
    enc_args.push("1:a?".to_string());

    // Apply video codec settings
    add_video_codec_args(&mut enc_args, config);
    add_audio_codec_args(&mut enc_args, config);

    // ...
}
```
Performance Optimization
Dynamic Thread Configuration
Frame automatically configures concurrent processing threads based on output resolution to optimize GPU VRAM usage:
Implementation (upscale.rs:148-171):
```rust
pub(crate) fn compute_upscale_threads(source_width: u32, source_height: u32, scale: u32) -> String {
    let output_pixels =
        (source_width as u64 * scale as u64) * (source_height as u64 * scale as u64);

    // proc: concurrent GPU inference frames, limited by VRAM
    // > 4K output (~8.3M px): ~500MB+ per frame → single concurrent frame
    // > 1080p output (~2M px): moderate pressure → 2 concurrent frames
    // ≤ 1080p output: lightweight, pipeline benefits from concurrency → 4
    let proc = if output_pixels > 8_294_400 {
        1
    } else if output_pixels > 2_073_600 {
        2
    } else {
        4
    };

    // load/save: I/O threads, limited by CPU cores
    let cpus = std::thread::available_parallelism()
        .map(|n| n.get() as u32)
        .unwrap_or(4);
    let io = cpus.div_ceil(2).clamp(1, 4);

    format!("{}:{}:{}", io, proc, io)
}
```

Thread format: `load_threads:proc_threads:save_threads`
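With those thresholds, the behavior can be checked deterministically by passing the CPU count in as a parameter. This is a sketch for illustration; Frame's real function queries `available_parallelism` itself:

```rust
/// Deterministic variant of compute_upscale_threads for illustration:
/// same thresholds as above, but the CPU count is a parameter.
fn upscale_threads(width: u32, height: u32, scale: u32, cpus: u32) -> String {
    let output_pixels = (width as u64 * scale as u64) * (height as u64 * scale as u64);
    let proc = if output_pixels > 8_294_400 {
        1
    } else if output_pixels > 2_073_600 {
        2
    } else {
        4
    };
    let io = cpus.div_ceil(2).clamp(1, 4);
    format!("{}:{}:{}", io, proc, io)
}

fn main() {
    // 1080p source, 2x → 4K output (exactly 8,294,400 px): 2 concurrent frames
    assert_eq!(upscale_threads(1920, 1080, 2, 8), "4:2:4");
    // 1080p source, 4x → 8K output: single concurrent frame
    assert_eq!(upscale_threads(1920, 1080, 4, 8), "4:1:4");
    // 540p source, 2x → 1080p output: full concurrency
    assert_eq!(upscale_threads(960, 540, 2, 8), "4:4:4");
}
```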
Hardware Decode Acceleration
Enable hardware decoding during frame extraction for faster processing:
```typescript
config.hwDecode = true;
config.mlUpscale = 'esrgan-2x';
```
Implementation (upscale.rs:313-322):
```rust
if task.config.hw_decode {
    if crate::conversion::utils::is_nvenc_codec(&task.config.video_codec) {
        dec_args.push("-hwaccel".to_string());
        dec_args.push("cuda".to_string());
    } else if crate::conversion::utils::is_videotoolbox_codec(&task.config.video_codec) {
        dec_args.push("-hwaccel".to_string());
        dec_args.push("videotoolbox".to_string());
    }
}
```
Progress Tracking
AI upscaling provides detailed progress updates through three phases:
Progress distribution:
- 0-5%: Frame extraction (decode)
- 5-90%: AI upscaling
- 90-100%: Re-encoding
Implementation (upscale.rs:404-416, 531-542):
```rust
// Decode progress
let decode_progress = (current_frame as f64 / total_frames as f64) * 5.0;
app_clone.emit("conversion-progress", ProgressPayload {
    id: id_clone.clone(),
    progress: decode_progress.min(5.0),
});

// Upscale progress
let progress = 5.0 + (completed_frames as f64 / total_frames as f64) * 85.0;
app_clone.emit("conversion-progress", ProgressPayload {
    id: id_clone.clone(),
    progress: progress.min(90.0),
});

// Encode progress
let encode_progress = 90.0 + (current_frame as f64 / total_frames as f64) * 10.0;
```
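The three formulas meet cleanly at the 5% and 90% boundaries, which the following sketch verifies. It is a standalone re-statement of the arithmetic, not Frame's code:

```rust
// Standalone re-statement of the three phase formulas above.
fn decode_progress(frame: u64, total: u64) -> f64 {
    ((frame as f64 / total as f64) * 5.0).min(5.0)
}

fn upscale_progress(done: u64, total: u64) -> f64 {
    (5.0 + (done as f64 / total as f64) * 85.0).min(90.0)
}

fn encode_progress(frame: u64, total: u64) -> f64 {
    90.0 + (frame as f64 / total as f64) * 10.0
}

fn main() {
    // Each phase ends exactly where the next begins
    assert_eq!(decode_progress(100, 100), 5.0);
    assert_eq!(upscale_progress(0, 100), 5.0);
    assert_eq!(upscale_progress(100, 100), 90.0);
    assert_eq!(encode_progress(0, 100), 90.0);
    assert_eq!(encode_progress(100, 100), 100.0);
}
```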
System Requirements
GPU Requirements
Real-ESRGAN requires GPU acceleration via:
Windows/Linux:

- NVIDIA GPU with Vulkan support
- AMD GPU with Vulkan support
- Vulkan drivers installed

macOS:

- Apple Silicon (M1/M2/M3) with Metal support
- Intel Mac with Metal-capable GPU
VRAM Requirements
Minimum VRAM by output resolution:
| Output Resolution | 2x Mode | 4x Mode |
|-------------------|---------|---------|
| 1080p (2MP)       | 2 GB    | 4 GB    |
| 1440p (3.7MP)     | 3 GB    | 6 GB    |
| 4K (8.3MP)        | 4 GB    | 8 GB    |
| 8K (33MP)         | 8 GB    | 16 GB   |
Insufficient VRAM will cause the upscaling process to fail. Frame automatically reduces concurrent processing for higher resolutions to minimize VRAM usage.
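Read as a lookup, the table roughly doubles the requirement from 2x to 4x mode. A sketch of that reading, with thresholds taken from the table rather than from Frame's code:

```rust
/// Minimum VRAM in GB per the table above (illustrative lookup, not a
/// formula Frame implements). Thresholds are megapixels of output.
fn min_vram_gb(output_megapixels: f64, scale: u32) -> u32 {
    let base = if output_megapixels <= 2.1 {
        2 // 1080p
    } else if output_megapixels <= 3.7 {
        3 // 1440p
    } else if output_megapixels <= 8.3 {
        4 // 4K
    } else {
        8 // 8K
    };
    if scale == 4 { base * 2 } else { base }
}

fn main() {
    assert_eq!(min_vram_gb(2.0, 2), 2);   // 1080p output, 2x mode
    assert_eq!(min_vram_gb(8.3, 4), 8);   // 4K output, 4x mode
    assert_eq!(min_vram_gb(33.0, 4), 16); // 8K output, 4x mode
}
```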
Runtime Validation
Frame validates the upscaling runtime before starting conversion:
Implementation (upscale.rs:173-236):
```rust
pub(crate) async fn validate_upscale_runtime(
    app: &AppHandle,
    mode: &str,
) -> Result<(), ConversionError> {
    let (_, model_name) = resolve_upscale_mode(mode)?;

    // Check model files exist
    let models_path = app.path().resolve("resources/models", BaseDirectory::Resource)?;
    let model_param = models_path.join(format!("{}.param", model_name));
    let model_bin = models_path.join(format!("{}.bin", model_name));
    if !model_param.is_file() || !model_bin.is_file() {
        return Err(ConversionError::InvalidInput(
            format!(
                "ML upscaling models are missing for '{}'. Run `bun run setup:upscaler` and rebuild.",
                mode
            ),
        ));
    }

    // Test upscaler sidecar
    let output = app.shell().sidecar("realesrgan-ncnn-vulkan")?
        .args(["-h"])
        .output()
        .await?;
    if !output.status.success() {
        // Validate help output to confirm binary works
        // ...
    }

    Ok(())
}
```
Limitations and Constraints
Container Compatibility
```typescript
// AI upscaling requires a video container (not audio-only or GIF)
config.container = 'mp4';  // ✓ Supported
config.container = 'mkv';  // ✓ Supported
config.container = 'webm'; // ✓ Supported
config.container = 'mov';  // ✓ Supported
config.container = 'mp3';  // ✗ Not supported (audio-only)
config.container = 'gif';  // ✗ Not supported (video-only)
```
Validation (args.rs:566-570):
```rust
if (is_audio_only || is_video_only) && has_ml_upscale {
    return Err(ConversionError::InvalidInput(
        "ML upscaling requires an audio-capable video container".to_string(),
    ));
}
```
Processing Mode
AI upscaling requires re-encoding mode:
```typescript
config.processingMode = 'reencode'; // ✓ Required
config.processingMode = 'copy';     // ✗ Not compatible
```
Validation (args.rs:578-582):
```rust
if is_copy_mode && has_ml_upscale {
    return Err(ConversionError::InvalidInput(
        "ML upscaling requires re-encoding mode".to_string(),
    ));
}
```
Combining with Other Features
AI upscaling works seamlessly with other Frame features:
With Video Filters
```typescript
{
  mlUpscale: 'esrgan-2x',
  rotation: '90',       // Applied before upscaling
  flipHorizontal: true, // Applied before upscaling
  crop: { enabled: true, x: 0, y: 0, width: 1280, height: 720 }
}
```
Filters are applied during frame extraction (upscale.rs:356-360).
With Hardware Encoding
```typescript
{
  mlUpscale: 'esrgan-4x',
  videoCodec: 'h264_nvenc',
  hwDecode: true,
  nvencSpatialAq: true
}
```
Combines GPU upscaling with GPU encoding for maximum performance.
With Audio Processing
```typescript
{
  mlUpscale: 'esrgan-2x',
  audioNormalize: true,
  audioVolume: 110,
  selectedAudioTracks: [0]
}
```
Audio processing is independent of upscaling.
Example Configurations
Upscale to 4K
```typescript
{
  container: 'mp4',
  videoCodec: 'libx265',
  videoBitrateMode: 'crf',
  crf: 20,
  mlUpscale: 'esrgan-2x', // 1080p → 4K
  preset: 'medium',
  audioCodec: 'aac',
  audioBitrate: '192'
}
```
Fast Hardware-Accelerated Upscale
```typescript
{
  container: 'mp4',
  videoCodec: 'h264_nvenc',
  videoBitrateMode: 'crf',
  quality: 50,
  mlUpscale: 'esrgan-2x',
  hwDecode: true,
  nvencSpatialAq: true,
  preset: 'fast'
}
```
Archive Quality with 4x Upscale
```typescript
{
  container: 'mkv',
  videoCodec: 'libx265',
  videoBitrateMode: 'crf',
  crf: 18,
  mlUpscale: 'esrgan-4x', // 540p → 4K
  preset: 'slow',
  audioCodec: 'flac'
}
```
Troubleshooting
Missing model files

Run the setup script (`bun run setup:upscaler`, as referenced in the validation error above) to download the Real-ESRGAN models. Then rebuild the application to bundle the models.
Upscaler sidecar unavailable
Ensure the realesrgan-ncnn-vulkan binary is included in the build:
- Check that tauri.conf.json includes the sidecar
- Verify the binary has execute permissions on Linux/macOS
- On Windows, check that antivirus isn't blocking the executable
Out of memory / VRAM errors
The output resolution exceeds available VRAM. Try:
- Use 2x instead of 4x mode
- Upscale from a lower source resolution
- Close other GPU-intensive applications
- Upgrade GPU memory
Related Features
- Video Conversion: Configure resolution and scaling algorithms
- Batch Processing: Upscale multiple videos efficiently
- Presets: Save upscaling configurations