Performance Overview
kimg is built for speed. Originally extracted from the Spriteform compositor (which was pure JS), the Rust+WASM version runs 5-15x faster than the original JavaScript implementation.
SIMD Support
The build process generates two WASM binaries to maximize performance across different runtime environments:
- kimg_wasm_bg.wasm - Baseline WASM target for maximum compatibility
- kimg_wasm_simd_bg.wasm - SIMD-enabled build with simd128 instructions for runtimes that support it
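As a rough illustration of how a host page could choose between the two binaries, the sketch below probes for simd128 support with WebAssembly.validate and maps the result to an artifact name. The helper names supportsWasmSimd and pickWasmArtifact are hypothetical, and kimg's own loader may already perform this selection, so treat this as an assumption-laden sketch rather than the library's API.

```typescript
// A tiny WASM module whose only function uses v128 SIMD instructions.
// WebAssembly.validate() accepts it only on runtimes that support simd128.
const SIMD_PROBE = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0, 1, 5, 1, 96, 0, 1, 123, 3, 2, 1, 0,
  10, 10, 1, 8, 0, 65, 0, 253, 15, 253, 98, 11,
]);

// Hypothetical helper: true when the current runtime validates SIMD code.
function supportsWasmSimd(): boolean {
  return typeof WebAssembly === "object" && WebAssembly.validate(SIMD_PROBE);
}

// Hypothetical helper: maps the probe result to one of the two build artifacts.
function pickWasmArtifact(simd: boolean): string {
  return simd ? "kimg_wasm_simd_bg.wasm" : "kimg_wasm_bg.wasm";
}

const artifact = pickWasmArtifact(supportsWasmSimd());
```

Keeping the probe separate from the name mapping makes the fallback path easy to exercise on a runtime that does support SIMD.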
fast_image_resize Integration
For RGBA bilinear and Lanczos3 resize operations, kimg uses the fast_image_resize crate, which provides:
- Host SIMD acceleration on native builds (SSE4.1, AVX2, NEON)
- WASM SIMD support in the browser when the simd128 artifact is loaded
- Optimized resize algorithms that outperform naive implementations
- High-quality image scaling
- Large image resizing (e.g., 2048×2048 → 4096×4096)
- Batch resize operations
Optimization Strategies
Transform Caching
kimg caches transformed layer renders to avoid redundant computation. When the same transformed layer is rendered multiple times without changes:
- First render performs the full transform calculation
- Subsequent renders use the cached result
- Cache is invalidated when layer properties change
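The caching behaviour described above can be sketched as a small memoizing wrapper. This is not kimg's internal implementation: the Layer fields, the TransformCache class, and the key format are all hypothetical, chosen only to show the hit, miss, and invalidation paths.

```typescript
// Hypothetical layer shape; field names are illustrative, not kimg's API.
interface Layer {
  id: number;
  rotation: number;
  scale: number;
}

class TransformCache {
  private entries = new Map<number, { key: string; pixels: Uint8ClampedArray }>();
  renders = 0; // number of full transform computations, kept for demonstration

  get(layer: Layer, render: (l: Layer) => Uint8ClampedArray): Uint8ClampedArray {
    // The key summarizes every property that affects the transformed output.
    const key = `${layer.rotation}:${layer.scale}`;
    const hit = this.entries.get(layer.id);
    if (hit !== undefined && hit.key === key) {
      return hit.pixels; // unchanged layer: reuse the cached render
    }
    // First render, or a property changed: recompute and replace the entry.
    this.renders += 1;
    const pixels = render(layer);
    this.entries.set(layer.id, { key, pixels });
    return pixels;
  }
}
```

In this sketch invalidation is implicit: a changed property produces a different key, so the stale entry is overwritten on the next render.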
Blend Mode Performance
Different blend modes have varying performance characteristics:
- Normal blend (Porter-Duff source-over) is the fastest
- Simple modes (Multiply, Screen, Darken, Lighten) have minimal overhead
- Complex modes (ColorDodge, ColorBurn, SoftLight) involve more computation
- HSL-based modes (Hue, Saturation, Color, Luminosity) require color space conversion
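To make the cost difference concrete, here is a sketch of the standard per-channel formulas for the normal and simple modes, with channel values normalized to the range 0 to 1. These are the commonly published compositing formulas, not code taken from kimg, and compositeChannel assumes an opaque backdrop for brevity.

```typescript
// Per-channel blend functions on normalized values (0..1).
// Each simple mode is a single arithmetic expression per channel.
const blend = {
  normal: (_b: number, s: number): number => s,
  multiply: (b: number, s: number): number => b * s,
  screen: (b: number, s: number): number => b + s - b * s,
  darken: (b: number, s: number): number => Math.min(b, s),
  lighten: (b: number, s: number): number => Math.max(b, s),
};

// Source-over compositing of one blended channel onto an opaque backdrop.
function compositeChannel(
  mode: (b: number, s: number) => number,
  backdrop: number,
  source: number,
  sourceAlpha: number,
): number {
  const blended = mode(backdrop, source);
  return blended * sourceAlpha + backdrop * (1 - sourceAlpha);
}
```

The HSL-based modes cannot be written this way because they operate on all three channels at once after a color space conversion, which is where their extra cost comes from.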
Convolution Kernel Optimization
Convolution-based filters (blur, sharpen, edge detect) scale with kernel size:
- 3×3 kernels are fastest for simple effects
- 5×5 kernels provide better quality at higher cost
- Box blur is optimized for speed over quality
- Gaussian blur provides high quality with acceptable performance
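A back-of-the-envelope estimate shows why kernel size matters. The sketch below counts multiply-accumulate operations for a naive, non-separable convolution over all four RGBA channels; that is an assumption made for illustration, and kimg's actual filters may use separable passes or other optimizations that lower the count.

```typescript
// Multiply-accumulate count for a naive k×k convolution:
// every output sample reads kernelSize * kernelSize input samples.
function convolutionMacs(
  width: number,
  height: number,
  kernelSize: number,
  channels: number = 4, // RGBA
): number {
  return width * height * channels * kernelSize * kernelSize;
}

const macs3x3 = convolutionMacs(512, 512, 3); // 9,437,184
const macs5x5 = convolutionMacs(512, 512, 5); // 26,214,400
const costRatio = macs5x5 / macs3x3;          // 25 / 9, roughly 2.8x the work
```

Under this model, moving from a 3×3 to a 5×5 kernel nearly triples the arithmetic per pixel regardless of image size.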
Filter Pipeline
Filters applied to groups affect all child layers. For optimal performance:
- Apply filters to individual layers when possible
- Use group-level filters only when the effect should apply to the composite
- Minimize the number of filter layers in the render pipeline
Shape Rasterization
Shape layers are rasterized on-demand:
- Simple shapes (rectangles, ellipses) are very fast
- Polygons scale with vertex count and complexity
- Document-level caching reduces repeated rasterization cost
- Prefer primitive shapes over complex polygons when possible
Memory Considerations
Buffer Management
Each layer maintains its own RGBA buffer:
- Memory usage scales with: width × height × 4 bytes × layer_count
- A 512×512 10-layer composition uses ~10 MB
- A 2048×2048 10-layer composition uses ~160 MB
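The formula above is easy to turn into a quick estimator. The function name layerBufferBytes is hypothetical, and the result covers only the per-layer RGBA buffers, not caches or other allocations the library may make.

```typescript
const BYTES_PER_PIXEL = 4; // one byte each for R, G, B, A
const MIB = 1024 * 1024;

// Estimated bytes held by per-layer RGBA buffers.
function layerBufferBytes(width: number, height: number, layerCount: number): number {
  return width * height * BYTES_PER_PIXEL * layerCount;
}

const smallMiB = layerBufferBytes(512, 512, 10) / MIB;   // 10
const largeMiB = layerBufferBytes(2048, 2048, 10) / MIB; // 160
```

Both results match the figures quoted above when a megabyte is taken as 1024 × 1024 bytes.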
WASM Memory Limits
WASM has a default 2 GB memory limit. For very large compositions:
- Monitor memory usage in long-running applications
- Dispose of unused compositions to free memory
- Consider tiling or streaming for extremely large images
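For the tiling suggestion, the first step is splitting the image into rectangles that can be processed one at a time. The sketch below only computes those rectangles; the Tile type and tileRects function are hypothetical, and how each tile would be fed to kimg is not covered by this document.

```typescript
interface Tile {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Split an image into tiles of at most tileSize × tileSize pixels.
// Tiles on the right and bottom edges are clipped to the image bounds.
function tileRects(imageWidth: number, imageHeight: number, tileSize: number): Tile[] {
  if (tileSize <= 0) {
    throw new RangeError("tileSize must be positive");
  }
  const result: Tile[] = [];
  for (let y = 0; y < imageHeight; y += tileSize) {
    for (let x = 0; x < imageWidth; x += tileSize) {
      result.push({
        x,
        y,
        width: Math.min(tileSize, imageWidth - x),
        height: Math.min(tileSize, imageHeight - y),
      });
    }
  }
  return result;
}
```

With this approach the peak buffer size is bounded by the tile size instead of the full image dimensions.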