Overview

FLUX.2-klein is a 4B-parameter image generation model optimized for Apple Silicon. This crate provides the Rust implementation using MLX bindings.

Key features
- FLUX.2-klein transformer: 4B parameter model with 5 double-stream and 20 single-stream blocks
- Qwen3-4B text encoder: Shared with Z-Image-Turbo, produces 7680-dim embeddings
- VAE decoder: AutoencoderKL for latent-to-image decoding
- 4-bit quantization: Memory-efficient inference (~3GB vs ~8GB)
- Rectified flow sampling: Fast 4-step generation
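As a rough sanity check on the memory figures above: 4-bit weights for a 4B-parameter model occupy about 2 GB, versus about 8 GB at 16-bit precision; quantization scales, the text encoder, and activations account for the rest of the ~3 GB total (the parameter count comes from the overview; the overhead breakdown is an assumption).

```rust
fn main() {
    let params: f64 = 4e9; // 4B parameters, per the overview

    // 4-bit quantized weights: 0.5 bytes per parameter.
    let q4_gb = params * 0.5 / 1e9;
    // 16-bit (bf16/fp16) weights: 2 bytes per parameter.
    let bf16_gb = params * 2.0 / 1e9;

    println!("4-bit weights: {q4_gb} GB");   // 2 GB
    println!("16-bit weights: {bf16_gb} GB"); // 8 GB
}
```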
Core types
FluxKlein
The main transformer model for FLUX.2-klein.

Fields:
- Model configuration parameters

Methods

Create a new FLUX.2-klein transformer with the given parameters.
- Model configuration: use FluxKleinParams::default() for the standard 4B model.

forward
fn(&mut self, img: &Array, txt: &Array, timesteps: &Array, img_ids: &Array, txt_ids: &Array) -> Result<Array, Exception>
Run forward pass through the transformer.
- img: Image latents, [batch, seq, in_channels] where in_channels = 128 (after patchify)
- txt: Text embeddings, [batch, seq, 7680] from Qwen3TextEncoder
- timesteps: Denoising timesteps, [batch], in the range 0.0 to 1.0
- img_ids: Image position IDs, [batch, img_seq, 3] for 3-axis RoPE
- txt_ids: Text position IDs, [batch, txt_seq, 3] for 3-axis RoPE

Returns the predicted velocity, [batch, img_seq, in_channels].

forward_with_rope
fn(&mut self, img: &Array, txt: &Array, timesteps: &Array, rope_cos: &Array, rope_sin: &Array) -> Result<Array, Exception>
Run the forward pass with pre-computed RoPE frequencies. Pre-compute the (cos, sin) frequency tuple once before the denoising loop, then reuse it across steps for efficiency.

FluxKleinParams
Configuration parameters for FLUX.2-klein.

Fields:
- Input channels after patchify (32 VAE channels × 2×2 patch = 128)
- Model dimension
- Text embedding dimension from Qwen3-4B (7680)
- Number of double-stream transformer blocks (5)
- Number of single-stream transformer blocks (20)
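The channel and sequence arithmetic behind these parameters can be checked in plain Rust. The struct below is an illustrative stand-in for FluxKleinParams (the real crate's field names may differ), and the image token count assumes the usual 8× VAE downsample before the 2×2 patchify, which this document does not state explicitly.

```rust
// Illustrative stand-in for FluxKleinParams; field names are assumptions,
// values mirror the documented defaults.
#[derive(Debug, Clone)]
struct Params {
    in_channels: usize,  // 32 VAE channels x 2x2 patch = 128
    txt_dim: usize,      // Qwen3-4B embedding dimension
    depth: usize,        // double-stream blocks
    depth_single: usize, // single-stream blocks
}

impl Default for Params {
    fn default() -> Self {
        Params { in_channels: 32 * 2 * 2, txt_dim: 7680, depth: 5, depth_single: 20 }
    }
}

fn main() {
    let p = Params::default();
    assert_eq!(p.in_channels, 128);

    // Image token count for a 1024x1024 image, assuming an 8x VAE
    // downsample followed by the 2x2 patchify (assumption).
    let (h, w) = (1024usize, 1024usize);
    let img_seq = (h / 8 / 2) * (w / 8 / 2);
    println!("{:?}, img_seq = {img_seq}", p); // img_seq = 4096
}
```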
Qwen3TextEncoder
Text encoder using the Qwen3-4B model.

Methods

Create a new Qwen3 text encoder.
- Model configuration: use Qwen3Config::default() for Qwen3-4B.

forward
fn(&mut self, input_ids: &Array, attention_mask: Option<&Array>) -> Result<Array, Exception>
Encode token IDs into text embeddings of shape [batch, seq, 7680].
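For batched prompts of different lengths, attention_mask is typically a padding mask. The helper below sketches the common convention (1.0 = real token, 0.0 = padding) in plain Rust; whether this crate expects exactly this layout is an assumption.

```rust
// Build a [batch, seq] padding mask (1.0 = attend, 0.0 = padding) from
// per-prompt token lengths. The 1/0 convention is the common one, but
// it is an assumption about this crate's encoder.
fn padding_mask(lengths: &[usize], seq: usize) -> Vec<Vec<f32>> {
    lengths
        .iter()
        .map(|&len| (0..seq).map(|i| if i < len { 1.0 } else { 0.0 }).collect())
        .collect()
}

fn main() {
    // Two prompts of 3 and 5 tokens, padded to a shared length of 5.
    let mask = padding_mask(&[3, 5], 5);
    assert_eq!(mask[0], vec![1.0, 1.0, 1.0, 0.0, 0.0]);
    assert_eq!(mask[1], vec![1.0; 5]);
}
```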
Decoder (VAE)
VAE decoder for converting latents to images.

Methods

Create a new VAE decoder.
- VAE configuration: use AutoEncoderConfig::flux2() for FLUX.2.

Sampling
FluxSampler
Rectified flow sampler for denoising.

Methods

Create a new sampler.
- Sampler configuration

Create a sampler for fast 4-step generation (the FLUX.2-klein default).
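The sampler integrates the velocity the transformer predicts. A single Euler step of rectified flow, sketched on plain vectors rather than the crate's Array type, looks like this; with the true velocity field and a linear schedule, four steps carry pure noise exactly to the data point.

```rust
// One Euler step of rectified-flow sampling: move the latent along the
// predicted velocity from timestep t to t_next (a sketch, not the crate API).
fn euler_step(x: &[f32], v: &[f32], t: f32, t_next: f32) -> Vec<f32> {
    let dt = t_next - t;
    x.iter().zip(v).map(|(xi, vi)| xi + dt * vi).collect()
}

fn main() {
    let data = vec![1.0f32, -2.0];
    let noise = vec![0.0f32, 2.0];
    // Rectified flow defines x_t = (1 - t) * data + t * noise, so the true
    // velocity field is v = noise - data (what the model learns to predict).
    let v: Vec<f32> = noise.iter().zip(&data).map(|(n, d)| n - d).collect();

    // 4-step linear schedule from t = 1 (noise) down to t = 0 (image).
    let ts = [1.0f32, 0.75, 0.5, 0.25, 0.0];
    let mut x = noise.clone();
    for w in ts.windows(2) {
        x = euler_step(&x, &v, w[0], w[1]);
    }
    assert!((x[0] - data[0]).abs() < 1e-6);
    assert!((x[1] - data[1]).abs() < 1e-6);
}
```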
FluxSamplerConfig
Sampler configuration.

Fields:
- Number of inference steps
- Whether to use the fast linear schedule (true for FLUX.2-klein)
- Time shift parameter for non-schnell models

Methods

Create a config for FLUX.2-klein (4 steps, linear schedule).
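A sketch of what these fields control: the linear schedule for the default 4-step config, plus the time-shift transform that flux-style samplers commonly apply for non-schnell models. The exact shift formula used by this crate is an assumption here.

```rust
// Linear timestep schedule from 1.0 down to 0.0 (num_steps + 1 points).
fn linear_schedule(num_steps: usize) -> Vec<f32> {
    (0..=num_steps)
        .map(|i| 1.0 - i as f32 / num_steps as f32)
        .collect()
}

// Time shift commonly used by flux-style samplers for non-schnell models:
// t' = shift * t / (1 + (shift - 1) * t). Whether this crate uses exactly
// this form is an assumption; check FluxSamplerConfig.
fn shift_time(t: f32, shift: f32) -> f32 {
    shift * t / (1.0 + (shift - 1.0) * t)
}

fn main() {
    // FLUX.2-klein default: 4 steps on the fast linear schedule.
    let ts = linear_schedule(4);
    assert_eq!(ts, vec![1.0, 0.75, 0.5, 0.25, 0.0]);

    // A shift > 1 pushes mid-schedule timesteps toward 1.0, spending
    // more of the step budget at high noise levels.
    assert_eq!(shift_time(0.5, 3.0), 0.75);
}
```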
Quantization
QuantizedFluxKlein
4-bit quantized FLUX.2-klein model for reduced memory usage.

load_quantized_flux_klein
fn(weights: HashMap<String, Array>, params: FluxKleinParams) -> Result<QuantizedFluxKlein, Exception>
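The idea behind the 4-bit path can be illustrated with a simple symmetric per-group scheme in plain Rust. This is a simplified stand-in: MLX's actual group quantization is affine (it stores per-group biases as well as scales), so treat the code below as a sketch of the concept, not the crate's format.

```rust
// Simplified symmetric 4-bit group quantization (a stand-in for MLX's
// affine group quantization, which also stores per-group biases).
fn quantize_4bit(w: &[f32], group: usize) -> (Vec<i8>, Vec<f32>) {
    let mut codes = Vec::with_capacity(w.len());
    let mut scales = Vec::new();
    for chunk in w.chunks(group) {
        let max = chunk.iter().fold(0.0f32, |m, x| m.max(x.abs()));
        // 4-bit signed codes span [-8, 7]; map the group max to 7.
        let scale = if max == 0.0 { 1.0 } else { max / 7.0 };
        scales.push(scale);
        codes.extend(chunk.iter().map(|x| (x / scale).round().clamp(-8.0, 7.0) as i8));
    }
    (codes, scales)
}

fn dequantize_4bit(codes: &[i8], scales: &[f32], group: usize) -> Vec<f32> {
    codes
        .chunks(group)
        .zip(scales)
        .flat_map(|(chunk, &s)| chunk.iter().map(move |&c| c as f32 * s))
        .collect()
}

fn main() {
    let w = vec![0.10f32, -0.35, 0.70, 0.05, -0.02, 0.40, -0.60, 0.15];
    let (codes, scales) = quantize_4bit(&w, 4);
    let back = dequantize_4bit(&codes, &scales, 4);
    // Each value is recovered to within half a quantization step.
    let max_scale = scales[0].max(scales[1]);
    for (orig, deq) in w.iter().zip(&back) {
        assert!((orig - deq).abs() <= max_scale * 0.5 + 1e-6);
    }
}
```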
Weight utilities
Load weights from a safetensors file.
- Path to the .safetensors file

Sanitize Qwen3 text encoder weights.

Sanitize VAE decoder weights from the HuggingFace format.
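Sanitization here typically means renaming HuggingFace-style weight keys to the crate's module paths. The sketch below shows the general pattern on a plain HashMap; the specific renames are hypothetical examples, not this crate's actual mapping.

```rust
use std::collections::HashMap;

// Generic key sanitization: strip a HuggingFace-style prefix and apply
// simple substring renames. The concrete renames below are hypothetical
// illustrations, not the crate's actual mapping.
fn sanitize_keys(weights: HashMap<String, Vec<f32>>) -> HashMap<String, Vec<f32>> {
    weights
        .into_iter()
        .map(|(k, v)| {
            let k = k.strip_prefix("model.").unwrap_or(&k).to_string();
            (k.replace("self_attn.", "attn."), v)
        })
        .collect()
}

fn main() {
    let mut raw = HashMap::new();
    raw.insert("model.layers.0.self_attn.q_proj.weight".to_string(), vec![0.0f32]);
    let clean = sanitize_keys(raw);
    assert!(clean.contains_key("layers.0.attn.q_proj.weight"));
}
```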