## `networks.loha`

LoHa (Low-rank Hadamard Product) represents weight updates as the element-wise product of two low-rank matrix pairs. It has roughly twice the parameters of LoRA at the same rank, with greater expressivity.
## `networks.lokr`

LoKr (Low-rank Kronecker Product) represents weight updates using a Kronecker product with optional low-rank decomposition. It tends to produce smaller models than LoRA at the same rank.
## How they work

### LoHa

LoHa represents the weight update as a Hadamard (element-wise) product of two low-rank matrix pairs:

ΔW = (W1a × W1b) ⊙ (W2a × W2b)

W1a, W1b, W2a, and W2b are all low-rank matrices with rank `network_dim`. Because the update involves two independent pairs, LoHa has approximately twice the trainable parameters of LoRA at the same rank. This extra capacity lets it capture more complex weight interactions.
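As a concrete illustration, the reconstruction and the 2× parameter count can be sketched with NumPy (shapes and rank here are arbitrary assumptions, not values from the implementation):

```python
import numpy as np

# Illustrative LoHa weight-update reconstruction (not the training code).
out_dim, in_dim, rank = 64, 32, 4

rng = np.random.default_rng(0)
w1a = rng.standard_normal((out_dim, rank))
w1b = rng.standard_normal((rank, in_dim))
w2a = rng.standard_normal((out_dim, rank))
w2b = rng.standard_normal((rank, in_dim))

# Hadamard (element-wise) product of two low-rank matrix products
delta_w = (w1a @ w1b) * (w2a @ w2b)

loha_params = w1a.size + w1b.size + w2a.size + w2b.size
lora_params = out_dim * rank + rank * in_dim  # a single pair, as in plain LoRA

assert delta_w.shape == (out_dim, in_dim)
assert loha_params == 2 * lora_params
```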
For Conv2d 3×3+ layers with Tucker decomposition enabled, each matrix pair also includes a Tucker core tensor T, and the reconstruction becomes:

ΔW = (T1 ×₁ W1a ×₂ W1b) ⊙ (T2 ×₁ W2a ×₂ W2b)

where ×ₙ denotes the mode-n tensor product and each Tᵢ carries the 3×3 kernel dimensions.
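Under assumed shapes, the Tucker-mode rebuild for one pair can be sketched as an einsum (illustrative; the exact contraction convention is an assumption):

```python
import numpy as np

# Sketch of rebuilding one Tucker pair for a 3x3 conv kernel.
rank, out_ch, in_ch, k = 4, 8, 8, 3
rng = np.random.default_rng(0)

t1 = rng.standard_normal((rank, rank, k, k))  # core tensor carries kernel dims
w1a = rng.standard_normal((rank, out_ch))
w1b = rng.standard_normal((rank, in_ch))

# Contract core mode 1 with W1a (-> out channels) and mode 2 with W1b (-> in channels)
rebuilt = np.einsum('ijkl,ip,jr->prkl', t1, w1a, w1b)
assert rebuilt.shape == (out_ch, in_ch, k, k)
```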
### LoKr

LoKr represents the weight update using a Kronecker product:

ΔW = W1 ⊗ W2

Each weight dimension is factorized into two factors; with the default balanced factorization of a 512×512 weight, for example, W1 is 16×16 and W2 is 32×32. W1 is always a full matrix (it is small), while W2 is low-rank-decomposed unless `network_dim` is large enough relative to the factorized dimensions, in which case a full matrix is used for W2 automatically (a warning is logged in this case).
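The reconstruction and parameter savings can be sketched with NumPy (sizes follow the 512×512 example above; the rank value is an arbitrary assumption):

```python
import numpy as np

# Illustrative LoKr weight reconstruction (not the training code).
rank = 8
rng = np.random.default_rng(0)

w1 = rng.standard_normal((16, 16))      # small full factor
w2a = rng.standard_normal((32, rank))   # low-rank decomposition of W2
w2b = rng.standard_normal((rank, 32))

delta_w = np.kron(w1, w2a @ w2b)        # Kronecker product: (16*32) x (16*32)
assert delta_w.shape == (512, 512)

# Far fewer parameters than a rank-8 LoRA on the same 512x512 weight
lokr_params = w1.size + w2a.size + w2b.size  # 256 + 256 + 256
lora_params = 512 * rank + rank * 512
assert lokr_params < lora_params
```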
## Comparison
| Property | LoRA | LoHa | LoKr |
|---|---|---|---|
| Update formula | W_up × W_down | (W1a×W1b) ⊙ (W2a×W2b) | W1 ⊗ W2 |
| Parameters at same rank | Baseline | ~2× LoRA | Typically < LoRA |
| Model file size | Medium | Larger | Smaller |
| Architecture support | SD, FLUX, SD3, … | SDXL, Anima | SDXL, Anima |
| Conv2d 3×3 support | Yes (conv_dim) | Yes (conv_dim) | Yes (conv_dim) |
## Supported architectures

LoHa and LoKr automatically detect the model architecture and apply appropriate default targets.

- SDXL: Targets `Transformer2DModel` for the UNet and `CLIPAttention`/`CLIPMLP` for the text encoders. Conv2d layers in `ResnetBlock2D`, `Downsample2D`, and `Upsample2D` are also targeted when `conv_dim` is specified.
- Anima: Targets `Block`, `PatchEmbed`, `TimestepEmbedding`, and `FinalLayer` for the DiT, and `Qwen3Attention`/`Qwen3MLP` for the text encoder. Default `exclude_patterns` automatically skip modulation, normalization, embedder, and final_layer modules.
## Training

To use LoHa or LoKr, change `--network_module` in your training command. All other options (dataset config, optimizer, scheduler, etc.) are the same as for LoRA.

- LoHa (SDXL)
- LoKr (SDXL)
- Anima
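A minimal SDXL command might look like the following sketch; the paths are placeholders and the surrounding options mirror an ordinary LoRA run, with only `--network_module` changed:

```shell
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml \
  --output_dir output --output_name my_loha \
  --network_module networks.loha \
  --network_dim 8 --network_alpha 4

# For LoKr, swap the module:
#   --network_module networks.lokr
```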
## Network args

Pass options to LoHa and LoKr using `--network_args`. Each value is a quoted `key=value` string.
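For example, enabling the Conv2d extension described below might look like this (the values are illustrative):

```shell
--network_args "conv_dim=4" "conv_alpha=1" "use_tucker=True"
```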
### Conv2d extension

**`conv_dim`**: Rank for Conv2d 3×3 layers. When set, LoHa/LoKr is also applied to `ResnetBlock2D`, `Downsample2D`, and `Upsample2D` modules. Has no effect if omitted.

**`conv_alpha`**: Alpha scaling value for Conv2d 3×3 layers. Should be set alongside `conv_dim`.

### Tucker decomposition (Conv2d 3×3+)
**`use_tucker`**: Enable Tucker decomposition for Conv2d 3×3+ layers. Without Tucker, the kernel dimensions are flattened into the input dimension (flat mode). With `use_tucker=True`, a separate Tucker core tensor handles the kernel dimensions, which is generally more parameter-efficient.

### Scalar parameter

Train an additional scalar multiplier per module. The scalar adjusts the effective magnitude of each module’s output, giving the optimizer more flexibility.
### Dropout

**`rank_dropout`**: Probability of zeroing individual rank dimensions during training. For example, `rank_dropout=0.1` drops 10% of rank channels per forward pass. Has no effect at inference.

**`module_dropout`**: Probability of skipping an entire module for a given forward pass. When a module is dropped, the original pre-trained weight is used unchanged.
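Rank dropout can be pictured as masking channels of the low-rank dimension. The NumPy sketch below is illustrative only: the per-rank Bernoulli mask and the 1/(1-p) rescaling are assumptions about the implementation, shown with a plain two-matrix pair for simplicity:

```python
import numpy as np

rng = np.random.default_rng(0)
rank, p = 16, 0.25
w_down = rng.standard_normal((rank, 32))
w_up = rng.standard_normal((64, rank))

# Bernoulli keep-mask over the rank dimension
mask = (rng.random(rank) > p).astype(w_down.dtype)

# Zero the dropped rank channels and rescale to preserve expected magnitude
delta_w = (w_up * mask) @ w_down / (1 - p)
assert delta_w.shape == (64, 32)
```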
### Module selection

**`exclude_patterns`**: List of regex patterns for module names to skip, in addition to any architecture defaults. For example, a pattern such as `exclude_patterns=[r'.*mlp.*']` skips all MLP layers.

**`include_patterns`**: Override excludes: modules matching these patterns are included even if they match `exclude_patterns`.

### Per-module learning rates and dims
Set per-module learning rates using regex patterns, in `regex=lr` format separated by commas.

Set per-module rank (dim) using regex patterns, in `regex=dim` format separated by commas.

### LoKr-specific: `factor`
Controls how LoKr factorizes weight dimensions for the Kronecker product.

- `-1` (default): Automatically find the most balanced factorization. For example, dimension 512 is split into (16, 32).
- Positive integer: Force the first factor to exactly this value. For example, `factor=4` splits dimension 512 into (4, 128).
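The balanced search can be sketched as picking the largest divisor no greater than the square root. This is an assumption about the exact search, `factorize` is a hypothetical helper name, and padding for non-divisible dimensions is ignored:

```python
def factorize(dim: int, factor: int = -1) -> tuple[int, int]:
    """Split dim into two Kronecker factors (illustrative sketch)."""
    if factor > 0:
        # Forced mode: first factor is fixed (assumes factor divides dim)
        return factor, dim // factor
    # Balanced mode: largest divisor not exceeding sqrt(dim)
    best = 1
    for d in range(1, int(dim ** 0.5) + 1):
        if dim % d == 0:
            best = d
    return best, dim // best

assert factorize(512) == (16, 32)            # balanced default
assert factorize(512, factor=4) == (4, 128)  # forced first factor
```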
When `True`, apply low-rank decomposition to both Kronecker factor matrices instead of only the second. This increases the parameter count but can improve expressivity.

### Anima-specific: LLM adapter
Include `LLMAdapterTransformerBlock` modules as training targets. Applies only to Anima models.

### LoRA+
LoRA+ is also supported with LoHa and LoKr. For LoHa, the second matrix pair (`hada_w2_a`) receives the higher learning rate. For LoKr, the scale factor (`lokr_w1`) receives the higher learning rate.

**`loraplus_lr_ratio`**: Multiplier for the “plus” parameter group relative to the base learning rate.
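The effect can be sketched as splitting parameters into two optimizer groups, one at the base learning rate and one at the boosted rate. This is illustrative only: the grouping logic and `param_groups` structure are assumptions, with names following the LoHa convention above:

```python
base_lr, loraplus_lr_ratio = 1e-4, 4.0

# Hypothetical parameter names following the LoHa naming above
param_names = ["hada_w1_a", "hada_w1_b", "hada_w2_a", "hada_w2_b"]

# The "plus" group gets base_lr * ratio; everything else gets base_lr
plus = [n for n in param_names if "hada_w2_a" in n]
base = [n for n in param_names if n not in plus]

param_groups = [
    {"params": base, "lr": base_lr},
    {"params": plus, "lr": base_lr * loraplus_lr_ratio},
]
assert plus == ["hada_w2_a"]
```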
## Inference

Trained LoHa and LoKr weights are saved in safetensors format, identical to LoRA. Load them the same way using `--network_module` and `--network_weights`.

- SDXL

Replace `networks.loha` with `networks.lokr` for LoKr weights.
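A generation command might look like the following sketch; the paths and prompt are placeholders, and the script name is an assumption based on the standard LoRA workflow:

```shell
python sdxl_gen_img.py \
  --ckpt sd_xl_base_1.0.safetensors \
  --network_module networks.loha \
  --network_weights my_loha.safetensors \
  --network_mul 1.0 \
  --prompt "a photo of a cat"
```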
