LoHa and LoKr support is experimental. Behavior may change in future releases.
In addition to standard LoRA, sd-scripts supports LoHa and LoKr as alternative parameter-efficient fine-tuning methods. Both are based on techniques from the LyCORIS project by KohakuBlueleaf.

networks.loha

LoHa — Low-rank Hadamard Product. Represents weight updates as the element-wise product of two low-rank matrix pairs. Roughly twice the parameters of LoRA at the same rank, with greater expressivity.

networks.lokr

LoKr — Low-rank Kronecker Product. Represents weight updates using a Kronecker product with optional low-rank decomposition. Tends to produce smaller models than LoRA at the same rank.

How they work

LoHa

LoHa represents the weight update as a Hadamard (element-wise) product of two low-rank matrix pairs:
ΔW = (W1a × W1b) ⊙ (W2a × W2b)
W1a, W1b, W2a, and W2b are all low-rank matrices with rank network_dim. Because the update involves two independent pairs, LoHa has approximately twice the trainable parameters of LoRA at the same rank. This extra capacity lets it capture more complex weight interactions. For Conv2d 3×3+ layers with Tucker decomposition enabled, each matrix pair also includes a Tucker tensor T, and the reconstruction becomes:
einsum("i j ..., j r, i p -> p r ...", T, Wb, Wa)
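As a concrete illustration, the flat-mode LoHa reconstruction and its parameter count can be sketched in NumPy (the shapes here are illustrative, not taken from any real model):

```python
import numpy as np

out_dim, in_dim, dim = 64, 64, 8  # illustrative sizes; dim plays the role of network_dim

# Two independent low-rank pairs, each reconstructing an out_dim x in_dim matrix
w1a = np.random.randn(out_dim, dim)
w1b = np.random.randn(dim, in_dim)
w2a = np.random.randn(out_dim, dim)
w2b = np.random.randn(dim, in_dim)

# Hadamard (element-wise) product of the two low-rank reconstructions
delta_w = (w1a @ w1b) * (w2a @ w2b)
assert delta_w.shape == (out_dim, in_dim)

# Parameter count: exactly twice a rank-`dim` LoRA on the same layer
lora_params = dim * (out_dim + in_dim)
loha_params = 2 * lora_params
```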

LoKr

LoKr represents the weight update using a Kronecker product:
ΔW = W1 ⊗ W2    (where W2 = W2a × W2b in low-rank mode)
The original weight dimensions are factorized — for example, a 512×512 weight might be split so that W1 is 16×16 and W2 is 32×32. W1 is always a full matrix (small), while W2 is low-rank-decomposed unless network_dim is large enough relative to the factorized dimensions, in which case a full matrix is used for W2 automatically (a warning is logged in this case).
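A minimal NumPy sketch of the Kronecker reconstruction for the 512×512 example above (the (16, 32) split and the rank are illustrative):

```python
import numpy as np

# Illustrative 512x512 layer factorized as (16, 32) on both sides
out_dim = in_dim = 512
f1, f2 = 16, 32          # out_dim == f1 * f2 (same split used for in_dim here)
dim = 8                  # network_dim for the low-rank part of W2

w1 = np.random.randn(f1, f1)       # small full matrix
w2a = np.random.randn(f2, dim)     # low-rank decomposition of W2
w2b = np.random.randn(dim, f2)
w2 = w2a @ w2b

delta_w = np.kron(w1, w2)          # Kronecker product rebuilds the full shape
assert delta_w.shape == (out_dim, in_dim)
```

Note how few parameters this needs: 16×16 for W1 plus two 32×8 matrices for W2, versus 512×8×2 for a rank-8 LoRA on the same layer.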

Comparison

Property                | LoRA              | LoHa                   | LoKr
Update formula          | W_up × W_down     | (W1a×W1b) ⊙ (W2a×W2b)  | W1 ⊗ W2
Parameters at same rank | Baseline          | ~2× LoRA               | Typically < LoRA
Model file size         | Medium            | Larger                 | Smaller
Architecture support    | SD, FLUX, SD3, …  | SDXL, Anima            | SDXL, Anima
Conv2d 3×3 support      | Yes (conv_dim)    | Yes (conv_dim)         | Yes (conv_dim)

Supported architectures

LoHa and LoKr automatically detect the model architecture and apply appropriate default targets.
  • SDXL: Targets Transformer2DModel for the UNet and CLIPAttention/CLIPMLP for text encoders. Conv2d layers in ResnetBlock2D, Downsample2D, and Upsample2D are also targeted when conv_dim is specified.
  • Anima: Targets Block, PatchEmbed, TimestepEmbedding, and FinalLayer for the DiT, and Qwen3Attention/Qwen3MLP for the text encoder. Default exclude_patterns automatically skip modulation, normalization, embedder, and final_layer modules.

Training

To use LoHa or LoKr, change --network_module in your training command. All other options (dataset config, optimizer, scheduler, etc.) are the same as LoRA.
accelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 sdxl_train_network.py \
  --pretrained_model_name_or_path path/to/sdxl.safetensors \
  --dataset_config path/to/dataset.toml \
  --mixed_precision bf16 --fp8_base \
  --optimizer_type adamw8bit \
  --learning_rate 2e-4 \
  --gradient_checkpointing \
  --network_module networks.loha \
  --network_dim 32 \
  --network_alpha 16 \
  --max_train_epochs 16 \
  --save_every_n_epochs 1 \
  --output_dir path/to/output \
  --output_name my-loha

Network args

Pass options to LoHa and LoKr using --network_args. Each value is a quoted key=value string.

Conv2d extension

conv_dim
int
Rank for Conv2d 3×3 layers. When set, LoHa/LoKr is also applied to ResnetBlock2D, Downsample2D, and Upsample2D modules. Has no effect if omitted.
conv_alpha
float
Alpha scaling value for Conv2d 3×3 layers. Should be set alongside conv_dim.

Tucker decomposition (Conv2d 3×3+)

use_tucker
bool
default: False
Enable Tucker decomposition for Conv2d 3×3+ layers. Without Tucker, the kernel dimensions are flattened into the input dimension (flat mode). With use_tucker=True, a separate Tucker tensor handles the kernel dimensions, which is generally more parameter-efficient.
--network_args "conv_dim=16" "conv_alpha=8" "use_tucker=True"
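The Tucker-mode reconstruction (the einsum given in the LoHa section) can be sketched with NumPy; all shapes here are illustrative:

```python
import numpy as np

out_ch, in_ch, dim, k = 32, 16, 4, 3  # illustrative conv shapes; dim = network_dim

t = np.random.randn(dim, dim, k, k)   # Tucker core tensor carrying the kernel dims
wa = np.random.randn(dim, out_ch)     # (i, p): maps rank -> output channels
wb = np.random.randn(dim, in_ch)      # (j, r): maps rank -> input channels

# Reconstruction per the einsum in the docs: "i j ..., j r, i p -> p r ..."
rebuilt = np.einsum("i j ..., j r, i p -> p r ...", t, wb, wa)
assert rebuilt.shape == (out_ch, in_ch, k, k)
```

Only the small core tensor scales with the kernel size, which is why Tucker mode is generally more parameter-efficient than flattening the kernel into the input dimension.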

Scalar parameter

use_scalar
bool
default: False
Train an additional scalar multiplier per module. The scalar adjusts the effective magnitude of each module’s output, giving the optimizer more flexibility.

Dropout

rank_dropout
float
Probability of zeroing individual rank dimensions during training. For example, rank_dropout=0.1 drops 10% of rank channels per forward pass. Has no effect at inference.
module_dropout
float
Probability of skipping an entire module for a given forward pass. When a module is dropped, the original pre-trained weight is used unchanged.
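To make rank_dropout concrete, here is a NumPy sketch of masking rank channels at the low-rank bottleneck. Rescaling the surviving channels by 1/(1−p) is a common dropout convention and an assumption here, not a statement about the exact implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, rank_dropout = 8, 0.25

# Per-forward-pass mask: each rank channel is kept with probability 1 - rank_dropout
mask = rng.random(dim) > rank_dropout
# Assumed convention: rescale survivors so the expected magnitude is unchanged
scale = 1.0 / (1.0 - rank_dropout)

h = rng.standard_normal(dim)      # intermediate activation at the rank bottleneck
h_dropped = h * mask * scale      # dropped channels contribute exactly zero
```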

Module selection

exclude_patterns
string
List of regex patterns for module names to skip, in addition to any architecture defaults. For example, to skip all MLP layers:
--network_args "exclude_patterns=[r'.*mlp.*']"
include_patterns
string
Overrides excludes: modules matching these patterns are trained even if they also match exclude_patterns.

Per-module learning rates and dims

network_reg_lrs
string
Set per-module learning rates using regex patterns, in regex=lr format separated by commas. For example:
--network_args "network_reg_lrs=.*attn.*=5e-4,.*mlp.*=1e-4"
network_reg_dims
string
Set per-module rank (dim) using regex patterns, in regex=dim format separated by commas.
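A hypothetical sketch of parsing such comma-separated regex=value strings (`parse_reg_pairs` is an illustrative helper, not part of sd-scripts; the real parser may differ, for example with patterns containing commas):

```python
import re

def parse_reg_pairs(spec: str) -> list[tuple[str, float]]:
    """Split 'regex=value,regex=value' into (pattern, value) pairs.

    Illustrative helper: splits each comma-separated chunk on its last '='.
    """
    pairs = []
    for chunk in spec.split(","):
        pattern, _, value = chunk.rpartition("=")
        pairs.append((pattern, float(value)))
    return pairs

rules = parse_reg_pairs(".*attn.*=5e-4,.*mlp.*=1e-4")
# First matching pattern wins for a given module name
lr = next((v for p, v in rules if re.fullmatch(p, "blocks.0.attn.qkv")), None)
# -> 5e-4 for this module name
```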

LoKr-specific: factor

factor
int
default: -1
Controls how LoKr factorizes weight dimensions for the Kronecker product.
  • -1 (default): Automatically find the most balanced factorization. For example, dimension 512 is split into (16, 32).
  • Positive integer: Force the first factor to exactly this value. For example, factor=4 splits dimension 512 into (4, 128).
--network_args "factor=4"
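The factorization behavior can be sketched as follows; `factorize` is an illustrative stand-in, not the exact routine used by sd-scripts:

```python
def factorize(dim: int, factor: int = -1) -> tuple[int, int]:
    """Illustrative LoKr-style factorization.

    factor=-1 picks the most balanced integer split; a positive factor
    forces the first term (when it divides dim).
    """
    if factor > 0 and dim % factor == 0:
        return factor, dim // factor
    # Most balanced split: largest divisor of dim not exceeding sqrt(dim)
    m = int(dim ** 0.5)
    while dim % m != 0:
        m -= 1
    return m, dim // m

print(factorize(512))     # balanced split of 512 -> (16, 32)
print(factorize(512, 4))  # forced first factor -> (4, 128)
```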
decompose_both
bool
default: False
When True, apply low-rank decomposition to both Kronecker factor matrices instead of only the second. This increases parameter count but can improve expressivity.

Anima-specific: LLM adapter

train_llm_adapter
bool
default: False
Include LLMAdapterTransformerBlock modules as training targets. Applies only to Anima models.
--network_args "train_llm_adapter=True"

LoRA+

LoRA+ is also supported with LoHa and LoKr. For LoHa, the second matrix pair (hada_w2_a) receives the higher learning rate. For LoKr, the scale factor (lokr_w1) receives the higher learning rate.
loraplus_lr_ratio
float
Multiplier for the “plus” parameter group relative to the base learning rate.
--network_args "loraplus_lr_ratio=4"

Inference

Trained LoHa and LoKr weights are saved in safetensors format, identical to LoRA. Load them the same way using --network_module and --network_weights.

SDXL

python gen_img.py \
  --ckpt path/to/sdxl.safetensors \
  --network_module networks.loha \
  --network_weights path/to/my-loha.safetensors \
  --prompt "your prompt" \
  ...
Replace networks.loha with networks.lokr for LoKr weights.

Anima

LoRA, LoHa, and LoKr weights are detected and merged automatically:
python anima_minimal_inference.py \
  --dit path/to/dit \
  --prompt "your prompt" \
  --lora_weight path/to/my-loha.safetensors \
  ...

ComfyUI conversion

To use Anima LoHa/LoKr weights in ComfyUI, convert them with the provided utility:
python networks/convert_anima_lora_to_comfy.py \
  --input path/to/my-loha.safetensors \
  --output path/to/my-loha-comfy.safetensors
