In addition to standard LoRA, sd-scripts supports LoHa and LoKr as alternative parameter-efficient fine-tuning methods. Both are based on techniques from the LyCORIS project by KohakuBlueleaf and offer different mathematical structures for representing weight updates.
LoHa and LoKr are experimental features. They are confirmed to work in basic testing but may have edge cases or behavior changes in future releases. Report any issues you encounter.

What are LoHa and LoKr?

LoHa

Low-rank Hadamard Product. Represents weight updates as the element-wise (Hadamard) product of two low-rank matrix pairs. Based on FedPara (arXiv:2108.06098). Can capture more complex weight structures than LoRA at the same rank, with roughly twice the trainable parameters.

LoKr

Low-rank Kronecker Product. Represents weight updates using a Kronecker product with optional low-rank decomposition of the second factor. Introduced in the LyCORIS paper (arXiv:2309.14859). Tends to produce smaller models than LoRA at the same rank by factorizing the weight dimensions.

How they work

LoHa

LoHa decomposes the weight update into two pairs of low-rank matrices combined via element-wise multiplication:
ΔW = (W1a × W1b) ⊙ (W2a × W2b)
Where W1a, W1b, W2a, W2b are all low-rank matrices with rank network_dim. The Hadamard (element-wise) product allows LoHa to represent more complex weight patterns than LoRA, but at the cost of approximately twice the parameters for the same rank. For Conv2d 3×3+ layers with Tucker decomposition enabled, each matrix pair additionally uses a Tucker tensor T, giving:
einsum("i j ..., j r, i p -> p r ...", T, Wb, Wa)
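The LoHa reconstruction can be sketched in a few lines of NumPy. The layer size and rank below are hypothetical; the point is that the Hadamard product of two low-rank products yields a full-size update with exactly twice the parameters of a same-rank LoRA:

```python
import numpy as np

# Hypothetical layer: 512x512 Linear weight, rank 32 (network_dim=32)
out_dim, in_dim, rank = 512, 512, 32
rng = np.random.default_rng(0)

# Two low-rank pairs, each reconstructing an (out_dim, in_dim) matrix
W1a = rng.standard_normal((out_dim, rank))
W1b = rng.standard_normal((rank, in_dim))
W2a = rng.standard_normal((out_dim, rank))
W2b = rng.standard_normal((rank, in_dim))

# LoHa update: element-wise (Hadamard) product of the two low-rank products
delta_W = (W1a @ W1b) * (W2a @ W2b)
assert delta_W.shape == (out_dim, in_dim)

# Parameter count: exactly twice a rank-32 LoRA on the same layer
loha_params = W1a.size + W1b.size + W2a.size + W2b.size  # 4 * 512 * 32
lora_params = 2 * 512 * 32                               # up + down matrices
print(loha_params, lora_params)  # 65536 32768
```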

LoKr

LoKr decomposes the weight update using a Kronecker product:
ΔW = W1 ⊗ W2    (where W2 = W2a × W2b in low-rank mode)
The original weight dimensions are factorized — for example, a 512×512 weight might split into W1 (16×16) and W2 (32×32). W1 is always a full matrix (small), while W2 can be either low-rank decomposed or a full matrix depending on the rank setting. This factorization tends to produce smaller output files compared to LoRA at the same rank.
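A minimal NumPy sketch of the idea, using the 512×512 example above (the rank value is an assumption for illustration):

```python
import numpy as np

# Hypothetical 512x512 weight, factorized as (16, 16) kron (32, 32)
rng = np.random.default_rng(0)
rank = 8

W1 = rng.standard_normal((16, 16))     # small full matrix
W2a = rng.standard_normal((32, rank))  # low-rank factors of W2
W2b = rng.standard_normal((rank, 32))
W2 = W2a @ W2b

# The Kronecker product rebuilds the full-size update
delta_W = np.kron(W1, W2)
assert delta_W.shape == (16 * 32, 16 * 32)  # (512, 512)

# Far fewer parameters than a rank-32 LoRA on the same layer
lokr_params = W1.size + W2a.size + W2b.size  # 256 + 256 + 256 = 768
lora_params = 2 * 512 * 32                   # 32768
print(lokr_params, lora_params)  # 768 32768
```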

Comparison table

| Feature | LoRA | LoHa | LoKr |
|---|---|---|---|
| Decomposition | Low-rank product | Hadamard product of two low-rank pairs | Kronecker product |
| Parameters (same rank) | Baseline | ~2× LoRA | Smaller than LoRA |
| File size | Medium | Larger | Smaller |
| Weight structure complexity | Standard | Higher | Different |
| Experimental | No | Yes | Yes |

Supported architectures

Both LoHa and LoKr automatically detect the model architecture and apply appropriate defaults.
| Architecture | Training targets |
|---|---|
| SDXL | Transformer2DModel (U-Net), CLIPAttention/CLIPMLP (text encoders). Conv2d layers in ResnetBlock2D, Downsample2D, Upsample2D are included when `conv_dim` is specified. |
| Anima | Block, PatchEmbed, TimestepEmbedding, FinalLayer (DiT); Qwen3Attention/Qwen3MLP (text encoder). Default `exclude_patterns` automatically skips modulation, normalization, embedder, and final_layer modules. |

Training

To use LoHa or LoKr, change --network_module in your training command. All other options (dataset config, optimizer, scheduler, etc.) work the same as standard LoRA.
accelerate launch \
  --num_cpu_threads_per_process 1 \
  --mixed_precision bf16 \
  sdxl_train_network.py \
    --pretrained_model_name_or_path path/to/sdxl.safetensors \
    --dataset_config path/to/dataset.toml \
    --mixed_precision bf16 \
    --fp8_base \
    --optimizer_type adamw8bit \
    --learning_rate 2e-4 \
    --gradient_checkpointing \
    --network_module networks.loha \
    --network_dim 32 \
    --network_alpha 16 \
    --max_train_epochs 16 \
    --save_every_n_epochs 1 \
    --output_dir path/to/output \
    --output_name my-loha

network_args options

Pass these via --network_args "key=value" on the command line.

Common options (both LoHa and LoKr)

| Option | Description |
|---|---|
| `verbose=True` | Display detailed information about the network modules at startup |
| `rank_dropout=0.1` | Apply dropout to the rank dimension during training |
| `module_dropout=0.1` | Randomly skip entire modules during training |
| `exclude_patterns=[r'...']` | Exclude modules matching these regex patterns (added to architecture defaults) |
| `include_patterns=[r'...']` | Override excludes: include modules matching these regex patterns even if they match `exclude_patterns` |
| `network_reg_lrs=regex1=lr1,regex2=lr2` | Set per-module learning rates using regex patterns |
| `network_reg_dims=regex1=dim1,regex2=dim2` | Set per-module dimensions (rank) using regex patterns |
| `loraplus_lr_ratio=4` | Enable LoRA+ with this learning rate ratio |

Conv2d support

By default, both methods target Linear and Conv2d 1×1 layers. To also train Conv2d 3×3+ layers (e.g., ResNet blocks in SDXL):
--network_args "conv_dim=16" "conv_alpha=8"
To enable Tucker decomposition for those Conv2d 3×3+ layers:
--network_args "conv_dim=16" "conv_alpha=8" "use_tucker=True"
  • Without use_tucker: Kernel dimensions are flattened into the input dimension (flat mode).
  • With use_tucker=True: A separate Tucker tensor handles kernel dimensions — more parameter-efficient in some cases.
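The parameter-count difference between the two modes can be worked out directly. The channel counts below are hypothetical; in flat mode one factor of each pair absorbs the kernel, while Tucker mode keeps the factors kernel-free and adds a small (rank, rank, k, k) core tensor:

```python
# Hypothetical Conv2d 3x3 layer: 320 in/out channels, conv_dim=16
in_ch, out_ch, k, r = 320, 320, 3, 16

# Flat mode: the kernel is folded into the input dimension,
# so Wb has shape (r, in_ch * k * k)
flat_params = out_ch * r + r * (in_ch * k * k)

# Tucker mode: a small (r, r, k, k) core tensor handles the kernel,
# while Wa and Wb stay kernel-free
tucker_params = (r * r * k * k) + r * out_ch + r * in_ch

print(flat_params, tucker_params)  # 51200 12544
```

These counts are per matrix pair; LoHa uses two pairs per layer, so both numbers double there.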

LoKr-specific: factor

LoKr factorizes weight dimensions. The factor option controls how the split is done:
| Value | Behavior |
|---|---|
| `-1` (default) | Automatically find balanced factors. E.g., dimension 512 → (16, 32). |
| Positive integer N | Force the split using this value. E.g., `factor=4` → dimension 512 splits as (4, 128). |
--network_args "factor=4"
When network_dim is large enough relative to the factorized dimensions, LoKr uses a full matrix instead of low-rank decomposition for the second factor. A warning is logged in this case.
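The balanced split described above can be sketched as a divisor search. This is an illustrative function, not the exact sd-scripts implementation:

```python
import math

def factorization(dim: int, factor: int = -1) -> tuple[int, int]:
    """Split dim into (m, n) with m <= n and m * n == dim.

    A sketch of the factor-selection behavior described above;
    the real implementation may differ in details.
    """
    if factor > 0:
        # Forced split: use the given factor when it divides dim
        if dim % factor != 0:
            raise ValueError(f"{factor} does not divide {dim}")
        m, n = factor, dim // factor
        return (m, n) if m <= n else (n, m)
    # factor == -1: search downward from sqrt(dim) for the
    # most balanced pair of divisors
    m = math.isqrt(dim)
    while dim % m != 0:
        m -= 1
    return m, dim // m

print(factorization(512))     # (16, 32)
print(factorization(512, 4))  # (4, 128)
```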

Anima-specific: train_llm_adapter

For Anima models, you can also train the LLM adapter modules:
--network_args "train_llm_adapter=True"
This includes LLMAdapterTransformerBlock modules as training targets.

Inference

Trained LoHa and LoKr weights are saved in safetensors format, just like standard LoRA weights.
python gen_img.py \
  --ckpt path/to/sdxl.safetensors \
  --network_module networks.loha \
  --network_weights path/to/my-loha.safetensors \
  --prompt "your prompt here"
Replace networks.loha with networks.lokr for LoKr weights.
