LoRA (Low-Rank Adaptation) is the primary fine-tuning method in sd-scripts. You select a specific LoRA module by passing its Python module path to --network_module. Different modules target different model architectures.

Available modules

| Module | File | Target architecture |
| --- | --- | --- |
| networks.lora | networks/lora.py | SD 1.x / 2.x |
| networks.lora_flux | networks/lora_flux.py | FLUX.1 |
| networks.lora_sd3 | networks/lora_sd3.py | SD3 / SD3.5 |
| networks.lora_lumina | networks/lora_lumina.py | Lumina |
| networks.lora_hunyuan_image | networks/lora_hunyuan_image.py | HunyuanImage |
| networks.lora_anima | networks/lora_anima.py | Anima |
| networks.dylora | networks/dylora.py | SD 1.x / 2.x (DyLoRA) |
| networks.lora_fa | networks/lora_fa.py | SD (FA variant) |
Pass the module name to your training script with --network_module:
accelerate launch train_network.py \
  --network_module networks.lora \
  --network_dim 32 \
  --network_alpha 16 \
  ...

Core network args

You pass additional options to the network module using --network_args. Each value is a quoted key=value string.

Conv2d extension

By default, networks.lora targets only Linear and Conv2d 1×1 layers. To extend LoRA to Conv2d 3×3 layers (e.g., ResNet blocks in SD 1.x/2.x), set conv_dim and conv_alpha.
conv_dim
int
Rank for Conv2d 3×3 layers. When set, LoRA is applied to ResnetBlock2D, Downsample2D, and Upsample2D modules in addition to attention layers. Has no effect if omitted.
conv_alpha
float
default:"1.0"
Alpha scaling value for Conv2d 3×3 LoRA layers. Defaults to 1.0 when conv_dim is set but conv_alpha is not.
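As with the main LoRA, conv_alpha acts as a scale relative to the rank: the low-rank update is multiplied by alpha / dim. A quick sketch of that scaling (assuming the standard LoRA convention; not sd-scripts' exact code):

```python
def lora_scale(alpha: float, dim: int) -> float:
    # Standard LoRA scaling: the low-rank update is multiplied by alpha / dim.
    return alpha / dim

# conv_dim=16, conv_alpha=8 -> the 3x3 conv update is scaled by 0.5
print(lora_scale(8, 16))  # 0.5
```

Keeping alpha at half the rank, as here, preserves the same effective scale when you change the rank.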

LoRA+ differential learning rates

LoRA+ applies a higher learning rate to the up-projection matrix (lora_up) than to the down-projection matrix (lora_down), which can improve convergence.
loraplus_lr_ratio
float
Multiplier applied to lora_up weights relative to the base learning rate. Applies to both UNet and text encoder. For example, loraplus_lr_ratio=4 sets the up-projection learning rate to 4× the base.
loraplus_unet_lr_ratio
float
LoRA+ ratio applied only to UNet modules. Overrides loraplus_lr_ratio for the UNet.
loraplus_text_encoder_lr_ratio
float
LoRA+ ratio applied only to text encoder modules. Overrides loraplus_lr_ratio for the text encoder.
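The precedence between the three ratios can be summarized in a few lines. This sketch (a hypothetical helper, not sd-scripts code) shows the effective learning rates for a base LR of 1e-4:

```python
def loraplus_effective_lrs(base_lr, lr_ratio=None,
                           unet_lr_ratio=None, text_encoder_lr_ratio=None):
    # Per-target ratios override the global loraplus_lr_ratio; lora_down
    # always trains at the base learning rate. (Sketch of the precedence
    # described above, not sd-scripts' implementation.)
    unet = unet_lr_ratio if unet_lr_ratio is not None else (lr_ratio or 1.0)
    te = text_encoder_lr_ratio if text_encoder_lr_ratio is not None else (lr_ratio or 1.0)
    return {
        "unet.lora_down": base_lr,
        "unet.lora_up": base_lr * unet,
        "text_encoder.lora_down": base_lr,
        "text_encoder.lora_up": base_lr * te,
    }

lrs = loraplus_effective_lrs(1e-4, lr_ratio=4, text_encoder_lr_ratio=2)
# unet.lora_up trains at 4e-4, text_encoder.lora_up at 2e-4
```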

Per-block dimensions

You can assign different ranks to each block in the UNet rather than using a single global rank. This lets you allocate more capacity to blocks that matter most for your use case.
block_dims
string
Comma-separated list of integer ranks, one per UNet block. For SD 1.x/2.x, provide 25 values (12 down + 1 mid + 12 up). For SDXL, provide 23 values. Example: "4,4,4,4,8,8,8,8,16,16,16,16,16,16,16,16,16,16,16,8,8,8,8,4,4".
block_alphas
string
Comma-separated list of alpha values corresponding to each entry in block_dims. Defaults to the global --network_alpha for any block not explicitly set.
conv_block_dims
string
Comma-separated list of Conv2d 3×3 ranks per block, same length as block_dims. Requires block_dims to be set.
conv_block_alphas
string
Comma-separated list of Conv2d 3×3 alpha values per block, same length as block_dims.
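Since miscounting the list is an easy mistake, it can help to build the string programmatically. This sketch reproduces the 25-value example above, assuming the 12-down / 1-mid / 12-up block layout of SD 1.x/2.x:

```python
# Per-block ranks for SD 1.x/2.x: 12 down blocks, 1 mid block, 12 up blocks.
down = [4, 4, 4, 4, 8, 8, 8, 8, 16, 16, 16, 16]
mid  = [16]
up   = [16, 16, 16, 16, 16, 16, 8, 8, 8, 8, 4, 4]

dims = down + mid + up
assert len(dims) == 25  # SD 1.x/2.x expects exactly 25 values

block_dims = ",".join(str(d) for d in dims)
# pass as: --network_args "block_dims=<the printed string>"
print(block_dims)
```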

Dropout

rank_dropout
float
Probability of zeroing individual rank dimensions during training. Operates on the hidden state after lora_down. For example, rank_dropout=0.1 drops 10% of rank channels per forward pass. Has no effect at inference.
module_dropout
float
Probability of skipping an entire LoRA module for a given forward pass during training. When a module is dropped, the original pre-trained weight is used unchanged. For example, module_dropout=0.1 skips each module 10% of the time.
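A minimal sketch of what rank dropout does to the hidden state after lora_down, assuming the common dropout convention of rescaling surviving channels by 1/(1-p) (illustrative only, not sd-scripts' exact code):

```python
import random

def apply_rank_dropout(hidden, p, rng):
    # Zero each rank channel with probability p; rescale survivors by
    # 1/(1-p) so the expected magnitude is unchanged. Training only --
    # at inference the hidden state passes through untouched.
    return [0.0 if rng.random() < p else h / (1.0 - p) for h in hidden]

rng = random.Random(0)
out = apply_rank_dropout([1.0] * 10, 0.3, rng)
# each channel is either exactly 0.0 or scaled up to 1/0.7
```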

Additional options

use_tucker
bool
default:"False"
Enable Tucker decomposition for Conv2d 3×3 LoRA layers. Tucker decomposition factors the kernel dimensions through a separate tensor, which can be more parameter-efficient than flattening the kernel into the input dimension.
use_scalar
bool
default:"False"
Train an additional scalar parameter per LoRA module. The scalar multiplies the LoRA output, giving the optimizer more flexibility to adjust the effective magnitude of each module’s contribution.
train_norm
bool
default:"False"
Include normalization layers (e.g., LayerNorm, GroupNorm) as training targets in addition to Linear and Conv2d layers. This can improve fidelity for certain fine-tuning tasks but increases the risk of overfitting.
dora_wd
bool
default:"False"
Enable DoRA (Weight-Decomposed Low-Rank Adaptation). DoRA decomposes the weight update into a magnitude component and a direction component (the LoRA matrices), similar to weight normalization. This can improve fine-tuning quality, especially for tasks that require significant style changes.
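These flags are combined with the other options inside --network_args. For example, enabling DoRA together with Tucker-decomposed conv layers might look like this (values illustrative):

```shell
--network_args \
  "conv_dim=16" \
  "conv_alpha=8" \
  "use_tucker=True" \
  "dora_wd=True"
```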

DyLoRA

networks.dylora implements DyLoRA (Dynamic Low-Rank Adaptation), which trains a nested set of ranks simultaneously. After training you can extract a LoRA at any rank up to the maximum trained rank without retraining. Use networks.dylora in place of networks.lora and set the rank step with --network_args "unit=N":
unit
int
Rank step for DyLoRA training. Ranks are trained in increments of this value up to --network_dim. For example, with --network_dim 32 and unit=4, the module is trained at ranks 4, 8, 12, …, 32 simultaneously.
After training, use networks/extract_lora_from_dylora.py to extract a LoRA at a specific rank.
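Putting it together, DyLoRA training and extraction might look like the following (paths are illustrative, and the extraction flags are an assumption based on the sd-scripts README; check the script's --help for your version):

```shell
# train with a rank step of 4, up to rank 32
accelerate launch train_network.py \
  --network_module networks.dylora \
  --network_dim 32 \
  --network_args "unit=4" \
  ...

# extract a fixed-rank LoRA from the trained DyLoRA
# (flag names assumed from the sd-scripts README)
python networks/extract_lora_from_dylora.py \
  --model path/to/dylora.safetensors \
  --save_to path/to/lora-extracted.safetensors \
  --unit 16
```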

FLUX.1 LoRA (networks.lora_flux)

networks.lora_flux targets FLUX.1’s DoubleStreamBlock and SingleStreamBlock modules. It supports several FLUX-specific --network_args:
You can set separate ranks for different layer types within FLUX blocks:
| Arg | Layer type |
| --- | --- |
| img_attn_dim | Image attention in DoubleStreamBlock |
| txt_attn_dim | Text attention in DoubleStreamBlock |
| img_mlp_dim | Image MLP in DoubleStreamBlock |
| txt_mlp_dim | Text MLP in DoubleStreamBlock |
| img_mod_dim | Image modulation in DoubleStreamBlock |
| txt_mod_dim | Text modulation in DoubleStreamBlock |
| single_dim | Linear layers in SingleStreamBlock |
| single_mod_dim | Modulation in SingleStreamBlock |
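For example, to give the DoubleStreamBlock attention layers more capacity than the MLP and single-stream layers (values illustrative):

```shell
--network_module networks.lora_flux \
--network_dim 4 \
--network_args \
  "img_attn_dim=16" \
  "txt_attn_dim=16" \
  "img_mlp_dim=4" \
  "single_dim=4"
```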
Train only specific blocks using indices:
--network_args "train_double_block_indices=0-9" "train_single_block_indices=0-19"
train_blocks=double or train_blocks=single restricts training to only that block type.
By default, only the CLIP text encoder is trained. To also train the T5XXL encoder:
--network_args "train_t5xxl=True"
FLUX combines Q, K, and V into a single projection. Set split_qkv=True to train them with separate LoRA adapters:
--network_args "split_qkv=True"

Full example

The following command trains a LoRA for SD 1.x with Conv2d extension, LoRA+, and rank dropout:
accelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 train_network.py \
  --pretrained_model_name_or_path path/to/sd15.safetensors \
  --dataset_config path/to/dataset.toml \
  --mixed_precision bf16 \
  --optimizer_type adamw8bit \
  --learning_rate 1e-4 \
  --gradient_checkpointing \
  --network_module networks.lora \
  --network_dim 32 \
  --network_alpha 16 \
  --network_args \
    "conv_dim=16" \
    "conv_alpha=8" \
    "loraplus_lr_ratio=4" \
    "rank_dropout=0.1" \
    "dora_wd=True" \
  --max_train_epochs 10 \
  --save_every_n_epochs 1 \
  --output_dir path/to/output \
  --output_name my-lora
For FLUX.1, replace the script with flux_train_network.py, the model path with your FLUX checkpoint, and --network_module with networks.lora_flux.
