## Overview
The sd-scripts training scripts expose a large set of advanced options beyond the basic `--network_dim` and `--learning_rate` flags. This page covers the most impactful options for users who want precise control over training behavior.

Examples use `sdxl_train_network.py` for illustration, but most options also apply to `train_network.py`, `flux_train_network.py`, and `sd3_train_network.py`.

## Block-wise LoRA dimensions and alphas
### What block-wise LoRA does
By default, every LoRA module shares a single rank (`--network_dim`) and alpha (`--network_alpha`). Block-wise settings let you assign different ranks to different parts of the network, which is useful when you want to concentrate the adapter’s capacity in specific layers.

For SDXL, the U-Net is divided into 23 blocks. You pass a comma-separated list of 23 integers to `block_dims` and `block_alphas` via `--network_args`. Blocks you do not override fall back to the global `--network_dim` / `--network_alpha` values.
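A minimal sketch of a block-wise run. The model, dataset, and output paths are placeholders, and the 23-value pattern is only an illustration that concentrates rank in the middle of the U-Net:

```shell
# Higher rank in the mid blocks, lower rank at the edges (placeholder paths).
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora --network_dim 8 --network_alpha 4 \
  --network_args \
    "block_dims=2,2,2,2,4,4,4,4,8,8,8,12,8,8,8,4,4,4,4,2,2,2,2" \
    "block_alphas=1,1,1,1,2,2,2,2,4,4,4,6,4,4,4,2,2,2,2,1,1,1,1"
```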
### Adding Conv2d 3x3 block dimensions
LoRA normally targets the linear and 1x1 convolution layers. To also adapt Conv2d layers with 3x3 kernels block by block, set `conv_block_dims` and `conv_block_alphas`:
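A sketch with the same placeholder paths; the two conv lists again need 23 entries each:

```shell
# Block-wise ranks for the 3x3 Conv2d layers (placeholder paths).
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --network_args \
    "conv_block_dims=2,2,2,2,2,2,2,2,4,4,4,4,4,4,4,2,2,2,2,2,2,2,2" \
    "conv_block_alphas=1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,1,1"
```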
## LoRA+

### What LoRA+ does

LoRA+ applies a higher learning rate to the LoRA up-weights (matrix B) than to the down-weights (matrix A), which can speed up convergence without extra VRAM. The ratio between the two rates is passed through `--network_args`.
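For illustration, a hypothetical run with a 16x ratio between up- and down-weight learning rates (paths are placeholders):

```shell
# LoRA+: B matrices learn 16x faster than A matrices (placeholder paths).
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora --network_dim 16 \
  --learning_rate 1e-4 \
  --network_args "loraplus_lr_ratio=16"
```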
## DyLoRA
### Training with DyLoRA
Use `networks.dylora` as the network module and set the rank step with `unit`. DyLoRA trains nested ranks simultaneously, so after training you can use any rank from `unit` up to `network_dim` by adjusting the LoRA multiplier.
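A sketch with placeholder paths; `unit=4` with `network_dim=16` trains nested ranks 4, 8, 12, and 16:

```shell
# DyLoRA: one training run, multiple usable ranks (placeholder paths).
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.dylora \
  --network_dim 16 --network_args "unit=4"
```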
## Learning rate schedulers

Select a scheduler with `--lr_scheduler`.

### cosine

Smoothly decays the learning rate along a cosine curve from the initial value toward zero.
### cosine_with_restarts
Like `cosine`, but restarts the cosine curve N times throughout training (N is set with `--lr_scheduler_num_cycles`). Useful for escaping local minima.

### polynomial
Decays the learning rate along a polynomial curve whose exponent is set by `--lr_scheduler_power`; a power of 1.0 is equivalent to linear decay.

### constant_with_warmup
Holds the learning rate constant after a linear warmup from zero; set the warmup length with `--lr_warmup_steps`.

### Warmup ratio
If the value passed to `--lr_warmup_steps` is less than 1, it is interpreted as a fraction of the total number of training steps:
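As a quick sanity check of the fractional form (the 2000-step total is hypothetical), `--lr_warmup_steps 0.05` over 2000 steps resolves to 100 warmup steps:

```shell
# --lr_warmup_steps 0.05 with 2000 total steps → 0.05 * 2000 = 100 steps.
total_steps=2000
warmup_steps=$(( total_steps * 5 / 100 ))   # integer math for the 0.05 fraction
echo "$warmup_steps"
```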
## Optimizer options

Select an optimizer with `--optimizer_type`.

### AdamW8bit (recommended default)

AdamW with 8-bit optimizer states from the `bitsandbytes` library. Good balance between stability and VRAM usage.
### Adafactor
A memory-efficient optimizer suited to low-VRAM setups. To let Adafactor manage its own learning rate, set `relative_step=True` and use the `adafactor` scheduler.
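A sketch of the self-scheduling setup (placeholder paths):

```shell
# Adafactor with its built-in schedule (placeholder paths).
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --optimizer_type Adafactor \
  --optimizer_args "relative_step=True" "scale_parameter=True" "warmup_init=True" \
  --lr_scheduler adafactor
```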
### Lion
A momentum-only optimizer that uses less state memory than AdamW. Requires `pip install lion-pytorch`.
### Prodigy
A self-tuning optimizer from the `prodigyopt` package: set the learning rate to `1.0` and let Prodigy tune it during training.
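A hedged sketch (placeholder paths; requires `pip install prodigyopt`, and the optimizer arguments shown are common choices, not requirements):

```shell
# Prodigy: fixed LR of 1.0, the optimizer adapts the step size itself.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --optimizer_type Prodigy --learning_rate 1.0 \
  --optimizer_args "weight_decay=0.01" "use_bias_correction=True"
```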
### Passing extra optimizer arguments
Use `--optimizer_args` to pass `key=value` pairs to the optimizer:
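For example, tuning AdamW8bit's weight decay and betas (placeholder paths; the specific values are illustrative):

```shell
# Each key=value pair is forwarded to the optimizer's constructor.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --optimizer_type AdamW8bit \
  --optimizer_args "weight_decay=0.1" "betas=0.9,0.99"
```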
## Mixed precision

### fp16 vs bf16
Both `fp16` and `bf16` reduce VRAM usage compared to full float32 training. Select one with `--mixed_precision`.

| Format | Dynamic range | Precision | Best for |
|---|---|---|---|
| `fp16` | Smaller | Higher | SD 1.x/2.x, older GPUs |
| `bf16` | Larger | Lower | SDXL, FLUX.1, SD3; RTX 3000+, A100 |
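A minimal sketch selecting bf16 (placeholder paths):

```shell
# bf16 mixed precision, appropriate for SDXL on RTX 3000+ / A100 GPUs.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --mixed_precision bf16
```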
### Full half-precision training
`--full_fp16` and `--full_bf16` store gradients in half precision as well, saving additional VRAM at some cost to numerical stability; when using them, keep gradient clipping enabled with `--max_grad_norm=1.0`.
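A sketch of a full-bf16 run (placeholder paths):

```shell
# Full bf16: weights, activations, and gradients all in bf16.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --mixed_precision bf16 --full_bf16 --max_grad_norm 1.0
```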
### FP8 base model (experimental)
`--fp8_base` loads the base model weights in FP8 for further VRAM savings; `--fp8_base_unet` loads only the U-Net in FP8, leaving text encoders in the default precision.
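A hedged sketch (placeholder paths; FP8 loading is experimental and needs a recent PyTorch):

```shell
# Experimental: base weights in FP8, computation still in bf16.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --mixed_precision bf16 --fp8_base
```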
## Gradient checkpointing

### Basic gradient checkpointing

Enable `--gradient_checkpointing` to recompute activations during the backward pass instead of storing them, trading extra compute for a large VRAM reduction:
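A minimal sketch (placeholder paths):

```shell
# Slower per step, but substantially lower activation memory.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --gradient_checkpointing
```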
## Gradient accumulation
Accumulate gradients over several steps before each optimizer update with `--gradient_accumulation_steps`; this simulates a larger batch on limited VRAM. The effective batch size is `train_batch_size` × `gradient_accumulation_steps`.
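For example, `--train_batch_size 2 --gradient_accumulation_steps 4` (hypothetical values) updates the weights once every 4 steps with an effective batch of 8:

```shell
# Effective batch size = train_batch_size * gradient_accumulation_steps.
train_batch_size=2
gradient_accumulation_steps=4
effective_batch=$(( train_batch_size * gradient_accumulation_steps ))
echo "$effective_batch"
```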
## Gradient clipping
Gradients are clipped to a maximum norm set by `--max_grad_norm` (default 1.0). Set it to `0` to disable gradient clipping entirely.
## Saving checkpoints

### Periodic saving

Use `--save_every_n_epochs` or `--save_every_n_steps` to write intermediate checkpoints during training:
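A sketch (placeholder paths; the intervals are illustrative):

```shell
# Save a checkpoint after every epoch and every 500 optimizer steps.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --save_every_n_epochs 1 --save_every_n_steps 500
```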
### Keeping only the latest N checkpoints

To cap disk usage, `--save_last_n_epochs` (or `--save_last_n_steps`) removes older intermediate checkpoints so only the most recent N remain:
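A sketch (placeholder paths):

```shell
# Keep only the 3 most recent epoch checkpoints on disk.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --save_every_n_epochs 1 --save_last_n_epochs 3
```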
### Saving optimizer state for resume
Pass `--save_state` to save the optimizer and scheduler state alongside each checkpoint, or `--save_state_on_train_end` to save the state only at the end of a run.
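A sketch (placeholder paths):

```shell
# Write a resumable state directory alongside each epoch checkpoint.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --save_every_n_epochs 1 --save_state
```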
## Resuming training

### Resuming from a saved state
Use `--resume` to continue from a state directory saved by `--save_state`. This restores the optimizer state, step counter, and epoch counter.

Note that `--resume` restores the full training state. If you only want to start from existing LoRA weights (without restoring optimizer state), use `--network_weights` instead.
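A sketch; the state directory name below is a placeholder for whatever `--save_state` produced in your output directory:

```shell
# Continue a run from a previously saved state directory.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --resume out/mylora-000004-state
```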
### Starting from existing LoRA weights
Pass the weight file to `--network_weights`, and add `--dim_from_weights` to automatically read the rank from the weight file:
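A sketch (placeholder paths, including the LoRA filename):

```shell
# Fine-tune starting from an existing LoRA, inheriting its rank.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --network_weights existing_lora.safetensors --dim_from_weights
```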
## Noise techniques

### Noise offset

`--noise_offset` adds a small constant offset to the sampled noise, helping the model learn very dark and very bright images:
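A sketch (placeholder paths; 0.1 is a commonly used value, not a requirement):

```shell
# Noise offset improves reproduction of extreme brightness levels.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --noise_offset 0.1
```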
### Multi-resolution noise

Multi-resolution (pyramid) noise mixes noise at several spatial scales. Enable it with `--multires_noise_iterations` and control the per-level falloff with `--multires_noise_discount`:
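A sketch (placeholder paths; the values shown are common starting points):

```shell
# Pyramid noise: 6 levels, each level's amplitude discounted by 0.3.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --multires_noise_iterations 6 --multires_noise_discount 0.3
```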
### Min-SNR weighting (min_snr_gamma)

`--min_snr_gamma` re-weights the loss per timestep based on the signal-to-noise ratio, which can speed up and stabilize convergence; the original paper recommends a gamma of 5:
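A sketch (placeholder paths):

```shell
# Min-SNR loss weighting with the commonly recommended gamma of 5.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --min_snr_gamma 5
```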
### Input perturbation noise

`--ip_noise_gamma` perturbs the noised latents with a small amount of extra noise, which can reduce the mismatch between training and sampling:
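A sketch (placeholder paths; 0.1 is an illustrative value):

```shell
# Input perturbation noise on the noised latents.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --ip_noise_gamma 0.1
```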
## Network training scope
By default, both the U-Net and the text encoder receive LoRA modules. You can restrict training to one part with `--network_train_unet_only` or `--network_train_text_encoder_only`:
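A sketch (placeholder paths) that trains only the U-Net modules:

```shell
# Skip the text encoder(s); often preferred for style LoRAs on SDXL.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --network_train_unet_only
```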
## Weight norm scaling

Scale down LoRA weights whose norms grow too large during training with `--scale_weight_norms`, which helps control overfitting; a value of `1.0` is a reasonable starting point:
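A sketch (placeholder paths):

```shell
# Rescale LoRA weight norms that exceed the threshold of 1.0.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --scale_weight_norms 1.0
```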
## Differential LoRA (merging existing weights)
Use `--base_weights` to merge one or more existing LoRA files into the base model before starting a new training run. This lets you train the “difference” from an existing LoRA:
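A sketch (placeholder paths, including the merged LoRA's filename and strength):

```shell
# Merge an existing LoRA at 0.5 strength, then train the difference.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --base_weights style_lora.safetensors --base_weights_multiplier 0.5
```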
## Logging and tracking

### TensorBoard

Pass `--logging_dir` to enable TensorBoard logging (the default tracker):
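A sketch (placeholder paths):

```shell
# Write TensorBoard event files under ./logs.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --logging_dir logs --log_with tensorboard
```

View the logs afterwards with `tensorboard --logdir logs`.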
### Weights & Biases (wandb)
Select wandb with `--log_with wandb`; run `pip install wandb` before use.
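A sketch (placeholder paths; assumes you are already logged in to wandb):

```shell
# Send training metrics to Weights & Biases.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --logging_dir logs --log_with wandb
```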
### Logging the training config

Add `--log_config` to record the full set of training arguments alongside the run's metrics.
## Using a config file instead of command-line arguments
For long training commands, store all arguments in a TOML file and pass it with `--config_file`:
Add `--output_config` to dump the current command-line arguments to a TOML file you can reuse later.
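A sketch of the round trip (placeholder paths; my understanding is that `--output_config` writes the arguments to the path given by `--config_file` rather than starting training):

```shell
# Dump the current arguments to config.toml (run once).
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path sd_xl_base_1.0.safetensors \
  --dataset_config dataset.toml --output_dir out \
  --network_module networks.lora \
  --output_config --config_file config.toml

# Later runs need only the config file.
accelerate launch sdxl_train_network.py --config_file config.toml
```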