Anima support was added in sd-scripts v0.10.1. Model weights are available from the circlestone-labs/Anima repository on Hugging Face.
## Supported training methods
| Method | Script | Supported |
|---|---|---|
| LoRA | anima_train_network.py | Yes |
| Full fine-tuning | anima_train.py | Yes |
| LoHa / LoKr | anima_train_network.py | Yes (experimental) |
| Textual Inversion | — | No |
| ControlNet | — | No |
## Required model files

You need four components before training:

| Component | Description | Source |
|---|---|---|
| Anima DiT | Base DiT model .safetensors | circlestone-labs/Anima |
| Qwen3-0.6B | Text encoder (HuggingFace dir or .safetensors) | Qwen3-0.6B |
| Qwen-Image VAE | VAE model .safetensors or .pth | circlestone-labs/Anima |
| LLM Adapter | 6-layer Transformer bridge (optional, loaded from DiT if bundled) | Bundled in DiT file |
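As a sketch, the components above can be fetched with `huggingface-cli`. The local directory names below are placeholders, and the exact file layout of the `circlestone-labs/Anima` repository should be checked on Hugging Face first:

```shell
# Download the Anima DiT and Qwen-Image VAE from the Anima repository.
# (Directory names are illustrative; list the repo files on the Hub first.)
huggingface-cli download circlestone-labs/Anima --local-dir models/anima

# Download the Qwen3-0.6B text encoder as a HuggingFace model directory.
huggingface-cli download Qwen/Qwen3-0.6B --local-dir models/qwen3-0.6b
```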
## Training command
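A minimal LoRA launch might look like the following sketch. The model-path flag names (`--text_encoder`, `--vae`) and the file names are assumptions based on the component descriptions above, not confirmed by this document; run `python anima_train_network.py --help` for the authoritative argument names.

```shell
# Sketch of a LoRA training run. Paths are placeholders, and the
# --text_encoder / --vae flag names are assumptions to be checked with --help.
accelerate launch anima_train_network.py \
  --pretrained_model_name_or_path models/anima/anima.safetensors \
  --text_encoder models/qwen3-0.6b \
  --vae models/anima/qwen_image_vae.safetensors \
  --network_module networks.lora_anima \
  --network_dim 16 \
  --dataset_config dataset.toml \
  --timestep_sampling shift \
  --vae_chunk_size 64 \
  --mixed_precision bf16 \
  --output_dir output \
  --output_name anima-lora
```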
## Key Anima-specific arguments

- **DiT model path**: Path to the Anima DiT model `.safetensors` file. The ComfyUI format with a `net.` key prefix is supported.
- **Text encoder path**: Path to the Qwen3-0.6B text encoder. Can be a HuggingFace model directory or a single `.safetensors` file. The text encoder is always frozen during training.
- **VAE path**: Path to the Qwen-Image VAE `.safetensors` or `.pth` file. The architecture is fixed: `dim=96`, `z_dim=16`.
- **LLM adapter path**: Path to a separate LLM adapter weights file. If omitted, the adapter is loaded from the DiT file when the key `llm_adapter.out_proj.weight` exists.
- **`--timestep_sampling`**: Timestep sampling method. Options: `sigma`, `uniform`, `sigmoid`, `shift`, `flux_shift`. Same options as FLUX training.
- **Timestep shift**: Shift for the timestep distribution. Used when `--timestep_sampling=shift`.
- **`--vae_chunk_size`**: Process the VAE in chunks of this size to reduce memory usage. Recommended: `64`.
- **VAE cache disable**: Disables the VAE cache to reduce memory usage. Use alongside `--vae_chunk_size`.
- **Attention implementation**: Options: `torch`, `xformers`, `flash`, `sageattn`. Note: `sageattn` is inference-only and cannot be used for training.

## Network module

Use `networks.lora_anima` as the `--network_module`:

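For example, the network-related portion of a launch command could look like this. Only `--network_module networks.lora_anima` comes from this document; `--network_dim` and `--network_alpha` are standard sd-scripts network options shown for context:

```shell
# Select the Anima LoRA network module; dim/alpha values are illustrative.
accelerate launch anima_train_network.py \
  --network_module networks.lora_anima \
  --network_dim 16 \
  --network_alpha 16
```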