
Overview

Pass a TOML (or JSON) config file to any training script with --dataset_config:
accelerate launch train_network.py \
  --dataset_config dataset.toml \
  ...other args
The config file replaces the --train_data_dir, --reg_data_dir, and --in_json command-line arguments. When an option exists in both places, the config file value takes priority.

Configuration structure

Settings are organized into three scopes that cascade from general to specific:
[general]                  ← applies to all datasets and subsets
[[datasets]]               ← applies to one dataset
  [[datasets.subsets]]     ← applies to one image directory
  [[datasets.subsets]]
[[datasets]]
  [[datasets.subsets]]
More specific scopes override less specific ones. For example, a keep_tokens value in [[datasets.subsets]] overrides the value in [[datasets]], which overrides [general].

Complete example

[general]
shuffle_caption = true
caption_extension = ".txt"
keep_tokens = 1

# DreamBooth-style dataset at 512 × 512
[[datasets]]
resolution = 512
batch_size = 4
enable_bucket = true
keep_tokens = 2

  [[datasets.subsets]]
  image_dir = "/data/my_character"
  class_tokens = "sks girl"
  num_repeats = 10

  [[datasets.subsets]]
  image_dir = "/data/my_character2"
  class_tokens = "fuga boy"
  keep_tokens = 3          # overrides the dataset-level value of 2

  [[datasets.subsets]]
  is_reg = true
  image_dir = "/data/reg_human"
  class_tokens = "person"
  keep_tokens = 1

# Fine-tuning-style dataset at 768 × 768
[[datasets]]
resolution = [768, 768]
batch_size = 2

  [[datasets.subsets]]
  image_dir = "/data/my_finetuning"
  metadata_file = "/data/my_finetuning/metadata.json"
  # keep_tokens = 1  (inherited from [general])

[general] section

Options in [general] apply to every dataset and subset unless overridden at a lower scope.
general.shuffle_caption (boolean, default: false)
Randomly shuffle the comma-separated tags in each caption before training. Helps the model learn tags independently rather than as positional sequences.

general.caption_extension (string, default: ".txt")
File extension for caption sidecar files. Common values are ".txt" and ".caption".

general.keep_tokens (number, default: 0)
Number of tokens at the start of each caption to keep in place when shuffle_caption is enabled. Set to 1 to keep the trigger word first.
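For example, the trigger-word use case might look like this (the caption text in the comment is illustrative, not from a real dataset):

```toml
# With these settings, a caption such as "sks dog, sitting, outdoors, photo"
# always keeps "sks dog" as the first tag; the remaining tags are shuffled
# each training step.
[general]
shuffle_caption = true
keep_tokens = 1
```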
general.enable_bucket (boolean, default: false)
Enable aspect ratio bucketing across all datasets. Images are grouped into resolution buckets to preserve their original proportions.

general.resolution (number | [number, number])
Training resolution. Accepts a single integer (square) or a [width, height] pair. Can be overridden per dataset.

general.batch_size (number, default: 1)
Number of images per training step. Equivalent to --train_batch_size.

[[datasets]] options

Each [[datasets]] block defines one dataset. Subsets nested inside share these settings.

Resolution and batching

datasets.resolution (number | [number, number], required)
Training resolution for this dataset. Use a single integer for a square (e.g. 512) or a [width, height] array for a rectangle (e.g. [768, 512]).

datasets.batch_size (number, default: 1)
Images per training step for this dataset. Equivalent to --train_batch_size.

Aspect ratio bucketing

datasets.enable_bucket (boolean, default: false)
Enable aspect ratio bucketing for this dataset. When enabled, images are resized to the nearest bucket resolution to preserve proportions.

datasets.bucket_reso_steps (number, default: 64)
Step size in pixels between bucket resolutions. All min_bucket_reso and max_bucket_reso values must be divisible by this number.

datasets.min_bucket_reso (number, default: 256)
Minimum bucket resolution (shortest side). Must be divisible by bucket_reso_steps.

datasets.max_bucket_reso (number, default: 1024)
Maximum bucket resolution (longest side). Must be divisible by bucket_reso_steps.

datasets.bucket_no_upscale (boolean, default: false)
When true, images smaller than a bucket are not upscaled to fill it. Recommended for datasets that mix large and small images.

datasets.skip_image_resolution (number | [number, number])
Skip images whose original area is at or below this resolution. Useful when the same directory is shared across multiple datasets at different resolutions; it prevents small images from appearing in high-resolution datasets.

[[datasets.subsets]] options

Each subset points to one image directory. Multiple subsets can belong to the same dataset.

Common options

datasets.subsets.image_dir (string, required)
Absolute path to the image directory. Images must be placed directly inside this directory; subdirectories are not scanned.

datasets.subsets.num_repeats (number, default: 1)
Number of times to repeat each image per epoch. Equivalent to --dataset_repeats for fine-tuning. Use higher values for small subsets to balance training time.

datasets.subsets.flip_aug (boolean, default: false)
Randomly flip images horizontally during training. Do not use for asymmetric subjects (text, faces, characters with distinctive left/right features).

datasets.subsets.color_aug (boolean, default: false)
Apply random color jitter during training. Incompatible with latent caching.

datasets.subsets.shuffle_caption (boolean, default: false)
Shuffle caption tags for images in this subset. Overrides the [general] setting.

datasets.subsets.keep_tokens (number, default: 0)
Number of tags at the start of each caption to keep fixed when shuffling. Overrides higher-scope settings.

datasets.subsets.keep_tokens_separator (string)
A delimiter that splits a caption into a fixed prefix, a shuffled/dropped middle, and a fixed suffix. For example, with "|||", the caption "trigger ||| tag1, tag2 ||| quality tags" keeps trigger and quality tags fixed while shuffling the middle.
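As a sketch, the separator could be wired into a subset like this (the directory path and caption text are placeholders):

```toml
# Illustrative subset using "|||" as the fixed/shuffled boundary.
# A caption file containing:
#   1girl, sks ||| blue hair, smile, standing ||| masterpiece, best quality
# keeps "1girl, sks" first and "masterpiece, best quality" last, while only
# the middle tags are shuffled (or dropped, if dropout is enabled).
[[datasets.subsets]]
image_dir = "/data/my_character"   # hypothetical path
shuffle_caption = true
keep_tokens_separator = "|||"
```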
datasets.subsets.caption_extension (string, default: ".txt")
Caption file extension for this subset.

datasets.subsets.caption_prefix (string)
String prepended to every caption. Included when shuffling.

datasets.subsets.caption_suffix (string)
String appended to every caption. Included when shuffling.

datasets.subsets.caption_separator (string, default: ", ")
Separator between tags in the caption. Normally you do not need to change this.

datasets.subsets.secondary_separator (string)
An additional separator. Tags grouped by this separator are treated as a single unit for shuffling and dropout. For example, "sky;;;cloud;;;day" with secondary_separator = ";;;" becomes "sky,cloud,day" and is shuffled or dropped as one tag.
datasets.subsets.enable_wildcard (boolean, default: false)
Enable wildcard and multi-line caption notation. With wildcards, {simple|white} background randomly picks one value. With multi-line captions, one line is selected per step.
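A minimal sketch of both notations (path and captions are illustrative):

```toml
# A caption file such as:
#   1girl, {red|blue|green} hair, standing
# picks one of "red", "blue", "green" each step. A multi-line caption file:
#   a photo of sks dog
#   sks dog, photograph
# has one line chosen at random per step.
[[datasets.subsets]]
image_dir = "/data/my_character"   # hypothetical path
enable_wildcard = true
```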
datasets.subsets.random_crop (boolean, default: false)
Randomly crop images instead of center-cropping. Cannot be used with enable_bucket.

datasets.subsets.cache_info (boolean, default: false)
Cache image dimensions and captions to metadata_cache.json in image_dir. Speeds up subsequent runs on large datasets.

DreamBooth-specific options

datasets.subsets.class_tokens (string)
Class tokens (trigger words) for this subset. Used as the caption when no caption file exists for an image. If neither class_tokens nor a caption file is found for an image, training will error.

datasets.subsets.is_reg (boolean, default: false)
Mark this subset as a regularization (prior-preservation) subset. Regularization images are used to prevent language drift and are not the target of fine-tuning.

Fine-tuning-specific options

datasets.subsets.metadata_file (string, required for fine-tuning subsets)
Path to the JSON metadata file for this subset. The file maps image paths to captions and tags. Equivalent to --in_json.

Caption dropout options

These options control caption dropout, which trains the model to work with and without captions.
datasets.subsets.caption_dropout_rate (number, default: 0)
Probability (0–1) that the entire caption is dropped for a given image step.

datasets.subsets.caption_dropout_every_n_epochs (number)
Drop all captions every N epochs.

datasets.subsets.caption_tag_dropout_rate (number, default: 0)
Probability (0–1) that each individual tag is dropped from the caption.
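The three options can be combined in one subset. A sketch with illustrative values and a placeholder path:

```toml
# Drop the whole caption 5% of the time, and independently drop each
# individual tag 10% of the time, for images in this subset only.
[[datasets.subsets]]
image_dir = "/data/my_dataset"   # hypothetical path
caption_dropout_rate = 0.05
caption_tag_dropout_rate = 0.1
```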

Dataset style examples

DreamBooth style

Use when you have images in a directory and want to associate them with a trigger word. Caption files are optional.
[general]
shuffle_caption = true
keep_tokens = 1

[[datasets]]
resolution = 512
batch_size = 4
enable_bucket = true
min_bucket_reso = 256
max_bucket_reso = 1024

  [[datasets.subsets]]
  image_dir = "/data/sks_dog"
  class_tokens = "sks dog"
  num_repeats = 10
  flip_aug = true

  [[datasets.subsets]]
  is_reg = true
  image_dir = "/data/reg_dog"
  class_tokens = "dog"
  num_repeats = 1

Fine-tuning style

Use when you have a pre-built metadata JSON file (generated by merge_captions_to_metadata.py or similar).
[general]
shuffle_caption = true
keep_tokens = 1

[[datasets]]
resolution = 1024
batch_size = 2
enable_bucket = true
min_bucket_reso = 512
max_bucket_reso = 2048
bucket_reso_steps = 64

  [[datasets.subsets]]
  image_dir = "/data/my_dataset"
  metadata_file = "/data/my_dataset/metadata.json"
  num_repeats = 1

Mixed style (DreamBooth + fine-tuning)

Both dataset styles can coexist in a single config. Each style must be in its own [[datasets]] block.
[general]
shuffle_caption = true
caption_extension = ".txt"
keep_tokens = 1

# DreamBooth-style dataset at 512 × 512
[[datasets]]
resolution = 512
batch_size = 4

  [[datasets.subsets]]
  image_dir = "/data/my_character"
  class_tokens = "sks girl"
  num_repeats = 10

# Fine-tuning-style dataset at 768 × 768
[[datasets]]
resolution = [768, 768]
batch_size = 2

  [[datasets.subsets]]
  image_dir = "/data/general_images"
  metadata_file = "/data/general_images/metadata.json"

Multi-resolution with skip_image_resolution

Train the same images at multiple resolutions and exclude small images from high-resolution datasets:
[general]
enable_bucket = true
bucket_no_upscale = true
max_bucket_reso = 1536

[[datasets]]
resolution = 768
  [[datasets.subsets]]
  image_dir = "/data/my_images"

[[datasets]]
resolution = 1024
skip_image_resolution = 768
  [[datasets.subsets]]
  image_dir = "/data/my_images"

[[datasets]]
resolution = 1280
skip_image_resolution = 1024
  [[datasets.subsets]]
  image_dir = "/data/my_images"

Duplicate subset handling

If two subsets in the same dataset point to the same image_dir (DreamBooth) or metadata_file (fine-tuning), the second is ignored. Subsets in different datasets pointing to the same directory are not considered duplicates and are both used — this is how multi-resolution training works.
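To illustrate the within-dataset case, assuming placeholder paths:

```toml
# Both subsets in this dataset point at the same image_dir, so the second
# one is ignored; its num_repeats = 1 never takes effect.
[[datasets]]
resolution = 512

  [[datasets.subsets]]
  image_dir = "/data/my_images"
  num_repeats = 10

  [[datasets.subsets]]
  image_dir = "/data/my_images"   # duplicate of the subset above: ignored
  num_repeats = 1
```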

Command-line arguments overridden by config

When --dataset_config is provided, these command-line arguments are ignored entirely:
  • --train_data_dir
  • --reg_data_dir
  • --in_json
For other overlapping options (e.g. --resolution, --batch_size, --shuffle_caption), the config file value takes priority over the command-line value.

Common errors

Error: required key not provided @ data['datasets'][0]['subsets'][0]['image_dir']
Cause: image_dir is missing from a subset

Error: expected int for dictionary value
Cause: a numeric option has the wrong type (e.g. a string instead of a number)

Error: extra keys not allowed
Cause: an option name is misspelled or not supported at that scope level
