Dataset Configuration

Overview

Pass a TOML (or JSON) config file to any training script with --dataset_config:

accelerate launch train_network.py \
  --dataset_config dataset.toml \
  ...other args

The config file replaces the --train_data_dir, --reg_data_dir, and --in_json command-line arguments. When an option exists in both places, the config file value takes priority.

Configuration structure

Settings are organized into three scopes that cascade from general to specific:

[general]                  ← applies to all datasets and subsets
[[datasets]]               ← applies to one dataset
  [[datasets.subsets]]     ← applies to one image directory
  [[datasets.subsets]]
[[datasets]]
  [[datasets.subsets]]

More specific scopes override less specific ones. For example, a keep_tokens value in [[datasets.subsets]] overrides the value in [[datasets]], which overrides [general].
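The cascade can be sketched in a few lines of Python. This is an illustration of the lookup order only, not the library's actual loader; `resolve_option` and the plain dictionaries standing in for the three scopes are hypothetical names.

```python
from typing import Any


def resolve_option(name: str, general: dict, dataset: dict, subset: dict,
                   default: Any = None) -> Any:
    """Return the value from the most specific scope that sets `name`."""
    # Check the most specific scope first, then fall back to broader ones.
    for scope in (subset, dataset, general):
        if name in scope:
            return scope[name]
    return default


# Hypothetical scopes: [general], one [[datasets]] block, two subsets.
general = {"keep_tokens": 1}
dataset = {"keep_tokens": 2}
subset_a = {}                    # sets nothing, so it inherits from the dataset
subset_b = {"keep_tokens": 3}    # overrides the dataset-level value

print(resolve_option("keep_tokens", general, dataset, subset_a))  # 2
print(resolve_option("keep_tokens", general, dataset, subset_b))  # 3
print(resolve_option("keep_tokens", general, {}, {}))             # 1
```

A subset that sets nothing inherits the nearest value above it, which is why the first lookup returns the dataset-level 2 rather than the general 1.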

Complete example

[general]
shuffle_caption = true
caption_extension = ".txt"
keep_tokens = 1

# DreamBooth-style dataset at 512 × 512
[[datasets]]
resolution = 512
batch_size = 4
enable_bucket = true
keep_tokens = 2

  [[datasets.subsets]]
  image_dir = "/data/my_character"
  class_tokens = "sks girl"
  num_repeats = 10

  [[datasets.subsets]]
  image_dir = "/data/my_character2"
  class_tokens = "fuga boy"
  keep_tokens = 3          # overrides the dataset-level value of 2

  [[datasets.subsets]]
  is_reg = true
  image_dir = "/data/reg_human"
  class_tokens = "person"
  keep_tokens = 1

# Fine-tuning-style dataset at 768 × 768
[[datasets]]
resolution = [768, 768]
batch_size = 2

  [[datasets.subsets]]
  image_dir = "/data/my_finetuning"
  metadata_file = "/data/my_finetuning/metadata.json"
  # keep_tokens = 1  (inherited from [general])

[general] section

Options in [general] apply to every dataset and subset unless overridden at a lower scope.

general.shuffle_caption

boolean

default:"false"

Randomly shuffle the comma-separated tags in each caption before training. Helps the model learn tags independently rather than as positional sequences.

general.caption_extension

string

default:".txt"

File extension for caption sidecar files. Common values are ".txt" and ".caption".

general.keep_tokens

number

default:"0"

Number of comma-separated tags (not tokenizer tokens) at the start of each caption to keep in place when shuffle_caption is enabled. Set to 1 to keep the trigger word first.
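The interaction between shuffle_caption and keep_tokens can be sketched as follows. This is a minimal illustration, assuming keep_tokens counts comma-separated tags; `shuffle_tags` is a hypothetical helper, not a function from the training scripts.

```python
import random
from typing import Optional


def shuffle_tags(caption: str, keep_tokens: int = 0,
                 rng: Optional[random.Random] = None) -> str:
    """Shuffle comma-separated tags, leaving the first `keep_tokens` in place."""
    rng = rng or random.Random()
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    fixed, rest = tags[:keep_tokens], tags[keep_tokens:]
    rng.shuffle(rest)  # only the tags after the fixed prefix move
    return ", ".join(fixed + rest)


# The trigger word "sks girl" stays first; the other tags change order.
print(shuffle_tags("sks girl, smile, outdoors, red dress", keep_tokens=1))
```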

general.enable_bucket

boolean

default:"false"

Enable aspect ratio bucketing across all datasets. Images are grouped into resolution buckets to preserve their original proportions.

general.resolution

number | [number, number]

Training resolution. Accepts a single integer (square) or a [width, height] pair. Can be overridden per dataset.

general.batch_size

number

default:"1"

Number of images per training step. Equivalent to --train_batch_size.

[[datasets]] options

Each [[datasets]] block defines one dataset. Subsets nested inside share these settings.

Resolution and batching

datasets.resolution

number | [number, number]

required

Training resolution for this dataset. Use a single integer for a square (e.g. 512) or a [width, height] array for a rectangle (e.g. [768, 512]).
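Both accepted forms describe a width and height pair. A small sketch of how the two forms might be normalized, where `normalize_resolution` is a hypothetical helper used only for illustration:

```python
from typing import Sequence, Tuple, Union


def normalize_resolution(value: Union[int, Sequence[int]]) -> Tuple[int, int]:
    """Turn `512` or `[768, 512]` into a (width, height) tuple."""
    if isinstance(value, int):
        return (value, value)  # a single integer means a square
    width, height = value      # raises ValueError unless there are exactly two items
    return (int(width), int(height))


print(normalize_resolution(512))         # (512, 512)
print(normalize_resolution([768, 512]))  # (768, 512)
```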

datasets.batch_size

number

default:"1"

Images per training step for this dataset. Equivalent to --train_batch_size.

Aspect ratio bucketing

datasets.enable_bucket

boolean

default:"false"

Enable aspect ratio bucketing for this dataset. When enabled, images are resized to the nearest bucket resolution to preserve proportions.
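One way to picture "nearest bucket" is to compare aspect ratios. The sketch below assumes that the nearest bucket is the one whose width/height ratio is closest to the image's; the real selection rule may differ, and both `nearest_bucket` and the bucket list are hypothetical.

```python
from typing import List, Tuple

Size = Tuple[int, int]  # (width, height)


def nearest_bucket(image_size: Size, buckets: List[Size]) -> Size:
    """Pick the bucket whose aspect ratio is closest to the image's."""
    width, height = image_size
    aspect = width / height
    return min(buckets, key=lambda b: abs(b[0] / b[1] - aspect))


# Hypothetical buckets for a dataset trained around 512 x 512.
buckets = [(512, 512), (640, 384), (384, 640)]

print(nearest_bucket((1920, 1080), buckets))  # (640, 384)
print(nearest_bucket((500, 500), buckets))    # (512, 512)
```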

datasets.bucket_reso_steps

number

default:"64"

Step size in pixels between bucket resolutions. All min_bucket_reso and max_bucket_reso values must be divisible by this number.

datasets.min_bucket_reso

number

default:"256"

Minimum bucket resolution (shortest side). Must be divisible by bucket_reso_steps.

datasets.max_bucket_reso

number

default:"1024"

Maximum bucket resolution (longest side). Must be divisible by bucket_reso_steps.
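The divisibility rule is easy to check before launching a run. The sketch below is a hypothetical pre-flight check, not part of the training scripts; the extra test that the minimum does not exceed the maximum is an assumption added for illustration.

```python
from typing import List


def check_bucket_settings(min_bucket_reso: int = 256,
                          max_bucket_reso: int = 1024,
                          bucket_reso_steps: int = 64) -> List[str]:
    """Return a list of problems; an empty list means the settings are consistent."""
    problems = []
    if min_bucket_reso % bucket_reso_steps != 0:
        problems.append("min_bucket_reso is not divisible by bucket_reso_steps")
    if max_bucket_reso % bucket_reso_steps != 0:
        problems.append("max_bucket_reso is not divisible by bucket_reso_steps")
    if min_bucket_reso > max_bucket_reso:  # assumption, not stated in the docs
        problems.append("min_bucket_reso is larger than max_bucket_reso")
    return problems


def candidate_sides(min_bucket_reso: int = 256,
                    max_bucket_reso: int = 1024,
                    bucket_reso_steps: int = 64) -> List[int]:
    """Side lengths from the minimum to the maximum in steps of bucket_reso_steps."""
    return list(range(min_bucket_reso, max_bucket_reso + 1, bucket_reso_steps))


print(check_bucket_settings())                     # []
print(check_bucket_settings(min_bucket_reso=250))  # one problem reported
print(len(candidate_sides()))                      # 13
```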

datasets.bucket_no_upscale

boolean

default:"false"

When true, images smaller than a bucket are not upscaled to fill it. Recommended for datasets that mix large and small images.

datasets.skip_image_resolution

number | [number, number]

Skip images whose original pixel area is at or below the area of this resolution. Useful when the same directory is shared across multiple datasets at different resolutions: it prevents small images from appearing in high-resolution datasets.
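The area comparison can be sketched as follows. `should_skip` is a hypothetical helper; it assumes that "at or below" means the image's width times height is less than or equal to the area of the configured resolution.

```python
from typing import Sequence, Tuple, Union


def should_skip(image_size: Tuple[int, int],
                skip_image_resolution: Union[int, Sequence[int]]) -> bool:
    """True when the image's pixel area is at or below the threshold area."""
    width, height = image_size
    if isinstance(skip_image_resolution, int):
        skip_w = skip_h = skip_image_resolution  # a single integer means a square
    else:
        skip_w, skip_h = skip_image_resolution
    return width * height <= skip_w * skip_h


print(should_skip((768, 768), 768))          # True  (equal area is skipped)
print(should_skip((1024, 768), 768))         # False (larger area is kept)
print(should_skip((512, 1024), [768, 768]))  # True  (524288 <= 589824)
```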

[[datasets.subsets]] options

Each subset points to one image directory. Multiple subsets can belong to the same dataset.

Common options

datasets.subsets.image_dir

string

required

Absolute path to the image directory. Images must be placed directly inside this directory — subdirectories are not scanned.

datasets.subsets.num_repeats

number

default:"1"

Number of times to repeat each image per epoch. Equivalent to --dataset_repeats for fine-tuning. Use higher values for small subsets to balance training time.
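A quick way to reason about balancing is to multiply each subset's image count by its num_repeats. The image counts below are hypothetical, and `samples_per_epoch` is an illustrative helper rather than anything in the training scripts.

```python
def samples_per_epoch(image_count: int, num_repeats: int = 1) -> int:
    """How many times images from one subset are drawn in a single epoch."""
    return image_count * num_repeats


# A small 20-image subset repeated 10 times contributes as many samples
# as a 200-image subset that is used once.
small = samples_per_epoch(20, num_repeats=10)
large = samples_per_epoch(200, num_repeats=1)
print(small, large)  # 200 200
```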

datasets.subsets.flip_aug

boolean

default:"false"

Randomly flip images horizontally during training. Do not use for asymmetric subjects (text, faces, characters with distinctive left/right features).

datasets.subsets.color_aug

boolean

default:"false"

Apply random color jitter during training. Incompatible with latent caching.

datasets.subsets.shuffle_caption

boolean

default:"false"

Shuffle caption tags for images in this subset. Overrides the [general] setting.

datasets.subsets.keep_tokens

number

default:"0"

Number of tags at the start of each caption to keep fixed when shuffling. Overrides higher-scope settings.

datasets.subsets.keep_tokens_separator

string

A delimiter that splits a caption into a fixed prefix, a shuffled/dropped middle, and a fixed suffix. For example, with "|||", the caption "trigger ||| tag1, tag2 ||| quality tags" keeps trigger and quality tags fixed while shuffling the middle.
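The three-part split can be sketched as below. This is an illustration only: `shuffle_with_separator` is a hypothetical helper, and its handling of captions with zero or one separator (everything shuffled, or a fixed prefix only) is an assumption.

```python
import random
from typing import Optional


def shuffle_with_separator(caption: str, separator: str = "|||",
                           rng: Optional[random.Random] = None) -> str:
    """Keep the prefix and suffix fixed and shuffle only the middle tags."""
    rng = rng or random.Random()
    parts = [p.strip() for p in caption.split(separator, 2)]
    prefix = parts[0] if len(parts) > 1 else ""
    middle = parts[1] if len(parts) > 1 else parts[0]
    suffix = parts[2] if len(parts) > 2 else ""
    tags = [t.strip() for t in middle.split(",") if t.strip()]
    rng.shuffle(tags)  # only the middle section is reordered
    pieces = ([prefix] if prefix else []) + tags + ([suffix] if suffix else [])
    return ", ".join(pieces)


# "trigger" stays first and "quality tags" stays last.
print(shuffle_with_separator("trigger ||| tag1, tag2 ||| quality tags"))
```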

datasets.subsets.caption_extension

string

default:".txt"

Caption file extension for this subset.

datasets.subsets.caption_prefix

string

String prepended to every caption. Included when shuffling.

datasets.subsets.caption_suffix

string

String appended to every caption. Included when shuffling.

datasets.subsets.caption_separator

string

default:", "

Separator between tags in the caption. Normally you do not need to change this.

datasets.subsets.secondary_separator

string

An additional separator. Tags grouped by this separator are treated as a single unit for shuffling and dropout. For example, "sky;;;cloud;;;day" with secondary_separator = ";;;" becomes "sky,cloud,day" and is shuffled or dropped as one tag.
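The grouping behavior can be sketched as a shuffle followed by a replacement. `shuffle_with_groups` is a hypothetical helper written for this illustration; it assumes the default ", " caption separator.

```python
import random
from typing import Optional


def shuffle_with_groups(caption: str, secondary_separator: str = ";;;",
                        rng: Optional[random.Random] = None) -> str:
    """Shuffle top-level tags, then expand each grouped tag in place."""
    rng = rng or random.Random()
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    rng.shuffle(tags)  # "sky;;;cloud;;;day" moves as a single unit
    expanded = [t.replace(secondary_separator, ",") for t in tags]
    return ", ".join(expanded)


# The group always appears together as "sky,cloud,day", wherever it lands.
print(shuffle_with_groups("1girl, sky;;;cloud;;;day, outdoors"))
```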

datasets.subsets.enable_wildcard

boolean

default:"false"

Enable wildcard and multi-line caption notation. With wildcards, {simple|white} background randomly picks one value. With multi-line captions, one line is selected per step.
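Both notations can be sketched together. The helper below is hypothetical and assumes that a line is chosen first and the wildcards on that line are expanded afterwards; the actual order in the training scripts is not stated here.

```python
import random
import re
from typing import Optional


def expand_caption(caption: str, rng: Optional[random.Random] = None) -> str:
    """Pick one line of a multi-line caption, then resolve {a|b} wildcards."""
    rng = rng or random.Random()
    lines = [line for line in caption.splitlines() if line.strip()]
    line = rng.choice(lines)
    # Replace each {option1|option2|...} group with one randomly chosen option.
    return re.sub(r"\{([^{}]+)\}",
                  lambda m: rng.choice(m.group(1).split("|")),
                  line)


print(expand_caption("{simple|white} background"))
print(expand_caption("1girl, smile\n1girl, {red|blue} dress"))
```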

datasets.subsets.random_crop

boolean

default:"false"

Randomly crop images instead of center-cropping. Cannot be used with enable_bucket.

datasets.subsets.cache_info

boolean

default:"false"

Cache image dimensions and captions to metadata_cache.json in image_dir. Speeds up subsequent runs on large datasets.

DreamBooth-specific options

datasets.subsets.class_tokens

string

Class tokens (trigger words) for this subset. Used as the caption when no caption file exists for an image. If neither class_tokens nor a caption file is found for an image, training will error.

datasets.subsets.is_reg

boolean

default:"false"

Mark this subset as a regularization (prior-preservation) subset. Regularization images are used to prevent language drift and are not the target of fine-tuning.

Fine-tuning-specific options

datasets.subsets.metadata_file

string

required

Path to the JSON metadata file for this subset. Required for fine-tuning-style subsets. The file maps image paths to captions and tags. Equivalent to --in_json.

Caption dropout options

These options control caption dropout, which trains the model to work with and without captions.

datasets.subsets.caption_dropout_rate

number

default:"0"

Probability (0–1) that the entire caption is dropped for a given image step.

datasets.subsets.caption_dropout_every_n_epochs

number

Drop the captions of all images in this subset once every N epochs.

datasets.subsets.caption_tag_dropout_rate

number

default:"0"

Probability (0–1) that each individual tag is dropped from the caption.
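The two rates act at different levels: one removes the whole caption, the other removes individual tags. A minimal sketch, assuming whole-caption dropout is checked first and ignoring any interaction with keep_tokens; `apply_dropout` is a hypothetical helper.

```python
import random
from typing import Optional


def apply_dropout(caption: str,
                  caption_dropout_rate: float = 0.0,
                  caption_tag_dropout_rate: float = 0.0,
                  rng: Optional[random.Random] = None) -> str:
    """Drop the whole caption or individual tags with the given probabilities."""
    rng = rng or random.Random()
    if rng.random() < caption_dropout_rate:
        return ""  # the entire caption is dropped for this step
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    # Each tag is dropped independently with probability caption_tag_dropout_rate.
    kept = [t for t in tags if rng.random() >= caption_tag_dropout_rate]
    return ", ".join(kept)


print(apply_dropout("sks dog, grass, sunny", caption_dropout_rate=0.1,
                    caption_tag_dropout_rate=0.2))
```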

Dataset style examples

DreamBooth style

Use when you have images in a directory and want to associate them with a trigger word. Caption files are optional.

[general]
shuffle_caption = true
keep_tokens = 1

[[datasets]]
resolution = 512
batch_size = 4
enable_bucket = true
min_bucket_reso = 256
max_bucket_reso = 1024

  [[datasets.subsets]]
  image_dir = "/data/sks_dog"
  class_tokens = "sks dog"
  num_repeats = 10
  flip_aug = true

  [[datasets.subsets]]
  is_reg = true
  image_dir = "/data/reg_dog"
  class_tokens = "dog"
  num_repeats = 1

Fine-tuning style

Use when you have a pre-built metadata JSON file (generated by merge_captions_to_metadata.py or similar).

[general]
shuffle_caption = true
keep_tokens = 1

[[datasets]]
resolution = 1024
batch_size = 2
enable_bucket = true
min_bucket_reso = 512
max_bucket_reso = 2048
bucket_reso_steps = 64

  [[datasets.subsets]]
  image_dir = "/data/my_dataset"
  metadata_file = "/data/my_dataset/metadata.json"
  num_repeats = 1

Mixed style (DreamBooth + fine-tuning)

Both dataset styles can coexist in a single config. Each style must be in its own [[datasets]] block.

[general]
shuffle_caption = true
caption_extension = ".txt"
keep_tokens = 1

# DreamBooth-style dataset at 512 × 512
[[datasets]]
resolution = 512
batch_size = 4

  [[datasets.subsets]]
  image_dir = "/data/my_character"
  class_tokens = "sks girl"
  num_repeats = 10

# Fine-tuning-style dataset at 768 × 768
[[datasets]]
resolution = [768, 768]
batch_size = 2

  [[datasets.subsets]]
  image_dir = "/data/general_images"
  metadata_file = "/data/general_images/metadata.json"

Multi-resolution with skip_image_resolution

Train the same images at multiple resolutions and exclude small images from high-resolution datasets:

[general]
enable_bucket = true
bucket_no_upscale = true
max_bucket_reso = 1536

[[datasets]]
resolution = 768
  [[datasets.subsets]]
  image_dir = "/data/my_images"

[[datasets]]
resolution = 1024
skip_image_resolution = 768
  [[datasets.subsets]]
  image_dir = "/data/my_images"

[[datasets]]
resolution = 1280
skip_image_resolution = 1024
  [[datasets.subsets]]
  image_dir = "/data/my_images"

Duplicate subset handling

If two subsets in the same dataset point to the same image_dir (DreamBooth) or metadata_file (fine-tuning), the second is ignored. Subsets in different datasets pointing to the same directory are not considered duplicates and are both used — this is how multi-resolution training works.
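The duplicate rule can be sketched as a per-dataset filter. This illustrates the behavior described above and is not the library's code; `drop_duplicate_subsets` and the dictionary layout are hypothetical.

```python
from typing import Dict, List


def drop_duplicate_subsets(datasets: List[Dict]) -> List[Dict]:
    """Within each dataset, keep only the first subset for a given source."""
    result = []
    for dataset in datasets:
        seen = set()  # reset per dataset: other datasets may reuse a directory
        kept = []
        for subset in dataset["subsets"]:
            if "metadata_file" in subset:   # fine-tuning-style subset
                key = ("metadata_file", subset["metadata_file"])
            else:                           # DreamBooth-style subset
                key = ("image_dir", subset["image_dir"])
            if key in seen:
                continue                    # later duplicates are ignored
            seen.add(key)
            kept.append(subset)
        result.append({**dataset, "subsets": kept})
    return result


datasets = [
    {"resolution": 768, "subsets": [{"image_dir": "/data/my_images"},
                                    {"image_dir": "/data/my_images"}]},
    {"resolution": 1024, "subsets": [{"image_dir": "/data/my_images"}]},
]
cleaned = drop_duplicate_subsets(datasets)
print([len(d["subsets"]) for d in cleaned])  # [1, 1]
```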

Command-line arguments overridden by config

When --dataset_config is provided, these command-line arguments are ignored entirely:

--train_data_dir
--reg_data_dir
--in_json

For other overlapping options (e.g. --resolution, --batch_size, --shuffle_caption), the config file value takes priority over the command-line value.

Common errors

| Error | Cause |
| --- | --- |
| `required key not provided @ data['datasets'][0]['subsets'][0]['image_dir']` | `image_dir` is missing from a subset |
| `expected int for dictionary value` | A numeric option has the wrong type (e.g. a string instead of a number) |
| `extra keys not allowed` | An option name is misspelled or not supported at that scope level |
