All parameters are passed directly to model.train(). They map to fields on TrainConfig in rfdetr.config.

Basic example

from rfdetr import RFDETRMedium

model = RFDETRMedium()

model.train(
    dataset_dir="path/to/dataset",
    epochs=100,
    batch_size=4,
    grad_accum_steps=4,
    lr=1e-4,
    output_dir="output",
)

Core training

dataset_dir
string
required
Path to your dataset directory. RF-DETR auto-detects whether it’s in COCO or YOLO format. See Dataset Formats.
output_dir
string
default:"output"
Directory where training artifacts (checkpoints, logs) are saved.
epochs
integer
default:"100"
Number of full passes over the training dataset.
resume
string
default:"None"
Path to a saved checkpoint to continue training. Restores model weights, optimizer state, and scheduler state.
seed
integer
default:"None"
Global random seed for reproducibility. None means no fixed seed is set.
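For instance, resuming an interrupted run with a fixed seed might look like this (the paths are placeholders for your own):

```python
from rfdetr import RFDETRMedium

model = RFDETRMedium()

# Continue training from a saved checkpoint; model weights, optimizer
# state, and scheduler state are all restored.
model.train(
    dataset_dir="path/to/dataset",
    output_dir="output",
    epochs=100,
    resume="output/checkpoint.pth",
    seed=42,
)
```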

Batch and memory

batch_size
integer | 'auto'
default:"4"
Number of samples processed per iteration per GPU. Higher values require more GPU memory. Pass "auto" to let RF-DETR probe for the largest safe batch size.
grad_accum_steps
integer
default:"4"
Accumulate gradients over this many mini-batches before an optimizer step. Use with batch_size to achieve a larger effective batch size without increasing memory.
gradient_checkpointing is a model constructor parameter, not a training parameter. Pass it when instantiating the model: RFDETRMedium(gradient_checkpointing=True). See Model Variants for all constructor options.
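Putting these together, a memory-constrained setup might enable gradient checkpointing at construction time and lean on accumulation for the effective batch size (the values here are illustrative, not recommendations):

```python
from rfdetr import RFDETRMedium

# gradient_checkpointing trades extra compute for lower memory and is a
# constructor option, not a train() option.
model = RFDETRMedium(gradient_checkpointing=True)

model.train(
    dataset_dir="path/to/dataset",
    batch_size=2,        # small per-step batch to fit in GPU memory
    grad_accum_steps=8,  # effective batch size of 16 on a single GPU
)
```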

Understanding batch size

The effective batch size is:
effective_batch_size = batch_size × grad_accum_steps × num_gpus
Recommended configurations targeting an effective batch size of 16:
| GPU | VRAM | batch_size | grad_accum_steps |
| --- | --- | --- | --- |
| A100 | 40–80 GB | 16 | 1 |
| RTX 4090 | 24 GB | 8 | 2 |
| RTX 3090 | 24 GB | 8 | 2 |
| T4 | 16 GB | 4 | 4 |
| RTX 3070 | 8 GB | 2 | 8 |
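The relationship can be sketched as a small helper (the function name is my own); every row in the table above yields the same effective batch size of 16 on a single GPU:

```python
def effective_batch_size(batch_size: int, grad_accum_steps: int, num_gpus: int = 1) -> int:
    """Effective batch size seen by the optimizer per step."""
    return batch_size * grad_accum_steps * num_gpus

# Each recommended configuration targets the same effective batch size.
for bs, accum in [(16, 1), (8, 2), (4, 4), (2, 8)]:
    assert effective_batch_size(bs, accum) == 16
```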

Learning rate

lr
float
default:"1e-4"
Learning rate for most parts of the model (excluding the backbone encoder).
lr_encoder
float
default:"1.5e-4"
Learning rate specifically for the backbone encoder. Set lower than lr to fine-tune the encoder more conservatively.
lr_scheduler
string
default:"step"
Learning rate scheduler type. Options: "step" (step decay at lr_drop) or "cosine" (cosine annealing).
lr_min_factor
float
default:"0.0"
Floor for the cosine scheduler, expressed as a fraction of the initial LR. Ignored when using "step".
warmup_epochs
float
default:"0.0"
Number of epochs for linear learning rate warmup at the start of training.
Start with the default values for fine-tuning. If the model doesn’t converge, try reducing lr by half. For small datasets (<1000 images), consider a lower lr (e.g., 5e-5) to prevent overfitting.
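To build intuition for how these options interact, here is a rough sketch of cosine annealing with linear warmup. This is an approximation for illustration, not RF-DETR's exact scheduler implementation:

```python
import math

def lr_at_epoch(epoch, total_epochs, lr=1e-4, warmup_epochs=0.0, lr_min_factor=0.0):
    """Approximate cosine-annealed learning rate with optional linear warmup."""
    if warmup_epochs > 0 and epoch < warmup_epochs:
        # Linear ramp from lr / warmup_epochs up to lr.
        return lr * (epoch + 1) / warmup_epochs
    # Progress through the cosine phase, from 0 at warmup end to 1 at the end.
    t = (epoch - warmup_epochs) / max(total_epochs - warmup_epochs, 1)
    lr_min = lr * lr_min_factor
    return lr_min + 0.5 * (lr - lr_min) * (1 + math.cos(math.pi * t))
```

With lr_min_factor=0.0 (the default), the rate decays all the way to zero by the final epoch; a nonzero factor keeps a floor under it.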

Resolution

resolution
integer
Input image resolution. Higher values can improve accuracy but require more memory. Must be divisible by 14. Defaults to the model-specific value.
Common resolution values:
| Resolution | Memory usage | Use case |
| --- | --- | --- |
| 560 | Low | Small objects, limited GPU memory |
| 672 | Medium | Balanced (default for many models) |
| 784 | High | High accuracy requirements |
| 896 | Very high | Maximum quality (requires large GPU) |
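Because the resolution must be divisible by 14, a quick check before launching a run avoids a wasted start. The helper below is a generic validation sketch, not part of the RF-DETR API:

```python
def validate_resolution(resolution: int) -> int:
    """Raise if the resolution is not a multiple of 14."""
    if resolution % 14 != 0:
        nearest = round(resolution / 14) * 14
        raise ValueError(
            f"resolution must be divisible by 14; got {resolution} "
            f"(nearest valid value: {nearest})"
        )
    return resolution

# All values in the table above pass the check.
for r in (560, 672, 784, 896):
    validate_resolution(r)
```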

Regularization

weight_decay
float
default:"1e-4"
L2 regularization coefficient. Helps prevent overfitting by penalizing large weights.
drop_path
float
default:"0.0"
Stochastic depth drop-path rate applied to the backbone. Higher values add more regularization.

EMA (exponential moving average)

use_ema
boolean
default:"true"
Enables Exponential Moving Average of weights. Produces a smoothed checkpoint that often improves final performance and generalization.
EMA maintains a moving average of the model weights throughout training. This smoothed version often generalizes better than the raw weights and is the default for checkpoint_best_total.pth.
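Conceptually, EMA keeps a shadow copy of every weight that is nudged toward the live weights after each update. A minimal sketch of that update rule (the decay value here is assumed for illustration):

```python
def ema_update(ema_weights, model_weights, decay=0.999):
    """One EMA step: shadow = decay * shadow + (1 - decay) * current."""
    return [decay * e + (1 - decay) * w for e, w in zip(ema_weights, model_weights)]

# The shadow weights drift slowly toward the live weights, smoothing
# out step-to-step noise in training.
shadow = [0.0]
for _ in range(1000):
    shadow = ema_update(shadow, [1.0])
```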

Checkpoints

checkpoint_interval
integer
default:"10"
Frequency (in epochs) at which periodic model checkpoints are saved. More frequent saves provide better coverage but consume more storage.
Checkpoint files saved during training:
| File | Description |
| --- | --- |
| checkpoint.pth | Most recent checkpoint (for resuming) |
| checkpoint_<N>.pth | Periodic checkpoint at epoch N |
| checkpoint_best_ema.pth | Best validation performance (EMA weights) |
| checkpoint_best_regular.pth | Best validation performance (raw weights) |
| checkpoint_best_total.pth | Final best model for inference |
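After training, the best checkpoint is typically loaded back through the model constructor. This sketch assumes the constructor accepts a pretrain_weights checkpoint path; verify the argument name against your installed version:

```python
from rfdetr import RFDETRMedium

# Load the final best model for inference (pretrain_weights is an
# assumed constructor argument; check your rfdetr version).
model = RFDETRMedium(pretrain_weights="output/checkpoint_best_total.pth")
```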

Early stopping

early_stopping
boolean
default:"false"
Enable early stopping based on validation mAP.
early_stopping_patience
integer
default:"10"
Number of epochs without improvement before stopping training.
early_stopping_min_delta
float
default:"0.001"
Minimum change in mAP to qualify as an improvement.
early_stopping_use_ema
boolean
default:"false"
Whether to track improvements using EMA model metrics.
Example:
model.train(
    dataset_dir="path/to/dataset",
    epochs=200,
    batch_size=4,
    early_stopping=True,
    early_stopping_patience=15,
    early_stopping_min_delta=0.005,
)
This configuration trains for up to 200 epochs and stops early if mAP doesn’t improve by at least 0.005 for 15 consecutive epochs.

Logging and evaluation

tensorboard
boolean
default:"true"
Enable TensorBoard logging. Requires pip install "rfdetr[loggers]". If the package is not installed, training continues after emitting a UserWarning, and TensorBoard output is skipped.
wandb
boolean
default:"false"
Enable Weights & Biases logging. Requires pip install "rfdetr[loggers]".
mlflow
boolean
default:"false"
Enable MLflow logging. Requires pip install "rfdetr[loggers]".
project
string
default:"None"
Project name for W&B or MLflow logging.
run
string
default:"None"
Run name for W&B or MLflow logging. If not specified, an auto-generated name is used.
eval_max_dets
integer
default:"500"
Maximum number of detections per image considered during COCO evaluation. Lower values speed up evaluation.
eval_interval
integer
default:"1"
Run COCO evaluation every N epochs. Set to a higher value to reduce evaluation overhead during long training runs.
log_per_class_metrics
boolean
default:"true"
Log per-class AP metrics to the console and loggers. Disable to reduce log verbosity when there are many classes.
progress_bar
string | null
default:"null"
Enable a progress bar during training. Accepts "tqdm", "rich", or null to disable. Also accepts legacy boolean values (true maps to "tqdm").
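A run with experiment tracking enabled might look like this (the project and run names are placeholders):

```python
model.train(
    dataset_dir="path/to/dataset",
    tensorboard=True,
    wandb=True,                  # requires: pip install "rfdetr[loggers]"
    project="my-detection-project",
    run="baseline-672px",
    eval_interval=5,             # evaluate every 5 epochs to cut overhead
    progress_bar="tqdm",
)
```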

Data loading

num_workers
integer
default:"2"
Number of DataLoader worker processes for parallel data loading.
pin_memory
boolean
default:"null"
Pin host memory in the DataLoader for faster GPU transfers. None defers to PyTorch Lightning’s default.
persistent_workers
boolean
default:"null"
Keep DataLoader worker processes alive between epochs. None defers to PyTorch Lightning’s default.
prefetch_factor
integer
default:"null"
Number of batches to prefetch per DataLoader worker. None uses PyTorch’s built-in default.
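On a machine with fast storage and several spare CPU cores, the DataLoader knobs are often tuned together. The values below are illustrative starting points, not recommendations:

```python
model.train(
    dataset_dir="path/to/dataset",
    num_workers=8,            # roughly one worker per available CPU core
    pin_memory=True,          # speeds up host-to-GPU transfers
    persistent_workers=True,  # avoid restarting workers every epoch
    prefetch_factor=2,        # batches queued ahead per worker
)
```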

Hardware and runtime

accelerator
string
default:"auto"
PyTorch Lightning accelerator selection. "auto" picks GPU if available, then MPS, then CPU.
fp16_eval
boolean
default:"false"
Run evaluation passes in FP16 precision. Reduces memory usage but may lower numerical precision.
compute_val_loss
boolean
default:"true"
Compute and log the detection loss on the validation set each epoch.
compute_test_loss
boolean
default:"true"
Compute and log the detection loss during the final test run.

Auto-batch configuration

These parameters control the automatic batch size detection when batch_size="auto":
| Parameter | Default | Description |
| --- | --- | --- |
| auto_batch_target_effective | 16 | Per-device effective batch size target before scaling by devices × num_nodes. |
| auto_batch_max_targets_per_image | 100 | Worst-case number of annotations per image used when probing for a safe batch size. |
| auto_batch_ema_headroom | 0.7 | Scale the safe batch size by this factor when use_ema=True, since EMA uses extra memory. Must be in (0, 1]. |
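The EMA headroom factor simply scales down whatever batch size the probe finds. A sketch of that final step (the probing itself is more involved and not shown):

```python
def apply_ema_headroom(probed_batch_size: int, use_ema: bool, headroom: float = 0.7) -> int:
    """Shrink the probed safe batch size when EMA is enabled, since the
    EMA copy of the weights consumes extra GPU memory."""
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    if use_ema:
        return max(1, int(probed_batch_size * headroom))
    return probed_batch_size
```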

Deprecated fields

The following fields exist on TrainConfig but are deprecated and will be removed in v1.9. Set them on ModelConfig instead.
  • group_detr — query group count is an architecture decision; set on ModelConfig
  • ia_bce_loss — loss type is tied to the architecture family; set on ModelConfig
  • segmentation_head — architecture flag; set on ModelConfig
  • num_select — postprocessor count is an architecture decision; set on ModelConfig

Complete reference table

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| dataset_dir | str | Required | Path to COCO or YOLO formatted dataset. |
| output_dir | str | "output" | Directory for checkpoints, logs, and artifacts. |
| epochs | int | 100 | Number of full passes over the dataset. |
| batch_size | int \| "auto" | 4 | Samples per iteration. Balance with grad_accum_steps. |
| grad_accum_steps | int | 4 | Gradient accumulation steps for effective larger batch sizes. |
| lr | float | 1e-4 | Learning rate for the model (excluding encoder). |
| lr_encoder | float | 1.5e-4 | Learning rate for the backbone encoder. |
| resolution | int | Model-specific | Input image size (must be divisible by 14). |
| weight_decay | float | 1e-4 | L2 regularization coefficient. |
| use_ema | bool | True | Enable Exponential Moving Average of weights. |
| gradient_checkpointing | bool | False | Model constructor parameter; pass to RFDETRMedium(gradient_checkpointing=True). |
| checkpoint_interval | int | 10 | Save checkpoint every N epochs. |
| resume | str | None | Path to checkpoint for resuming training. |
| tensorboard | bool | True | Enable TensorBoard logging. |
| wandb | bool | False | Enable Weights & Biases logging. |
| mlflow | bool | False | Enable MLflow logging. |
| project | str | None | W&B or MLflow project name. |
| run | str | None | W&B or MLflow run name. |
| early_stopping | bool | False | Enable early stopping. |
| early_stopping_patience | int | 10 | Epochs without improvement before stopping. |
| early_stopping_min_delta | float | 0.001 | Minimum mAP change to qualify as improvement. |
| early_stopping_use_ema | bool | False | Use EMA model for early stopping metrics. |
| eval_max_dets | int | 500 | Maximum detections per image for COCO evaluation. |
| eval_interval | int | 1 | Run COCO evaluation every N epochs. |
| log_per_class_metrics | bool | True | Log per-class AP metrics. |
| progress_bar | str \| None | None | Progress bar style: "tqdm", "rich", or None. |
| num_workers | int | 2 | DataLoader worker processes. |
| accelerator | str | "auto" | PyTorch Lightning accelerator. |
| seed | int | None | Random seed for reproducibility. |
| lr_scheduler | str | "step" | LR scheduler type: "step" or "cosine". |
| lr_min_factor | float | 0.0 | Cosine scheduler LR floor as a fraction of initial LR. |
| warmup_epochs | float | 0.0 | Linear warmup epochs at start of training. |
| drop_path | float | 0.0 | Stochastic depth drop-path rate for the backbone. |
| compute_val_loss | bool | True | Compute and log loss during validation. |
| compute_test_loss | bool | True | Compute and log loss during the test run. |
| fp16_eval | bool | False | Run evaluation in FP16 precision. |
| pin_memory | bool | None | Pin DataLoader memory. |
| persistent_workers | bool | None | Keep DataLoader workers alive between epochs. |
| prefetch_factor | int | None | Batches prefetched per worker. |
| aug_config | dict | None | Custom augmentation config. See Custom Augmentations. |