
Overview

DVC experiments let you iterate on your ML models by running your pipeline with different parameters, code changes, or data. Each experiment is tracked automatically, allowing you to compare results and reproduce your best models.
Experiments are Git-based but don’t clutter your repository. They’re stored as lightweight references that you can review, compare, and promote to branches.

Quick Start

1. Set up your pipeline

First, ensure you have a pipeline with parameters:
dvc.yaml
stages:
  train:
    cmd: python train.py
    deps:
      - train.py
      - data/train.csv
    params:
      - train.lr
      - train.epochs
    outs:
      - models/model.pkl
    metrics:
      - metrics.json:
          cache: false
params.yaml
train:
  lr: 0.001
  epochs: 10
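The pipeline above expects a train.py that reads params.yaml and writes metrics.json. Here is a minimal, dependency-free sketch; the naive flat-YAML reader and the simulated training function are stand-ins (real code would use PyYAML or dvc.api.params_show(), and your actual model):

```python
# train.py -- hypothetical sketch of a script that fits the pipeline above
import json
import random


def load_params(path="params.yaml"):
    """Naive reader for a flat two-level params.yaml, kept dependency-free
    for this sketch. Real code would use PyYAML or dvc.api.params_show()."""
    params = {}
    section = None
    with open(path) as f:
        for line in f:
            if not line.strip() or line.strip().startswith("#"):
                continue
            if not line.startswith(" "):          # e.g. "train:"
                section = line.strip().rstrip(":")
            else:                                 # e.g. "  lr: 0.001"
                key, value = line.strip().split(":", 1)
                params[f"{section}.{key}"] = float(value)
    return params


def train_model(lr, epochs):
    """Stand-in for real training; returns simulated metrics."""
    random.seed(0)
    accuracy = 1.0 - lr - 1.0 / (epochs + random.random())
    return {"accuracy": round(accuracy, 3), "loss": round(1 - accuracy, 3)}


def main():
    params = load_params()
    metrics = train_model(params["train.lr"], int(params["train.epochs"]))
    with open("metrics.json", "w") as f:  # matches the metrics: entry in dvc.yaml
        json.dump(metrics, f)


if __name__ == "__main__":
    main()
```

Because metrics.json is declared with cache: false, DVC tracks it in Git rather than the cache, so every experiment's metrics stay diffable.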
2. Run your first experiment

Run an experiment with different parameter values:
dvc exp run -n "high-lr" -S train.lr=0.01
This command:
  • Runs your pipeline with lr=0.01
  • Names the experiment high-lr
  • Tracks all results automatically
3. View experiment results

See all experiments and their metrics:
dvc exp show
You’ll see a table comparing all experiments, their parameters, and metrics.

Running Experiments

With Parameter Changes

Modify parameters on the fly using -S or --set-param:
dvc exp run -S train.lr=0.01
Use -n or --name to give experiments meaningful names. Otherwise, DVC auto-generates names like exp-a1b2c.

With Code Changes

Make code changes and run experiments without committing:
# Edit your training script
vim train.py

# Run experiment with modified code
dvc exp run -n "new-architecture"
DVC tracks uncommitted code changes in experiments. You can experiment freely without affecting your main branch.

Queue and Run Multiple Experiments

Queue experiments for batch processing:
# Queue experiments
dvc exp run --queue -S train.lr=0.001
dvc exp run --queue -S train.lr=0.01
dvc exp run --queue -S train.lr=0.1

# Run all queued experiments
dvc exp run --run-all
Use -j or --jobs to run experiments in parallel: dvc exp run --run-all -j 4

Run in Temporary Directory

Run experiments without affecting your workspace:
dvc exp run --temp -S train.lr=0.01
This creates a temporary directory, runs the experiment, and cleans up automatically.

Viewing Experiments

Show All Experiments

Display a table of all experiments:
dvc exp show
Example output:
┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┓
┃ Experiment      ┃ Created ┃ train.lr ┃ accuracy ┃ loss  ┃
┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━┩
│ workspace       │ -       │ 0.001    │ 0.92     │ 0.234 │
│ main            │ -       │ 0.001    │ 0.91     │ 0.245 │
│ ├── exp-high-lr │ 12:34PM │ 0.01     │ 0.89     │ 0.312 │
│ └── exp-low-lr  │ 11:20AM │ 0.0001   │ 0.94     │ 0.187 │
└─────────────────┴─────────┴──────────┴──────────┴───────┘

Filter and Sort

dvc exp show --sort-by accuracy --sort-order desc

Export to CSV or JSON

dvc exp show --csv > experiments.csv
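Once exported, the CSV can be post-processed with plain Python. A small sketch; the column names (Experiment, accuracy) are assumptions, so check the header of your own export:

```python
# Rank experiments from an exported CSV by a chosen metric.
import csv


def top_experiments(path, metric="accuracy", n=3):
    with open(path, newline="") as f:
        # Skip rows with a missing metric value (e.g. queued experiments)
        rows = [r for r in csv.DictReader(f) if r.get(metric)]
    rows.sort(key=lambda r: float(r[metric]), reverse=True)
    return [(r["Experiment"], float(r[metric])) for r in rows[:n]]
```

This is handy for feeding experiment results into notebooks or reports without re-running dvc exp show.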

Comparing Experiments

Compare Two Experiments

See the differences between two experiments:
dvc exp diff exp-baseline exp-high-lr
Example output:
Path         Metric      Value     Change
metrics.json accuracy    0.89      -0.03
metrics.json loss        0.312     +0.078

Path         Param       Value     Change
params.yaml  train.lr    0.01      +0.009
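The Change column is simply the new value minus the old one for each metric. A quick sketch reproducing the numbers above (the metric dicts stand in for each experiment's metrics.json):

```python
# Per-metric delta, as dvc exp diff reports it: new value minus old value.
def metric_diff(old, new):
    return {k: round(new[k] - old[k], 3) for k in old if k in new}


baseline = {"accuracy": 0.92, "loss": 0.234}  # exp-baseline's metrics.json
high_lr = {"accuracy": 0.89, "loss": 0.312}   # exp-high-lr's metrics.json
print(metric_diff(baseline, high_lr))  # {'accuracy': -0.03, 'loss': 0.078}
```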

Compare with Workspace

Compare an experiment to your current workspace:
dvc exp diff exp-baseline

Include All Metrics and Params

dvc exp diff --all exp-baseline exp-high-lr

Managing Experiments

Apply an Experiment

Restore an experiment to your workspace:
dvc exp apply exp-low-lr
This replaces your workspace with the experiment’s code, parameters, and data. Commit or stash changes first.

Create a Branch from an Experiment

Promote a successful experiment to a Git branch:
dvc exp branch exp-low-lr best-model
Now you can:
git checkout best-model
git merge main

Remove Experiments

dvc exp remove exp-failed

Push and Pull Experiments

Share experiments with your team:
# Push a specific experiment
dvc exp push origin exp-high-lr

# Push all experiments from the current commit
dvc exp push origin

# Pull an experiment shared by a teammate
dvc exp pull origin exp-high-lr

Advanced Workflows

Run experiments with multiple parameter combinations:
# Queue a grid of experiments
for lr in 0.001 0.01 0.1; do
  for epochs in 10 20 50; do
    dvc exp run --queue \
      -n "lr${lr}-e${epochs}" \
      -S train.lr=$lr \
      -S train.epochs=$epochs
  done
done

# Run all queued experiments in parallel
dvc exp run --run-all -j 4
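The shell loop above can also be generated programmatically, which scales better as the grid grows. A sketch; the parameter names mirror the examples in this guide:

```python
# Build `dvc exp run --queue` commands for every parameter combination.
import itertools


def grid_commands(grid):
    """grid: dict mapping a param name (as in params.yaml) to candidate values."""
    names = list(grid)
    for values in itertools.product(*grid.values()):
        # Derive a readable experiment name, e.g. "lr0.001-epochs10"
        name = "-".join(f"{n.rsplit('.', 1)[-1]}{v}" for n, v in zip(names, values))
        sets = " ".join(f"-S {n}={v}" for n, v in zip(names, values))
        yield f'dvc exp run --queue -n "{name}" {sets}'


for cmd in grid_commands({"train.lr": [0.001, 0.01, 0.1],
                          "train.epochs": [10, 20, 50]}):
    print(cmd)
```

Pipe the printed commands to a shell (or run them via subprocess) and finish with dvc exp run --run-all -j 4 as before.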

Hyperparameter Tuning

Integrate with your tuning framework:
train.py
import optuna

def objective(trial):
    lr = trial.suggest_float('lr', 1e-5, 1e-1, log=True)
    epochs = trial.suggest_int('epochs', 10, 100)

    # train_model is your project's training function, defined elsewhere
    accuracy = train_model(lr=lr, epochs=epochs)

    return accuracy

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)

Custom Commit Messages

Add context to experiments:
dvc exp run -m "Testing new data augmentation" -S train.augment=true

Experiment Table Customization

Show Only Stage Dependencies

dvc exp show --param-deps
This shows only parameters that are declared as stage dependencies in dvc.yaml.

Precision Control

dvc exp show --precision 4
Rounds displayed metrics to 4 decimal places.

Hide Columns

# Hide workspace, queued, or failed experiments
dvc exp show --hide-workspace --hide-queued --hide-failed

Best Practices

Name your experiments

Use descriptive names like -n "baseline-model" instead of auto-generated IDs

Track all parameters

Declare all hyperparameters in params.yaml for complete experiment tracking

Use queues for batches

Queue multiple experiments and run them in parallel with --run-all -j N

Branch successful experiments

Promote winning experiments to branches: dvc exp branch exp-name feature-branch

Compare systematically

Use dvc exp diff to understand what changed between experiments

Clean up regularly

Remove failed experiments to keep your experiment list manageable

Complete Example

Here’s a full workflow:
1. Baseline experiment

dvc exp run -n "baseline"
2. Try different learning rates

dvc exp run --queue -n "lr-0.001" -S train.lr=0.001
dvc exp run --queue -n "lr-0.01" -S train.lr=0.01
dvc exp run --queue -n "lr-0.1" -S train.lr=0.1
dvc exp run --run-all -j 3
3. Compare results

dvc exp show --sort-by accuracy --sort-order desc
4. Test best configuration

dvc exp apply lr-0.01
dvc exp run -n "final-model" -S train.epochs=100
5. Promote to production

dvc exp branch final-model production
git checkout production
git push origin production

Next Steps

Remote Storage

Store experiment results and models in remote storage

Collaboration

Share experiments and pipelines with your team
