Overview
DVC experiments let you iterate on your ML models by running your pipeline with different parameters, code changes, or data. Each experiment is tracked automatically, allowing you to compare results and reproduce your best models.
Experiments are Git-based but don’t clutter your repository. They’re stored as lightweight references that you can review, compare, and promote to branches.
Quick Start
Set up your pipeline
First, ensure you have a pipeline with parameters. In dvc.yaml:

stages:
  train:
    cmd: python train.py
    deps:
      - train.py
      - data/train.csv
    params:
      - train.lr
      - train.epochs
    outs:
      - models/model.pkl
    metrics:
      - metrics.json:
          cache: false

And in params.yaml:

train:
  lr: 0.001
  epochs: 10
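The stage above expects train.py to read its hyperparameters and write metrics.json. A minimal sketch of such a script (the model-fitting logic is elided, the metric values are placeholders, and the hand-rolled params reader — which a real script would replace with PyYAML's yaml.safe_load — assumes simple `key: value` lines):

```python
import json

def load_params(path="params.yaml"):
    # Naive reader for "key: value" lines; use yaml.safe_load in practice.
    params = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if ":" in line and not line.endswith(":"):
                key, value = line.split(":", 1)
                params[key.strip()] = float(value)
    return params

def main():
    params = load_params()
    lr, epochs = params["lr"], params["epochs"]
    # ... fit the model here using lr and epochs ...
    metrics = {"accuracy": 0.0, "loss": 0.0}  # placeholder values
    with open("metrics.json", "w") as f:
        json.dump(metrics, f)

# In train.py itself, call main() under an `if __name__ == "__main__":` guard
# so that `dvc exp run` can execute it via `python train.py`.
```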
Run your first experiment
Run an experiment with different parameter values:

dvc exp run -n "high-lr" -S train.lr=0.01
This command:
Runs your pipeline with lr=0.01
Names the experiment high-lr
Tracks all results automatically
View experiment results
See all experiments and their metrics:

dvc exp show

You'll see a table comparing all experiments, their parameters, and metrics.
Running Experiments
With Parameter Changes
Modify parameters on the fly using -S or --set-param. Dotted keys such as train.lr address nested values in params.yaml:

# Single parameter
dvc exp run -S train.lr=0.01

# Multiple parameters
dvc exp run -S train.lr=0.01 -S train.epochs=50

# Named experiment
dvc exp run -n "high-lr" -S train.lr=0.01

Use -n or --name to give experiments meaningful names. Otherwise, DVC auto-generates names like exp-a1b2c.
With Code Changes
Make code changes and run experiments without committing:
# Edit your training script
vim train.py
# Run experiment with modified code
dvc exp run -n "new-architecture"
DVC tracks uncommitted code changes in experiments. You can experiment freely without affecting your main branch.
Queue and Run Multiple Experiments
Queue experiments for batch processing:
# Queue experiments
dvc exp run --queue -S train.lr=0.001
dvc exp run --queue -S train.lr=0.01
dvc exp run --queue -S train.lr=0.1
# Run all queued experiments
dvc exp run --run-all
Use -j or --jobs to run experiments in parallel:

dvc exp run --run-all -j 4
Run in Temporary Directory
Run experiments without affecting your workspace:
dvc exp run --temp -S train.lr=0.01
This creates a temporary directory, runs the experiment, and cleans up automatically.
Viewing Experiments
Show All Experiments
Display a table of all experiments:

dvc exp show

Example output:
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Experiment ┃ Created ┃ train.lr ┃ accuracy ┃ loss ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━┩
│ workspace │ - │ 0.001 │ 0.92 │ 0.234 │
│ ├── exp-high-lr │ 12:34PM │ 0.01 │ 0.89 │ 0.312 │
│ ├── exp-low-lr │ 11:20AM │ 0.0001 │ 0.94 │ 0.187 │
│ main │ - │ 0.001 │ 0.91 │ 0.245 │
└────────────────────┴─────────┴────────────┴──────────┴───────────┘
Filter and Sort

# Sort by metric
dvc exp show --sort-by accuracy --sort-order desc

# Show only changed values
dvc exp show --only-changed

# Keep only specific columns
dvc exp show --keep 'train.*'

# Drop specific columns
dvc exp show --drop 'params.dropout'

# All branches
dvc exp show --all-branches
Export to CSV or JSON

# CSV format
dvc exp show --csv > experiments.csv

# JSON format
dvc exp show --json > experiments.json

# Markdown table
dvc exp show --md > experiments.md
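The exported CSV can then be post-processed with ordinary tools. A sketch in Python — the inlined sample mirrors the table above, and the column names are illustrative, since the real header depends on your params and metrics:

```python
import csv
import io

# Sample export; an actual file would come from `dvc exp show --csv`.
sample = """Experiment,train.lr,accuracy,loss
exp-high-lr,0.01,0.89,0.312
exp-low-lr,0.0001,0.94,0.187
baseline,0.001,0.92,0.234
"""

rows = list(csv.DictReader(io.StringIO(sample)))
best = max(rows, key=lambda row: float(row["accuracy"]))
print(best["Experiment"])  # exp-low-lr, the run with the highest accuracy
```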
Comparing Experiments
Compare Two Experiments
See the differences between two experiments:
dvc exp diff exp-baseline exp-high-lr
Example output:
Path Metric Value Change
metrics.json accuracy 0.89 -0.03
metrics.json loss 0.312 +0.078
Path Param Value Change
params.yaml train.lr 0.01 +0.009
Compare with Workspace
Compare an experiment to your current workspace:
dvc exp diff exp-baseline
Include All Metrics and Params
dvc exp diff --all exp-baseline exp-high-lr
Managing Experiments
Apply an Experiment
Restore an experiment to your workspace:

dvc exp apply exp-low-lr

This replaces your workspace with the experiment's code, parameters, and data. Commit or stash changes first.
Create a Branch from an Experiment
Promote a successful experiment to a Git branch:
dvc exp branch exp-low-lr best-model
Now you can:
git checkout best-model
git merge main
Remove Experiments
# Remove specific experiment
dvc exp remove exp-failed

# Remove all experiments
dvc exp remove -A

# Remove experiments in queue
dvc exp remove --queue
Push and Pull Experiments
Share experiments with your team:
# Push specific experiment
dvc exp push origin exp-high-lr

# Push all experiments
dvc exp push origin --all

# Pull specific experiment
dvc exp pull origin exp-high-lr

# Pull all experiments
dvc exp pull origin --all

# List experiments on the remote
dvc exp list origin
Advanced Workflows
Grid Search
Run experiments with multiple parameter combinations:
# Queue a grid of experiments
for lr in 0.001 0.01 0.1; do
  for epochs in 10 20 50; do
    dvc exp run --queue \
      -n "lr${lr}-e${epochs}" \
      -S train.lr=$lr \
      -S train.epochs=$epochs
  done
done
# Run all queued experiments in parallel
dvc exp run --run-all -j 4
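The same grid can also be built programmatically. This sketch only constructs and prints the queue commands; in practice each one would be handed to subprocess.run() or a shell:

```python
from itertools import product

lrs = [0.001, 0.01, 0.1]
epoch_options = [10, 20, 50]

# One `dvc exp run --queue` command per (lr, epochs) combination.
commands = [
    f'dvc exp run --queue -n "lr{lr}-e{epochs}" '
    f"-S train.lr={lr} -S train.epochs={epochs}"
    for lr, epochs in product(lrs, epoch_options)
]
for cmd in commands:
    print(cmd)
```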
Hyperparameter Tuning
Integrate with your tuning framework:
import optuna

def objective(trial):
    lr = trial.suggest_float('lr', 1e-5, 1e-1, log=True)
    epochs = trial.suggest_int('epochs', 10, 100)
    # Train the model with the sampled hyperparameters
    accuracy = train_model(lr=lr, epochs=epochs)
    return accuracy

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)
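To record each trial as a tracked DVC experiment, the objective can instead shell out to dvc exp run and read the metrics file the pipeline writes. A sketch, assuming dvc is on PATH inside an initialized repo and metrics.json is the metrics output declared in dvc.yaml:

```python
import json
import subprocess

def exp_run_command(lr, epochs):
    # Build the dvc invocation for one trial's hyperparameters.
    return [
        "dvc", "exp", "run",
        "-S", f"train.lr={lr}",
        "-S", f"train.epochs={epochs}",
    ]

def run_trial(lr, epochs):
    # Run the pipeline as a DVC experiment, then read the tracked metric.
    subprocess.run(exp_run_command(lr, epochs), check=True)
    with open("metrics.json") as f:
        return json.load(f)["accuracy"]
```

Inside objective(trial), return run_trial(lr, epochs) so that every Optuna trial also appears in dvc exp show.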
Custom Commit Messages
Add context to experiments:
dvc exp run -m "Testing new data augmentation" -S train.augment=true
Experiment Table Customization
Show Only Stage Dependencies
dvc exp show --param-deps
This shows only parameters that are declared as stage dependencies in dvc.yaml.
Precision Control
dvc exp show --precision 4
This rounds displayed metric values to 4 decimal places.
Hide Columns
# Hide workspace, queued, or failed experiments
dvc exp show --hide-workspace --hide-queued --hide-failed
Best Practices
Name your experiments: Use descriptive names like -n "baseline-model" instead of auto-generated IDs.
Track all parameters: Declare all hyperparameters in params.yaml for complete experiment tracking.
Use queues for batches: Queue multiple experiments and run them in parallel with --run-all -j N.
Branch successful experiments: Promote winning experiments to branches with dvc exp branch exp-name feature-branch.
Compare systematically: Use dvc exp diff to understand what changed between experiments.
Clean up regularly: Remove failed experiments to keep your experiment list manageable.
Complete Example
Here’s a full workflow:
Baseline experiment
dvc exp run -n "baseline"
Try different learning rates
dvc exp run --queue -n "lr-0.001" -S train.lr=0.001
dvc exp run --queue -n "lr-0.01" -S train.lr=0.01
dvc exp run --queue -n "lr-0.1" -S train.lr=0.1
dvc exp run --run-all -j 3
Compare results
dvc exp show --sort-by accuracy --sort-order desc
Test best configuration
dvc exp apply lr-0.01
dvc exp run -n "final-model" -S train.epochs=100
Promote to production
dvc exp branch final-model production
git checkout production
git push origin production
Next Steps
Remote Storage Store experiment results and models in remote storage
Collaboration Share experiments and pipelines with your team