Synopsis

dvc diff [options] [<a_rev>] [<b_rev>] [--targets <paths>...]

Description

The dvc diff command shows differences in DVC-tracked data between:
  • Two Git revisions (commits, branches, or tags)
  • A Git revision and the current workspace
This helps you understand what data has changed across different versions of your project, similar to how git diff shows code changes. It’s particularly useful for:
  • Comparing model outputs between experiments
  • Tracking dataset evolution over time
  • Understanding what changed in a specific commit
  • Generating reports on data modifications
The output shows which files were added, deleted, modified, or renamed, optionally with their hash values.
Note: dvc diff reports only which files changed and, optionally, their hashes; it does not show content-level differences. To compare metric or parameter values, use dvc metrics diff or dvc params diff.

Options

a_rev
string
default:"HEAD"
Old Git commit to compare from. Can be a commit hash, branch name, or tag.
dvc diff main
dvc diff abc123
dvc diff v1.0
b_rev
string
New Git commit to compare to. Defaults to the current workspace if not specified.
dvc diff main experiment
dvc diff v1.0 v2.0
--targets
path[]
Specific DVC-tracked files to compare. Accepts one or more file paths.
dvc diff --targets data/train.csv models/model.pkl
--json
boolean
default:"false"
Format the output as JSON. Useful for programmatic parsing.
dvc diff --json HEAD~1
--show-hash
boolean
default:"false"
Display hash values for each entry. Shows first 8 characters of the hash.
dvc diff --show-hash
--md
boolean
default:"false"
Show tabulated output in Markdown format (GitHub Flavored Markdown).
dvc diff --md HEAD~1 > changes.md
Great for including in pull request descriptions or documentation.
--hide-missing
boolean
default:"false"
Hide files that are not in cache. By default, missing files are shown with “not in cache” status.

Examples

Compare workspace with HEAD

See what data changed since the last commit:
dvc diff
Modified:
    data/train.csv

Added:
    models/new_model.pkl

files summary: 1 modified, 1 added

Compare two commits

Compare data between two specific commits:
dvc diff main experiment
Modified:
    models/model.pkl

Deleted:
    data/old_dataset.csv

files summary: 1 modified, 1 deleted

Compare with previous commit

dvc diff HEAD~1
Modified:
    data/processed.csv

files summary: 1 modified

Show hashes

Display hash values to track exact versions:
dvc diff --show-hash HEAD~2
Added:
    d3b07384  models/model_v2.pkl

Modified:
    c157a790..f98bf6f1  data/features.csv

files summary: 1 added, 1 modified

Markdown output

Generate a Markdown table (great for PRs):
dvc diff --md main experiment
| Status   | Path                    |
|----------|-------------------------|
| added    | models/new_model.pkl    |
| modified | data/features.csv       |
| deleted  | data/old_features.csv   |
With hashes:
dvc diff --md --show-hash
| Status   | Hash               | Path             |
|----------|--------------------|------------------|
| added    | d3b07384           | models/model.pkl |
| modified | c157a790..f98bf6f1 | data/train.csv   |

JSON output

Get structured output for scripting:
dvc diff --json --show-hash HEAD~1
{
  "added": [
    {
      "path": "models/new_model.pkl",
      "hash": "f98bf6f1d9e4c5e2a8b7c9d6e5f4a3b2"
    }
  ],
  "modified": [
    {
      "path": "data/train.csv",
      "hash": {
        "old": "c157a79025e60bcf87d9e4f3c26b8a2f",
        "new": "f98bf6f1d9e4c5e2a8b7c9d6e5f4a3b2"
      }
    }
  ],
  "deleted": [],
  "renamed": []
}
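
The JSON structure above can be consumed from a script. The following is a minimal Python sketch, not an official DVC API: the summarize helper is hypothetical, the sample text is copied from the example above, and the handling of renamed entries assumes that their path is an object with old and new keys.

```python
import json

# Sample copied from the JSON example above. In practice this text would come
# from running `dvc diff --json` (for example via subprocess).
SAMPLE = """
{
  "added": [
    {"path": "models/new_model.pkl", "hash": "f98bf6f1d9e4c5e2a8b7c9d6e5f4a3b2"}
  ],
  "modified": [
    {"path": "data/train.csv",
     "hash": {"old": "c157a79025e60bcf87d9e4f3c26b8a2f",
              "new": "f98bf6f1d9e4c5e2a8b7c9d6e5f4a3b2"}}
  ],
  "deleted": [],
  "renamed": []
}
"""


def summarize(diff_json):
    """Map each status name to the list of paths reported under it."""
    diff = json.loads(diff_json)
    summary = {}
    for status, entries in diff.items():
        paths = []
        for entry in entries:
            path = entry["path"]
            # Assumption: renamed entries carry {"old": ..., "new": ...} paths.
            if isinstance(path, dict):
                path = f'{path["old"]} -> {path["new"]}'
            paths.append(path)
        summary[status] = paths
    return summary


changes = summarize(SAMPLE)
for status, paths in changes.items():
    print(f"{status}: {len(paths)}")
```

Reading entry["path"] only, and never requiring entry["hash"], keeps the helper usable whether or not hash values are present in the output.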

Compare specific files

Diff only specific tracked files:
dvc diff --targets data/train.csv data/test.csv
Modified:
    data/train.csv

files summary: 1 modified

Compare release versions

See what data changed between releases:
dvc diff v1.0 v2.0
Added:
    models/enhanced_model.pkl
    data/augmented/

Modified:
    data/train.csv

Deleted:
    models/baseline.pkl

files summary: 2 added, 1 modified, 1 deleted

Understanding the output

Status types

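| Status       | Meaning                                |
|--------------|----------------------------------------|
| Added        | File was created                       |
| Modified     | File content changed (hash changed)    |
| Deleted      | File was removed                       |
| Renamed      | File was moved to a different path     |
| Not in cache | File is tracked but missing from cache |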

Hash display

  • Single hash (e.g., d3b07384): Shows first 8 characters for added/deleted files
  • Hash range (e.g., c157a790..f98bf6f1): Shows old and new hash for modified files
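
The two display forms can be reproduced from full hash values. This is an illustrative Python sketch of the abbreviation rule described above, not DVC's own code; short_hash is a hypothetical helper, and the full hashes are the ones used in the JSON example in this document.

```python
def short_hash(value, length=8):
    """Abbreviate a hash the way the text output is described above.

    A plain string becomes its first `length` characters; an {"old", "new"}
    pair (used for modified files) becomes "old..new".
    """
    if isinstance(value, dict):
        return f'{value["old"][:length]}..{value["new"][:length]}'
    return value[:length]


single = short_hash("f98bf6f1d9e4c5e2a8b7c9d6e5f4a3b2")
pair = short_hash({
    "old": "c157a79025e60bcf87d9e4f3c26b8a2f",
    "new": "f98bf6f1d9e4c5e2a8b7c9d6e5f4a3b2",
})
print(single)  # f98bf6f1
print(pair)    # c157a790..f98bf6f1
```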

Directory notation

Directories are shown with a trailing slash:
Added:
    data/new_folder/
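
When post-processing a list of changed paths, the trailing slash is enough to tell directories from files. A small Python sketch, assuming paths are written with forward slashes as in the examples in this document (split_dirs_and_files is a hypothetical helper):

```python
def split_dirs_and_files(paths):
    """Separate directory entries (trailing slash) from file entries."""
    dirs = [p for p in paths if p.endswith("/")]
    files = [p for p in paths if not p.endswith("/")]
    return dirs, files


# Paths taken from the "Compare release versions" example.
dirs, files = split_dirs_and_files(["models/enhanced_model.pkl", "data/augmented/"])
print("directories:", dirs)
print("files:", files)
```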

Example workflows

Workflow 1: Review experiment changes

# Compare current experiment with main branch
dvc diff main

# Get detailed hash info
dvc diff --show-hash main

# Generate report for PR
dvc diff --md main > experiment-changes.md

Workflow 2: Track dataset evolution

# Compare the workspace with the last commit made at least one week ago
dvc diff "$(git rev-list -1 --before='1 week ago' HEAD)"

# Or compare specific versions
dvc diff v1.0 v2.0 --targets data/

Workflow 3: Validate pipeline outputs

# Run pipeline
dvc repro

# Check what outputs changed
dvc diff HEAD

# If changes look good, commit
git add dvc.lock *.dvc
git commit -m "Update pipeline outputs"
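
The "check what outputs changed" step can be automated before committing. The sketch below is one possible approach, not part of DVC: paths_outside is a hypothetical helper, the allowed prefix is an example policy, and the dictionary mirrors the shape of the dvc diff --json output shown in this document (renamed entries are assumed to carry old and new paths).

```python
def paths_outside(diff, allowed_prefixes):
    """Return changed paths that do not start with any allowed prefix."""
    unexpected = []
    for entries in diff.values():
        for entry in entries:
            path = entry["path"]
            # Assumption: a renamed entry stores {"old": ..., "new": ...}.
            if isinstance(path, dict):
                path = path["new"]
            if not path.startswith(allowed_prefixes):
                unexpected.append(path)
    return sorted(unexpected)


# Data shaped like the JSON example in this document (hashes omitted).
diff = {
    "added": [{"path": "models/new_model.pkl"}],
    "modified": [{"path": "data/train.csv"}],
    "deleted": [],
    "renamed": [],
}

# Hypothetical policy: this pipeline is only expected to write under models/.
unexpected = paths_outside(diff, ("models/",))
print(unexpected)  # ['data/train.csv']
```

A non-empty result could be used to stop a script before the git add and git commit steps.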

Workflow 4: Generate changelog

# Create a data changelog in Markdown
echo "# Data Changes" > CHANGELOG.md
echo "" >> CHANGELOG.md
echo "## Version 2.0 vs 1.0" >> CHANGELOG.md
dvc diff --md v1.0 v2.0 >> CHANGELOG.md
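
The same changelog can be produced from Python, which is convenient when it is one step of a larger release script. This is a sketch under stated assumptions: dvc is on PATH, the function names are hypothetical, and the rendering is kept separate from the subprocess call so that it can be exercised without a DVC repository.

```python
import subprocess
from pathlib import Path


def diff_command(old_rev, new_rev):
    """Build the same command line used in the shell workflow above."""
    return ["dvc", "diff", "--md", old_rev, new_rev]


def render_changelog(heading, table_md):
    """Combine the fixed title, a section heading, and the Markdown table."""
    return f"# Data Changes\n\n## {heading}\n{table_md.rstrip()}\n"


def write_changelog(old_rev, new_rev, heading, out_path="CHANGELOG.md"):
    """Run dvc diff and write the result; raises if the command fails."""
    result = subprocess.run(
        diff_command(old_rev, new_rev),
        capture_output=True, text=True, check=True,
    )
    Path(out_path).write_text(render_changelog(heading, result.stdout),
                              encoding="utf-8")


# Example call (requires a DVC repository with both tags):
# write_changelog("v1.0", "v2.0", "Version 2.0 vs 1.0")
```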

Combining with Git workflow

dvc diff complements git diff: Git reports changes to code, and DVC reports changes to tracked data:
# See code changes
git diff main experiment

# See data changes
dvc diff main experiment

# See both, one after the other
git diff main experiment && dvc diff main experiment

Handling missing cache files

By default, files not in cache are shown:
Not in cache:
    models/large_model.pkl
To hide these:
dvc diff --hide-missing
Or fetch them first:
dvc fetch --all-commits
dvc diff
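
A script can also detect missing files before deciding whether to fetch. The sketch below assumes that the JSON output lists such files under a "not in cache" key, which is not shown in the JSON example in this document and should be verified against your DVC version; missing_from_cache is a hypothetical helper.

```python
def missing_from_cache(diff):
    """List paths reported as tracked but absent from the cache."""
    # Assumption: the JSON output uses a "not in cache" key for these entries.
    return sorted(entry["path"] for entry in diff.get("not in cache", []))


# Hand-written data for illustration, using the path from the example above.
diff = {
    "added": [],
    "deleted": [],
    "modified": [],
    "renamed": [],
    "not in cache": [{"path": "models/large_model.pkl"}],
}

missing = missing_from_cache(diff)
if missing:
    print("Fetch these before relying on the diff:", ", ".join(missing))
```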

Performance considerations

  • Use targets - Specify --targets to diff only specific files, which makes the operation faster for large projects.
  • Compare recent commits - Comparing distant commits may be slower, as DVC needs to reconstruct the index state.

Related commands

  • dvc status - Show current workspace status
  • dvc metrics diff - Compare metric values between commits
  • dvc params diff - Compare parameter values between commits
  • dvc plots diff - Compare and visualize plots
