Description

DVC cache commands help you configure and manage the local cache directory where DVC stores all tracked data and model files. The cache enables efficient storage and sharing of large files across your project.

Subcommands

cache dir

Configure cache directory location.
dvc cache dir

Arguments

value
string
Path to cache directory. If no path is provided, it returns the current cache directory.
  • Relative paths are resolved relative to the current directory
  • Path is saved to config relative to the config file location
  • Default location: .dvc/cache within your project
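The two path rules above can be illustrated with a small shell sketch. This is not DVC's implementation: config_relative_path is a hypothetical helper, and it assumes GNU coreutils realpath (for the -m and --relative-to options). It only shows how a path typed relative to the current directory looks once it is rewritten relative to the directory that holds the config file.

```shell
#!/bin/sh
# Hypothetical helper (not part of DVC): rewrite a path typed relative to the
# current directory so that it is relative to the config file's directory.
# Assumes GNU coreutils realpath; -m tolerates paths that do not exist yet.
config_relative_path() {
    # $1: path as typed by the user, $2: directory containing the config file
    realpath -m --relative-to="$2" "$1"
}

# Demo in a throwaway directory laid out like a project with a .dvc folder.
demo=$(mktemp -d)
mkdir -p "$demo/project/.dvc"

# From the project root, "../shared-cache" is a sibling of the project,
# so it sits two levels above .dvc/.
(cd "$demo/project" && config_relative_path ../shared-cache .dvc)
# prints: ../../shared-cache
```

Under these assumptions, a value typed as ../shared-cache from the project root would appear in .dvc/config as ../../shared-cache, because the config file sits one directory deeper.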

Options

-u, --unset
flag
Unset the custom cache directory and revert to default location (.dvc/cache).
--global
flag
Use global config (~/.config/dvc/config on Linux).
--system
flag
Use system config.
--project
flag
Use project config (.dvc/config).
--local
flag
Use local config (.dvc/config.local).

cache migrate

Migrate cached files to the DVC 3.0 cache location.
dvc cache migrate

Options

--dvc-files
flag
Migrate entries in all existing DVC files in the repository to the DVC 3.0 format.
--dry
flag
Only print actions which would be taken without actually migrating any data.

Examples

View Current Cache Location

$ dvc cache dir
/home/user/myproject/.dvc/cache

Move Cache to External Drive

# Set cache to external drive for more storage space
$ dvc cache dir /mnt/external/dvc-cache
$ dvc cache dir
/mnt/external/dvc-cache
Using an external drive or network storage for cache is useful when working with large datasets that exceed your local disk capacity.

Share Cache Across Multiple Projects

# Set a shared cache directory using global config
dvc cache dir --global /home/user/shared-dvc-cache
This allows multiple DVC projects to share the same cache, saving disk space.

Reset to Default Location

# Remove custom cache directory setting
dvc cache dir --unset

Preview Cache Migration

# See what would be migrated without making changes
dvc cache migrate --dry
Migrating cache from DVC 2.x to 3.0 format:
  - Moving: .dvc/cache/ab/cd1234... -> .dvc/cache/files/md5/ab/cd1234...
  - Moving: .dvc/cache/12/ef5678... -> .dvc/cache/files/md5/12/ef5678...
  
Total: 245 files, 15.3 GB
Use --dry to safely preview migration before committing to changes.

Migrate to DVC 3.0 Format

# Migrate cache structure
dvc cache migrate

# Migrate cache and update all .dvc files
dvc cache migrate --dvc-files
Cache migration is a one-way operation. Make sure to back up important data before migrating, especially when using --dvc-files.
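One possible way to take that backup is to archive the cache directory first. The sketch below is only an illustration: backup_cache is a hypothetical helper, the paths in the usage comment are made up, and it assumes there is enough free space for a full archive of the cache.

```shell
#!/bin/sh
# Hypothetical helper (not a DVC command): archive a cache directory before
# running a one-way migration.
# $1: cache directory, $2: archive file to create
backup_cache() {
    # -C switches to the parent directory so the archive stores a
    # relative path (for example "cache/...") instead of an absolute one.
    tar -cf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# Example (illustrative paths; adjust to your own cache location):
#   backup_cache .dvc/cache /mnt/backup/dvc-cache-before-migrate.tar
#   dvc cache migrate --dry
#   dvc cache migrate
```

Keeping the archive outside the cache directory avoids including the backup in itself.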

Cache Structure

DVC uses content-addressable storage where files are stored by their hash:
.dvc/cache/
└── files/
    └── md5/
        ├── ab/
        │   └── cd1234567890abcdef1234567890ab  # File content
        └── 12/
            └── ef5678901234abcdef5678901234ef
  • Files are stored using their MD5 hash
  • First 2 characters of hash become directory name
  • Rest of hash is the filename
  • Same file content = same cache entry (deduplication)
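The layout rules above can be sketched in a few lines of shell. This is an illustration rather than DVC's own code: cache_path_for is a hypothetical helper, it assumes md5sum from GNU coreutils, and it covers single files only.

```shell
#!/bin/sh
# Hypothetical helper (not a DVC command): print where a file with this
# content would live under the DVC 3.0 layout shown above.
# Assumes md5sum from GNU coreutils is available.
cache_path_for() {
    # $1: file to hash, $2: cache root (defaults to .dvc/cache)
    hash=$(md5sum < "$1" | cut -d' ' -f1)
    prefix=$(printf '%s' "$hash" | cut -c1-2)   # first 2 hex characters
    rest=$(printf '%s' "$hash" | cut -c3-)      # remaining 30 characters
    printf '%s/files/md5/%s/%s\n' "${2:-.dvc/cache}" "$prefix" "$rest"
}

sample=$(mktemp)
printf 'hello' > "$sample"
cache_path_for "$sample"
# prints: .dvc/cache/files/md5/5d/41402abc4b2a76b9719d911017c592
rm -f "$sample"
```

Two files with identical content produce the same path, which is what makes deduplication possible.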

Cache Configuration Options

You can configure additional cache behaviors via dvc config:
# Use hardlinks (fast, no extra disk space)
dvc config cache.type hardlink

# Use symlinks
dvc config cache.type symlink

# Use reflinks (copy-on-write, requires filesystem support; tried first by default)
dvc config cache.type reflink

# Use copies (slowest, most compatible; the default fallback when reflinks are unavailable)
dvc config cache.type copy
Copies duplicate the full file. They are the slowest option but the most compatible, so use them when links aren’t available.
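To make the difference between these link types concrete, here is a small sketch that uses plain coreutils rather than DVC. All file names are made up for the demo, and reflinks are left out because they depend on filesystem support.

```shell
#!/bin/sh
# Illustration with plain coreutils, not DVC itself: three ways a workspace
# file can be backed by a cached file. All names here are made up.
demo=$(mktemp -d)
printf 'model weights\n' > "$demo/cached"

ln    "$demo/cached" "$demo/hard"   # hardlink: second name for the same data
ln -s "$demo/cached" "$demo/sym"    # symlink: pointer to the cached path
cp    "$demo/cached" "$demo/copy"   # copy: independent duplicate on disk

# -ef is true when both names refer to the same file on disk
# (an extension supported by bash, dash, and most other shells).
[ "$demo/hard" -ef "$demo/cached" ] && echo "hard: shares data with cache"
[ -L "$demo/sym" ]                  && echo "sym: points at cache"
[ "$demo/copy" -ef "$demo/cached" ] || echo "copy: separate data"
```

Because the hardlink and the symlink both lead back to the cached data, editing either one in place would also change the cached file. The copy can be edited freely at the cost of extra disk space.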

Protected Mode

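# Prevent accidental modification of cached files
# Note: since DVC 1.0, hardlinked and symlinked files are always read-only;
# this option is obsolete and accepted only for backward compatibility
dvc config cache.protected true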

Shared Cache

# Allow cache sharing between users (sets proper permissions)
dvc config cache.shared group

Use Cases

External Storage

Move cache to external drive when local disk space is limited.

Shared Team Cache

Configure a network location for cache to enable team collaboration without redundant downloads.

CI/CD Optimization

Use persistent cache directories in CI to speed up pipeline runs.

Version Upgrade

Migrate from DVC 2.x to 3.0 cache format for improved performance.

Cache vs Remote Storage

Cache is local storage on your machine for fast access. Remote is cloud or network storage for backup and sharing. Both work together: cache for speed, remote for collaboration.

Feature       | Cache                  | Remote
Location      | Local machine          | Cloud/Network
Purpose       | Fast file access       | Backup & sharing
Required      | Yes                    | No (but recommended)
Shared        | Can be network-mounted | Yes
Configuration | dvc cache dir          | dvc remote add

Troubleshooting

Check Cache Size

du -sh .dvc/cache
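The command above assumes the default cache location. If the cache has been moved, a helper along the following lines could report on any directory. It is only a sketch: cache_summary is a hypothetical name, and the usage comment assumes dvc is on PATH and that dvc cache dir prints the current location when run without a path, as described under Arguments.

```shell
#!/bin/sh
# Hypothetical helper (not a DVC command): summarize a cache directory,
# wherever it lives.
# $1: cache directory (defaults to .dvc/cache)
cache_summary() {
    dir=${1:-.dvc/cache}
    files=$(find "$dir" -type f | wc -l | tr -d ' ')   # number of cached files
    size=$(du -sh "$dir" | cut -f1)                    # human-readable disk usage
    printf '%s files, %s\n' "$files" "$size"
}

# Example, assuming dvc is on PATH and this runs inside a DVC project:
#   cache_summary "$(dvc cache dir)"
```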

Clear Cache

This will delete all cached files. Only do this if you have a remote backup.
rm -rf .dvc/cache
Run dvc pull to restore files from remote.

Verify Cache Integrity

dvc status
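Reports tracked files that have changed, were deleted, or are not in the cache. To compare the cache against remote storage instead, run dvc status --cloud.

Related Commands
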
  • dvc config - Configure DVC settings including cache options
  • dvc gc - Garbage collect unused cache files
  • dvc pull - Download files from remote to cache
  • dvc push - Upload files from cache to remote
  • dvc status - Check status of tracked files and cache
