Synopsis
Description
The dvc fetch command downloads DVC-tracked files from remote storage to your local cache without updating your workspace. It’s one part of what dvc pull does (the other being dvc checkout).
Use dvc fetch when you want to:
- Pre-download data without immediately checking it out
- Prepare cache for multiple branch checkouts
- Download data for later use
- Populate a shared cache location
- Back up all remote data locally
Unlike dvc pull, which both downloads and updates workspace files, dvc fetch only populates the cache. To make the files available in your workspace, you need to run dvc checkout afterward.
Think of dvc fetch like git fetch: it downloads data but doesn’t change your working directory. Use dvc pull (like git pull) if you want to download and update your workspace in one step.
Options
- targets: Limit command scope to specific tracked files/directories, .dvc files, or stage names. If not specified, fetches all tracked data.
- -r, --remote: Remote storage to fetch from. If not specified, uses the default remote configured in .dvc/config.
- -j, --jobs: Number of jobs to run simultaneously. Higher values increase parallelism but use more resources.
- -a, --all-branches: Fetch cache for all Git branches. Downloads data for every branch in the repository.
- -T, --all-tags: Fetch cache for all Git tags.
- -A, --all-commits: Fetch cache for all Git commits.
- -d, --with-deps: Fetch cache for all dependencies of the specified target.
- -R, --recursive: Fetch cache for subdirectories of the specified directory.
- --run-cache: Fetch run history for all stages.
- --max-size: Fetch only files/directories that are each below the specified size in bytes. Useful for CI/CD environments with limited storage or bandwidth.
- --type: Only fetch data files/directories that are of a particular type. Can be specified multiple times. Choices: metrics, plots
Examples
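Several of the options above can be combined on one command line. The sketch below wraps three common combinations in shell functions; the function names are hypothetical and exist only to illustrate the flags.

```shell
# Hypothetical wrappers around dvc fetch; the flags are the ones listed under Options.

# Fetch data for every branch and tag, using N parallel jobs (default 4).
fetch_all_refs() {
    dvc fetch --all-branches --all-tags --jobs "${1:-4}"
}

# Fetch a target (for example a stage name) together with its dependencies.
fetch_with_deps() {
    dvc fetch --with-deps "$1"
}

# Fetch tracked data found under a directory and its subdirectories.
fetch_dir() {
    dvc fetch --recursive "$1"
}
```

For example, fetch_all_refs 8 would run the all-branches, all-tags fetch with eight jobs, and fetch_with_deps train would fetch a stage named train (a placeholder name) along with its dependencies.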
Basic fetch
Fetch all tracked data to local cache:
Fetch then checkout
The two-step equivalent of dvc pull:
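As a minimal sketch, the two steps can be chained so that checkout only runs when the fetch succeeded. The function name is hypothetical; any arguments are passed to both commands as targets.

```shell
# Hypothetical helper: download into the cache, then update the workspace.
# With no arguments it covers all tracked data.
fetch_then_checkout() {
    dvc fetch "$@" && dvc checkout "$@"
}
```

Calling fetch_then_checkout with no arguments runs dvc fetch followed by dvc checkout for the whole project; passing a path such as data/raw.dvc (a placeholder) limits both steps to that target.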
Fetch specific files
Fetch only specific targets:
Fetch from specific remote
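A sketch of fetching from a named remote instead of the default one, optionally limited to specific targets. The helper name is hypothetical, and any remote name or target path passed to it is a placeholder rather than something defined by this page.

```shell
# Hypothetical helper: fetch from a specific remote, optionally limited to targets.
# Usage: fetch_from_remote <remote-name> [targets...]
fetch_from_remote() {
    local remote="$1"
    shift
    dvc fetch --remote "$remote" "$@"
}
```

For instance, fetch_from_remote backup data/images.dvc would fetch only that target from a remote called backup, assuming such a remote is configured.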
Fetch all branches
Download data for all branches (great for shared caches):
Fetch with dependencies
Fetch a pipeline stage and all its dependencies:
Fetch small files only
Fetch only files under 50 MB:
Fetch only metrics and plots
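The size limit described just above and the type filter can be sketched together. --max-size takes a value in bytes, so the first helper converts from megabytes; treating 1 MB as 1,000,000 bytes is an assumption about the intended unit. --type is repeated once per type. Both function names are hypothetical.

```shell
# Fetch only files/directories below a size limit given in MB (decimal).
fetch_smaller_than_mb() {
    dvc fetch --max-size "$(( $1 * 1000 * 1000 ))"
}

# Fetch only metrics and plots.
fetch_metrics_and_plots() {
    dvc fetch --type metrics --type plots
}
```

Here fetch_smaller_than_mb 50 passes 50000000 to --max-size.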
Parallel fetch
Speed up with more jobs:
Example workflows
Workflow 1: Shared cache setup
Set up a shared cache for your team:
Workflow 2: Branch switching optimization
Pre-fetch data for branches you’ll be working on:
Workflow 3: CI/CD with selective fetch
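One possible shape for a CI step, assuming the job needs metrics, plots, and any small inputs, but not large datasets. The environment variable names DVC_MAX_BYTES and DVC_JOBS and the function name are invented for this sketch.

```shell
# Hypothetical CI step: keep the download small and bounded.
ci_fetch() {
    # Defaults apply unless the CI configuration overrides them.
    local max_bytes="${DVC_MAX_BYTES:-50000000}"
    local jobs="${DVC_JOBS:-4}"

    # Metrics and plots first; stop if this fetch fails.
    dvc fetch --type metrics --type plots --jobs "$jobs" || return 1

    # Then a size-limited fetch for the remaining tracked data.
    dvc fetch --max-size "$max_bytes" --jobs "$jobs"
}
```

Splitting the work into two fetches keeps the reporting artifacts available even when the size limit excludes most of the data.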
Workflow 4: Disaster recovery
Back up remote storage locally:
Workflow 5: Prepare for offline work
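A sketch of preparing for offline work: fill the cache for all branches and tags while connected, then switch branches later using only the local cache. The function names are illustrative, and the branch passed to switch_offline is whatever branch you intend to work on.

```shell
# While online: fill the local cache for every branch and tag.
prepare_offline() {
    dvc fetch --all-branches --all-tags
}

# Later, offline: switch Git branch, then restore its data from the cache.
switch_offline() {
    git checkout "$1" && dvc checkout
}
```

dvc checkout reads only from the local cache, so the second step needs no access to remote storage as long as the data was fetched beforehand.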
Understanding fetch vs pull vs checkout
| Command | Downloads from remote | Updates workspace | Use case |
|---|---|---|---|
| dvc fetch | ✓ | ✗ | Pre-download data |
| dvc checkout | ✗ | ✓ | Update workspace from cache |
| dvc pull | ✓ | ✓ | Download and update |
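Visual flow
remote storage -> (dvc fetch) -> local cache -> (dvc checkout) -> workspace
dvc pull runs both steps in sequence.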
Example comparison
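Using fetch + checkout: run dvc fetch to fill the cache, then dvc checkout to update the workspace.
Using pull: run dvc pull, which does both in one step.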
When to use fetch instead of pull
Performance tips
Error handling
Missing remote
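A sketch of guarding a fetch against a remote that is not configured. It assumes that dvc remote list prints one remote per line with the name in the first column; the helper names are hypothetical.

```shell
# Succeed only if a remote with the given name appears in `dvc remote list`.
# Assumes the remote name is the first whitespace-separated field on each line.
has_remote() {
    dvc remote list | awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Fetch from the named remote, or explain how to add it.
fetch_if_remote_exists() {
    if has_remote "$1"; then
        dvc fetch --remote "$1"
    else
        echo "remote '$1' is not configured; add it with: dvc remote add $1 <url>" >&2
        return 1
    fi
}
```

The URL in the hint is left as a placeholder because it depends on your storage provider.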
Authentication errors
Disk space issues
- Free up space
- Use --max-size to limit what’s fetched
- Use --type to fetch only specific file types
Related commands
- dvc pull - Fetch and check out in one command
- dvc push - Upload data to remote storage
- dvc checkout - Update workspace from cache
- dvc status - Check sync status with remote
- dvc cache - Manage local cache