## Synopsis
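Abridged usage, reconstructed from the options described below (run `dvc push -h` for the authoritative synopsis):

```
usage: dvc push [-h] [-q | -v] [-j <number>] [-r <name>]
                [-a] [-T] [--all-commits] [-d] [-R]
                [--run-cache] [--glob]
                [targets [targets ...]]
```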
## Description
The `dvc push` command uploads DVC-tracked files and directories from your local cache to remote storage (such as S3, GCS, Azure, or SSH storage).
This is analogous to `git push`, but for your data files. It ensures that:
- Your data is safely backed up in remote storage
- Team members can access the data with `dvc pull`
- CI/CD systems can fetch the necessary data
- Different environments can sync the same data versions
`dvc push` only uploads files that don't already exist in the remote storage, making it efficient for incremental updates.
You must configure a remote storage location before using `dvc push`. Use `dvc remote add` to set one up.

## Options
- `targets` - Limit the command scope to specific tracked files/directories, `.dvc` files, or stage names. If not specified, pushes all tracked data.
- `-r <name>`, `--remote <name>` - Remote storage to push to. If not specified, uses the default remote configured in `.dvc/config`.
- `-j <number>`, `--jobs <number>` - Number of jobs to run simultaneously. Higher values increase parallelism but use more resources.
- `-a`, `--all-branches` - Push cache for all Git branches. Useful for backing up all experiments.
- `-T`, `--all-tags` - Push cache for all Git tags.
- `--all-commits` - Push cache for all Git commits.
- `-d`, `--with-deps` - Push cache for all dependencies of the specified targets.
- `-R`, `--recursive` - Push cache for subdirectories of the specified directory.
- `--run-cache` - Push run history for all stages. This includes execution metadata and can help reproduce pipeline runs.
- `--glob` - Allows targets containing shell-style wildcards.
## Examples
### Basic push
Push all tracked data to the default remote:
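With a default remote configured, no arguments are needed:

```shell
# Upload everything in the local cache that the remote is missing
dvc push
```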
### Push specific files

Push only specific targets:
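For example (the paths here are illustrative):

```shell
# Push only the data tracked by these .dvc files
dvc push data/features.dvc model.pkl.dvc
```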
### Push to a specific remote

Push to a named remote:
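A minimal sketch, assuming a remote named `myremote` was previously added with `dvc remote add`:

```shell
# -r / --remote selects a named remote instead of the default
dvc push -r myremote
```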
### Push with higher parallelism

Speed up a push with more concurrent jobs:
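For example:

```shell
# Use 8 parallel upload jobs (the right number depends on your bandwidth)
dvc push -j 8
```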
### Push all branches

This is useful for ensuring all experimental branches are backed up before cleanup. Back up data from all branches:
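For example:

```shell
# Upload the cache referenced by every Git branch
dvc push --all-branches
```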
### Push with dependencies
Push a pipeline stage and all its dependencies. With `--glob`, targets may also contain shell-style wildcards:
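Sketches of both forms (the stage name and wildcard pattern are illustrative):

```shell
# Push the 'train' stage together with everything it depends on
dvc push --with-deps train

# Push all .dvc files matching a wildcard (quote the pattern so the
# shell passes it through to DVC)
dvc push --glob 'data/*.dvc'
```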
## Example workflows
### Workflow 1: Regular development
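A typical day-to-day loop might look like this (the dataset path is illustrative):

```shell
# Track a new or changed dataset with DVC
dvc add data/dataset.csv

# Commit the small .dvc pointer file to Git
git add data/dataset.csv.dvc .gitignore
git commit -m "Update dataset"

# Upload the data, then publish the code
dvc push
git push
```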
### Workflow 2: After running a pipeline
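A sketch of the post-pipeline routine:

```shell
# Reproduce the pipeline, producing fresh outputs in the cache
dvc repro

# Commit the updated lock file, then upload the new outputs
git add dvc.lock
git commit -m "Reproduce pipeline"
dvc push
git push
```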
### Workflow 3: Backup all experiments
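Before deleting experimental branches, everything they reference can be uploaded in one go:

```shell
# Push the cache for every branch and tag
dvc push --all-branches --all-tags
```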
### Workflow 4: Team collaboration
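One possible hand-off between teammates:

```shell
# Teammate A: publish data first, then the code that references it
dvc push
git push

# Teammate B: get the code, then fetch the matching data
git pull
dvc pull
```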
## Setting up remotes
Before using `dvc push`, configure a remote:
### S3
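For example (bucket and path are illustrative; `-d` makes this the default remote):

```shell
dvc remote add -d myremote s3://mybucket/dvcstore
```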
### Google Cloud Storage
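For example (bucket and path are illustrative):

```shell
dvc remote add -d myremote gs://mybucket/dvcstore
```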
### Azure Blob Storage
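For example (container and path are illustrative):

```shell
dvc remote add -d myremote azure://mycontainer/dvcstore
```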
### SSH/SFTP
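For example (host and path are illustrative):

```shell
dvc remote add -d myremote ssh://user@example.com/path/to/dvcstore
```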
### Local or Network Drive
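For example (the mount point is illustrative):

```shell
dvc remote add -d myremote /mnt/shared/dvcstore
```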
## Checking what needs to be pushed
Before pushing, compare the local cache against the remote with `dvc status -c` (`--cloud`).

### Understanding push output

Because `dvc push` skips files that already exist in the remote, its output reflects only the incremental transfer.
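Putting the check and the upload together (assuming a default remote is configured):

```shell
# List data that differs between the local cache and the remote
dvc status -c

# Upload anything the remote is missing
dvc push
```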
## Error handling
### No remote configured

If no remote is defined, `dvc push` fails immediately. Set one up with `dvc remote add -d <name> <url>` (see "Setting up remotes" above).
### Authentication errors

If uploads are rejected, verify the credentials for your storage provider (for example, via `dvc remote modify`, environment variables, or your provider's CLI).
### Network issues
If a push fails due to network issues, simply run `dvc push` again. DVC will resume from where it left off.
## Performance tips

- Increase `--jobs` to upload more files in parallel (see Options above)
- Push only the targets you need instead of the whole cache when just part of the data changed
## Best practices
- Always push before `git push`: Ensure data is backed up before publishing code
- Use `--all-branches` periodically: Back up experiment data before cleaning up branches
- Configure credentials securely: Use environment variables or IAM roles instead of storing credentials in config
- Monitor costs: Cloud storage and transfer costs can add up with large datasets
## Related commands

- `dvc pull` - Download data from remote storage
- `dvc fetch` - Download to cache without checking out
- `dvc status` - Check sync status with remote
- `dvc remote` - Manage remote storage locations