Synopsis

dvc push [options] [<targets>...]

Description

The dvc push command uploads DVC-tracked files and directories from your local cache to remote storage (such as S3, GCS, Azure, or SSH storage). This is analogous to git push but for your data files. It ensures that:
  • Your data is safely backed up in remote storage
  • Team members can access the data with dvc pull
  • CI/CD systems can fetch the necessary data
  • Different environments can sync the same data versions
dvc push only uploads files that don’t already exist in the remote storage, making it efficient for incremental updates.
You must configure a remote storage location before using dvc push. Use dvc remote add to set up a remote.

Options

targets
path
Limit command scope to specific tracked files/directories, .dvc files, or stage names. If not specified, pushes all tracked data.
dvc push data/train.csv models/model.pkl
-r, --remote
string
Remote storage to push to. If not specified, uses the default remote configured in .dvc/config.
dvc push --remote s3storage
-j, --jobs
integer
default:"4 * cpu_count()"
Number of jobs to run simultaneously. Higher values increase parallelism but use more resources.
dvc push --jobs 8
-a, --all-branches
boolean
default:"false"
Push cache for all Git branches. Useful for backing up all experiments.
dvc push --all-branches
This can upload a lot of data if you have many branches with different datasets.
-T, --all-tags
boolean
default:"false"
Push cache for all Git tags.
dvc push --all-tags
-A, --all-commits
boolean
default:"false"
Push cache for all Git commits.
dvc push --all-commits
This can be very slow and upload large amounts of data. Use with caution.
-d, --with-deps
boolean
default:"false"
Push cache for all dependencies of the specified target.
dvc push --with-deps train.dvc
-R, --recursive
boolean
default:"false"
Recursively push cache for DVC-tracked files found in the target directory and its subdirectories.
dvc push --recursive experiments/
--run-cache
boolean
default:"false"
Push run history for all stages. This includes execution metadata and can help reproduce pipeline runs.
dvc push --run-cache
--glob
boolean
default:"false"
Allows targets containing shell-style wildcards.
dvc push --glob "data/*.csv"

Examples

Basic push

Push all tracked data to the default remote:
dvc push
Everything is up to date.
Or if there are files to push:
2 files pushed

Push specific files

Push only specific targets:
dvc push data/train.csv.dvc
1 file pushed

Push to specific remote

Push to a named remote:
dvc push --remote backup

Push with higher parallelism

Speed up push with more concurrent jobs:
dvc push --jobs 16

Push all branches

Back up data from all branches:
dvc push --all-branches
15 files pushed
This is useful for ensuring all experimental branches are backed up before cleanup.

Push with dependencies

Push a pipeline stage and all its dependencies:
dvc push --with-deps evaluate.dvc

Push with wildcards

Push all .dvc files matching a shell-style pattern:
dvc push --glob "experiments/exp-*/*.dvc"

Example workflows

Workflow 1: Regular development

# 1. Add or modify data
dvc add data/new_dataset.csv

# 2. Commit to Git
git add data/new_dataset.csv.dvc data/.gitignore
git commit -m "Add new dataset"

# 3. Push data to remote
dvc push

# 4. Push Git commits
git push
Always dvc push before git push to ensure data is backed up before code references are published.
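This ordering can also be enforced automatically with a Git pre-push hook. A minimal sketch (note that dvc install can generate equivalent hooks for you, so writing one by hand is optional):

```shell
#!/bin/sh
# .git/hooks/pre-push -- run dvc push before Git publishes commits.
# If the data upload fails, the hook exits non-zero and Git aborts
# the push, so published .dvc files always have their data available
# in remote storage.
exec dvc push
```

Remember to make the hook executable (chmod +x .git/hooks/pre-push) for Git to run it.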

Workflow 2: After running pipeline

# Run your pipeline
dvc repro

# Check what changed
dvc status --cloud

# Push new outputs
dvc push

# Commit pipeline changes
git add dvc.lock
git commit -m "Update pipeline outputs"
git push

Workflow 3: Backup all experiments

# Back up all branch data before cleanup
dvc push --all-branches

# Now safe to delete local branches
git branch -d old-experiment

# Clean local cache
dvc gc --workspace

Workflow 4: Team collaboration

# You: Update dataset
python update_data.py
dvc commit data/dataset.csv.dvc

# Push to remote
dvc push

# Commit and push to Git
git add data/dataset.csv.dvc
git commit -m "Update dataset with new samples"
git push

# Teammate: Pull changes
git pull
dvc pull

Setting up remotes

Before using dvc push, configure a remote:

S3

dvc remote add -d myremote s3://mybucket/path

Google Cloud Storage

dvc remote add -d myremote gs://mybucket/path

Azure Blob Storage

dvc remote add -d myremote azure://mycontainer/path

SSH/SFTP

dvc remote add -d myremote ssh://user@host/path

Local or Network Drive

dvc remote add -d myremote /mnt/shared/dvc-storage
Set as default:
dvc remote default myremote

Checking what needs to be pushed

Before pushing, check status:
dvc status --cloud
new:            data/train.csv
new:            models/model.pkl
This shows files in local cache that haven’t been pushed to remote.

Understanding push output

2 files pushed
Or if everything is synced:
Everything is up to date.
With multiple branches:
main:
        2 files pushed
experiment-1:
        3 files pushed

Error handling

No remote configured

ERROR: no remote provided and no default remote set
Solution: Add a remote storage location:
dvc remote add -d myremote <url>

Authentication errors

ERROR: failed to push data to the cloud
Solution: Configure credentials for your remote storage. Example for S3 (the --local flag writes to .dvc/config.local, which is Git-ignored, so secrets stay out of version control):
dvc remote modify --local myremote access_key_id YOUR_ACCESS_KEY
dvc remote modify --local myremote secret_access_key YOUR_SECRET_KEY

Network issues

If push fails due to network issues, simply run dvc push again. Files that already reached the remote are skipped, so the transfer effectively resumes where it left off.
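Because completed uploads are never repeated, it is safe to wrap the retry in a small script. A sketch, where the attempt limit and delay are arbitrary choices:

```shell
#!/bin/sh
# Retry dvc push a few times on failure. Already-uploaded files are
# skipped by DVC, so each attempt picks up where the last one stopped.
attempts=0
until dvc push; do
    attempts=$((attempts + 1))
    if [ "$attempts" -ge 3 ]; then
        echo "dvc push failed after $attempts attempts" >&2
        exit 1
    fi
    sleep 10
done
```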

Performance tips

Increase parallelism - Use --jobs to speed up uploads, especially for many small files:
dvc push --jobs 16
Push specific targets - Instead of pushing everything, push only what changed:
dvc status --cloud  # Check what needs pushing
dvc push data/changed_file.csv.dvc
Use cloud-native storage - For best performance, use storage in the same cloud region as your compute.

Best practices

  1. Always push before git push: Ensure data is backed up before publishing code
  2. Use --all-branches periodically: Back up experiment data before cleaning up branches
  3. Configure credentials securely: Use environment variables or IAM roles instead of storing credentials in config
  4. Monitor costs: Cloud storage and transfer costs can add up with large datasets
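For practice 3, DVC's S3 remote reads the standard AWS environment variables through the AWS SDK, so credentials never have to be written to config files. A sketch with placeholder values:

```shell
# Provide credentials via standard AWS environment variables for the
# current shell session only; DVC's S3 remote picks them up through
# the AWS SDK, so no secret is written to .dvc/config.
export AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY"         # placeholder value
export AWS_SECRET_ACCESS_KEY="YOUR_SECRET_KEY"     # placeholder value
# then push as usual: dvc push --remote s3storage
```

On AWS compute, an attached IAM role removes even this step: the SDK obtains credentials automatically.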

Related commands

  • dvc pull - Download data from remote storage
  • dvc fetch - Download to cache without checking out
  • dvc status - Check sync status with remote
  • dvc remote - Manage remote storage locations
