Overview
Remote storage allows you to store your data, models, and pipeline outputs outside your Git repository. This enables team collaboration, backup, and access from different machines or environments.
DVC supports many storage types: Amazon S3, Google Cloud Storage, Azure Blob Storage, SSH, HTTP, and more.
Setting Up Remote Storage
Add a remote
Configure a remote storage location:
dvc remote add -d myremote s3://my-bucket/dvc-storage
The -d flag sets this as the default remote. You can add multiple remotes and switch between them as needed.
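For example, you might pair a default cloud remote with a secondary one (the bucket and backup path below are placeholders):

```shell
# Default remote, used by plain `dvc push` / `dvc pull`
dvc remote add -d myremote s3://my-bucket/dvc-storage

# A second, non-default remote (hypothetical backup location)
dvc remote add backup /mnt/backup/dvc-storage

# List configured remotes; the default is marked in the output
dvc remote list
```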
Configure credentials
Set up authentication for your storage:

AWS S3
# Using AWS CLI profiles
dvc remote modify myremote profile myprofile
# Or set credentials directly (not recommended)
dvc remote modify myremote access_key_id YOUR_KEY
dvc remote modify myremote secret_access_key YOUR_SECRET

Google Cloud Storage
# Using a service account
dvc remote modify myremote credentialpath ~/.config/gcloud/credentials.json
# Or using a project ID
dvc remote modify myremote projectname my-project

Azure Blob Storage
dvc remote modify myremote account_name myaccount
dvc remote modify myremote account_key YOUR_KEY

SSH
# Using an SSH key
dvc remote modify myremote keyfile ~/.ssh/id_rsa
# Or a password (less secure)
dvc remote modify myremote password YOUR_PASSWORD
Commit configuration
Save the remote configuration to Git:
git add .dvc/config
git commit -m "Configure DVC remote storage"
Never commit credentials to Git. Use environment variables or separate credential files.
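For AWS, one way to do this is with the standard AWS environment variables, which DVC's S3 backend (boto3) picks up automatically, so nothing is written to the DVC config (the values are placeholders):

```shell
# Standard AWS environment variables, read by boto3 at runtime;
# no credentials end up in .dvc/config
export AWS_ACCESS_KEY_ID=YOUR_KEY
export AWS_SECRET_ACCESS_KEY=YOUR_SECRET
export AWS_DEFAULT_REGION=us-west-2

dvc push
```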
Supported Storage Types

Amazon S3
dvc remote add -d myremote s3://bucket/path
Configuration options:
dvc remote modify myremote region us-west-2
dvc remote modify myremote profile myprofile
dvc remote modify myremote endpointurl https://s3.custom-endpoint.com

Google Cloud Storage
dvc remote add -d myremote gs://bucket/path
Configuration options:
dvc remote modify myremote projectname my-project
dvc remote modify myremote credentialpath /path/to/credentials.json

Azure Blob Storage
dvc remote add -d myremote azure://container/path
Configuration options:
dvc remote modify myremote account_name myaccount
dvc remote modify myremote account_key mykey
dvc remote modify myremote sas_token mytoken

SSH/SFTP
dvc remote add -d myremote ssh://user@example.com/path/to/dvc-storage
Configuration options:
dvc remote modify myremote port 2222
dvc remote modify myremote keyfile ~/.ssh/id_rsa
dvc remote modify myremote ask_password true

Local/Network
# Local directory
dvc remote add -d myremote /mnt/shared/dvc-storage
# Network share
dvc remote add -d myremote smb://server/share/path

HTTP/HTTPS
dvc remote add -d myremote https://example.com/dvc-storage
Configuration options:
dvc remote modify myremote auth basic
dvc remote modify myremote user myusername
dvc remote modify myremote password mypassword
Pushing and Pulling Data
Push to Remote
Upload tracked data to remote storage:
# Push all data
dvc push
# Push a specific file
dvc push data/train.csv
# Push to a specific remote
dvc push -r myremote
# Push with multiple parallel jobs
dvc push -j 8
# Push data for all Git branches
dvc push -a
Pull from Remote
Download tracked data from remote storage:
# Pull all data
dvc pull
# Pull a specific file
dvc pull data/train.csv
# Pull from a specific remote
dvc pull -r myremote
# Pull a target along with its dependencies
dvc pull --with-deps <target>
# Pull a directory recursively
dvc pull -R data/
Fetch (Download to Cache Only)
Download data to the cache without checking it out into the workspace:
dvc fetch
Then checkout when needed:
dvc checkout
Use fetch + checkout when you want to download data but not immediately use it in your workspace.
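A typical fetch-then-checkout flow might look like this (the file path is illustrative):

```shell
# Download everything referenced by the current workspace into the
# cache, without modifying any workspace files
dvc fetch

# Or fetch only a single target
dvc fetch data/train.csv

# Later, materialize the cached data into the workspace
dvc checkout data/train.csv
```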
Managing Remotes
List Remotes
dvc remote list
Example output:
myremote s3://my-bucket/dvc-storage (default)
backup /mnt/backup/dvc-storage
Set Default Remote
dvc remote default myremote
Modify Remote Settings
# Modify URL
dvc remote modify myremote url s3://new-bucket/path
# Modify options
dvc remote modify myremote region us-east-1
# Unset option
dvc remote modify myremote --unset region
Remove Remote
dvc remote remove myremote
Rename Remote
dvc remote rename myremote production
Remote Configuration Levels
DVC supports four configuration levels:
Project (default)
dvc remote add myremote s3://bucket/path
Stored in .dvc/config (committed to Git, shared with the team).

Local
dvc remote add --local myremote s3://bucket/path
Stored in .dvc/config.local (not committed, machine-specific).

Global
dvc remote add --global myremote s3://bucket/path
Stored in ~/.config/dvc/config (applies to all your projects).

System
dvc remote add --system myremote s3://bucket/path
Stored in /etc/dvc/config (system-wide configuration).
Store credentials in .dvc/config.local (local level) to avoid committing them to Git.
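For example, to keep an Azure account key out of Git (the key value is a placeholder; `account_key` matches the Azure options above):

```shell
# Store the secret at the local config level only
dvc remote modify --local myremote account_key YOUR_KEY

# DVC's own .dvc/.gitignore lists config.local, so Git should
# already ignore it; this exits 0 if the file is ignored
git check-ignore .dvc/config.local && echo "not tracked"
```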
Advanced Remote Options
Parallel Jobs
Control how many files are transferred simultaneously:
dvc remote modify myremote jobs 16
Or per-command:
dvc push -j 16
Bandwidth Limit
# Limit to 10MB/s
dvc remote modify myremote bandwidth_limit 10485760
Connection Timeout
dvc remote modify myremote timeout 3600
SSL Verification
# Disable SSL verification (not recommended for production)
dvc remote modify myremote ssl_verify false
Custom Endpoint
# For S3-compatible storage
dvc remote modify myremote endpointurl https://minio.example.com
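Putting the pieces together for an S3-compatible server such as MinIO (the bucket, endpoint, and keys are placeholders):

```shell
# S3-style remote pointed at a MinIO server
dvc remote add -d myremote s3://dvc-bucket/storage
dvc remote modify myremote endpointurl https://minio.example.com

# Keep the keys at the local (uncommitted) config level
dvc remote modify --local myremote access_key_id YOUR_KEY
dvc remote modify --local myremote secret_access_key YOUR_SECRET
```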
Server-Side Encryption
# For S3
dvc remote modify myremote sse AES256
# For S3 with KMS
dvc remote modify myremote sse aws:kms
dvc remote modify myremote sse_kms_key_id your-kms-key-id
Checking Storage Status
Compare the local cache with the remote:
dvc status -c
Example output:
Data and pipelines are up to date.
Remote storage status:
new: data/train.csv
deleted: models/old_model.pkl
Use dvc status -c to see what needs to be pushed or pulled.
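In a script, the -c output can act as a simple guard; note this sketch parses the human-readable output, which may vary between DVC versions:

```shell
# Push only if the remote is missing something
# (fragile: greps text output rather than a machine-readable format)
if dvc status -c | grep -q "new:"; then
    dvc push
else
    echo "Remote already up to date"
fi
```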
Best Practices
Separate credentials Store credentials in .dvc/config.local (not committed) or use environment variables
Use cloud IAM Prefer IAM roles and instance profiles over access keys when possible
Enable versioning Turn on bucket versioning in S3/GCS to protect against accidental deletions
Set lifecycle policies Configure cloud storage lifecycle rules to archive or delete old data
Use multiple remotes Configure backup remotes for disaster recovery
Optimize transfers Adjust -j (jobs) based on network bandwidth and file count
Complete Examples
AWS S3 Setup
Create S3 bucket
aws s3 mb s3://my-dvc-storage
Configure DVC remote
dvc remote add -d storage s3://my-dvc-storage/project-data
dvc remote modify storage region us-west-2
Set credentials (local)
dvc remote modify --local storage profile myawsprofile
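To sanity-check the setup, you can push a tracked file and list the bucket (note that recent DVC versions store cache objects under a files/md5/ prefix, so the listing shows hashes, not file names):

```shell
# Track a file and upload it to the remote
dvc add data/train.csv
dvc push

# Confirm objects landed under the configured prefix
aws s3 ls --recursive s3://my-dvc-storage/project-data/ | head
```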
Google Cloud Storage Setup
Create GCS bucket
gsutil mb gs://my-dvc-storage
Configure DVC remote
dvc remote add -d storage gs://my-dvc-storage/project-data
dvc remote modify storage projectname my-gcp-project
Set credentials (local)
dvc remote modify --local storage credentialpath ~/.config/gcloud/credentials.json
Multi-Remote Setup
Configure primary and backup remotes:
# Primary remote (cloud)
dvc remote add -d primary s3://production-bucket/dvc
dvc remote modify primary region us-east-1
# Backup remote (local NAS)
dvc remote add backup /mnt/nas/dvc-backup
# Push to both
dvc push -r primary
dvc push -r backup
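The two pushes can be wrapped in a small loop using the remote names configured above:

```shell
# Push to every configured remote in turn
for r in primary backup; do
    dvc push -r "$r"
done
```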
Troubleshooting
Check the credentials configuration:
dvc remote list
dvc config -l

Verify cloud provider credentials:
# AWS
aws s3 ls s3://your-bucket/
# GCP
gsutil ls gs://your-bucket/
Increase parallel jobs per command:
dvc push -j 16
Or configure permanently:
dvc remote modify myremote jobs 16
Increase timeout: dvc remote modify myremote timeout 7200
Check bucket permissions and IAM policies. Ensure your credentials have:
S3: s3:GetObject, s3:PutObject, s3:ListBucket
GCS: storage.objects.get, storage.objects.create, storage.objects.list
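For S3, a minimal IAM policy granting exactly those actions might look like this sketch (the bucket name is a placeholder):

```shell
# Write a minimal policy document; attach it to the DVC user or role
cat > dvc-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::my-dvc-storage/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::my-dvc-storage"
    }
  ]
}
EOF
```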
Next Steps
Collaboration Share data and pipelines with your team using remote storage
Remote Config Explore advanced remote configuration options