Synopsis
Description
The dvc add command is used to start tracking data files or directories with DVC. When you add a file or directory, DVC:
- Computes the hash of the file/directory contents
- Moves the data to the DVC cache (unless --no-commit is used)
- Creates a .dvc file that references the cached data
- Adds the original file to .gitignore (so Git doesn’t track it)
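To make these four steps concrete, here is a toy Python sketch of a content-addressed add. It is not DVC's implementation: the function names (file_md5, toy_add), the cache layout, and the simplified pointer-file contents are assumptions for illustration. MD5 is assumed as the hash, and the data is copied rather than moved to keep the sketch short.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash file contents in chunks so large files need not fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def toy_add(data_file: Path, cache_dir: Path) -> Path:
    """Toy version of the four steps; returns the path of the pointer file."""
    checksum = file_md5(data_file)  # step 1: hash the contents
    # Step 2: store the data under a name derived from its hash
    # (hypothetical layout; copied here instead of moved).
    cached = cache_dir / checksum[:2] / checksum[2:]
    cached.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(data_file, cached)
    # Step 3: write a small pointer file (simplified, not the real .dvc schema).
    pointer = data_file.with_name(data_file.name + ".dvc")
    pointer.write_text(f"md5: {checksum}\npath: {data_file.name}\n")
    # Step 4: keep the data file itself out of Git.
    with (data_file.parent / ".gitignore").open("a") as fh:
        fh.write(f"/{data_file.name}\n")
    return pointer


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        data = root / "raw.csv"
        data.write_text("id,value\n1,10\n")
        print(toy_add(data, root / "cache").read_text())
```

Because the cache entry is named after the hash, two files with identical contents would map to the same cache entry in this sketch.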
.dvc files should be committed to Git, while the actual data files remain in your workspace but are linked to the cache.
DVC uses file links (reflinks, hardlinks, or symlinks depending on your system) to avoid duplicating data between the cache and workspace.
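The idea of linking instead of duplicating can be sketched in Python as follows. This is an illustration under assumptions, not DVC's selection logic: link_from_cache is a hypothetical helper, the fallback order is chosen for the example, and reflinks are left out because the Python standard library has no portable call for them.

```python
import os
import shutil
import tempfile
from pathlib import Path


def link_from_cache(cached: Path, workspace: Path) -> str:
    """Expose a cached file in the workspace; return the method that worked."""
    if workspace.is_symlink() or workspace.exists():
        workspace.unlink()
    source = cached.resolve()
    attempts = (
        ("hardlink", os.link),    # same inode, no extra disk space
        ("symlink", os.symlink),  # small pointer to the cached path
    )
    for name, make_link in attempts:
        try:
            make_link(source, workspace)
            return name
        except (OSError, NotImplementedError):
            continue  # not supported here, try the next option
    shutil.copy2(source, workspace)  # last resort: duplicate the data
    return "copy"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        cached = root / "cache_entry"
        cached.write_text("payload")
        print(link_from_cache(cached, root / "data.txt"))
```

Which method succeeds depends on the file system and platform, which is why the helper reports the method it used.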
Options
Input files or directories to add. You can specify multiple targets separated by spaces.
Don’t put files/directories into cache. Only creates the .dvc file without moving data to the cache.
Allows targets containing shell-style wildcards (e.g., *.csv, data/**/*.txt).
Destination path to put files to. This option changes where the output file is created.
Transfer the data directly to remote storage instead of storing it in the local cache.
This is useful for handling large files that don’t fit in your local cache. The file is tracked by DVC but stored only in remote storage.
Remote storage to download to. Only used with --to-remote.
Number of jobs to run simultaneously when pushing data to remote. Only used with --to-remote.
Override local file or folder if it exists.
Don’t recreate links from cache to workspace after adding.
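Before passing a wildcard target, it can help to preview which files a pattern such as *.csv or data/**/*.txt would select. The sketch below uses Python's pathlib for that preview. preview_targets is a hypothetical helper, the file names are made up, and DVC's own matching may differ in details, so treat the result as an approximation.

```python
import tempfile
from pathlib import Path


def preview_targets(root: Path, pattern: str) -> list[str]:
    """List files under root that a wildcard pattern would select."""
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.glob(pattern)
        if path.is_file()
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical project layout used only for this preview.
        for name in ("sales.csv", "data/notes.txt", "data/2024/log.txt"):
            file = root / name
            file.parent.mkdir(parents=True, exist_ok=True)
            file.touch()
        print(preview_targets(root, "*.csv"))          # ['sales.csv']
        print(preview_targets(root, "data/**/*.txt"))  # ['data/2024/log.txt', 'data/notes.txt']
```

Note that *.csv only matches files directly under the root, while ** descends into subdirectories.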
Examples
Basic usage
Track a single data file: running dvc add data/raw.csv creates data/raw.csv.dvc and adds data/raw.csv to .gitignore.
Track a directory
Track an entire directory of data: running dvc add data/images creates data/images.dvc that tracks all files in the directory.
Track multiple files
Add multiple files at once by listing them as separate targets, separated by spaces.
Using wildcards
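Track all CSV files in a directory by passing a wildcard pattern such as *.csv as the target (this requires the wildcard option described under Options).
Add without committing to cache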
Create a .dvc file without moving data to the cache by using --no-commit. The data can be stored in the cache later with dvc commit.
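When this two-step pattern is driven from a script, the invocations might be assembled as below. This is a sketch under assumptions: deferred_add_commands is a hypothetical helper, data/large.bin is a made-up path, and the second command assumes the .dvc file sits next to the target with .dvc appended to its name.

```python
import shlex


def deferred_add_commands(target: str) -> list[list[str]]:
    """Build the two invocations: track now, store in the cache later."""
    return [
        ["dvc", "add", "--no-commit", target],  # create the .dvc file only
        ["dvc", "commit", f"{target}.dvc"],     # later: store the data in the cache
    ]


if __name__ == "__main__":
    # data/large.bin is a hypothetical path used only for illustration.
    for argv in deferred_add_commands("data/large.bin"):
        print(shlex.join(argv))
```

Each list can be handed to subprocess.run(argv, check=True) once DVC is installed and the repository is initialized.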
Add directly to remote storage
For very large files, add directly to remote storage with --to-remote.
Example workflow
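A typical workflow when adding data: track the data with dvc add, commit the resulting .dvc file to Git, and upload the data to remote storage with dvc push.
Related commands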
- dvc commit - Record changes to tracked files
- dvc push - Upload tracked files to remote storage
- dvc checkout - Check out data files from cache
- dvc remove - Stop tracking files/directories
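Tying dvc add together with Git and dvc push, an add-then-share sequence could be scripted roughly as follows. This is a hedged sketch, not an official recipe: workflow_commands and run_workflow are hypothetical helpers, the commit message is a placeholder, and the location of the .gitignore entry (next to the data) is an assumption. The runner defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess
from pathlib import PurePosixPath


def workflow_commands(target: str, message: str) -> list[list[str]]:
    """Build the command sequence for tracking and sharing one target."""
    pointer = f"{target}.dvc"
    # Assumption: the .gitignore entry is written next to the data.
    ignore = str(PurePosixPath(target).parent / ".gitignore")
    return [
        ["dvc", "add", target],            # track the data with DVC
        ["git", "add", pointer, ignore],   # version the small pointer files
        ["git", "commit", "-m", message],
        ["dvc", "push"],                   # upload the data to remote storage
    ]


def run_workflow(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print("$", shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failure


if __name__ == "__main__":
    run_workflow(workflow_commands("data/raw.csv", "Track raw data"))
```

Keeping the dry run as the default makes it easy to review the sequence before anything touches the repository or the remote.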