Description

The dvc init command initializes a DVC repository in a directory, creating the necessary DVC structure (.dvc directory and configuration files). This is typically the first command you run when starting to use DVC in a project.

Usage

dvc init [directory] [options]
dvc init

Arguments

directory
string
default:"."
Directory to initialize DVC in. Defaults to the current working directory.

Options

--no-scm
flag
Initialize DVC in a directory that is not tracked by any SCM tool (e.g. Git). By default, DVC expects to be initialized in a Git repository. Use this flag to use DVC without Git.
-f, --force
flag
Overwrite an existing .dvc/ directory. This operation removes the local cache.
Warning: --force deletes your local cache. Push important data to a remote before using this option.
--subdir
flag
Required when running this command inside a subdirectory of a parent SCM repository. Allows you to initialize a DVC project in a subdirectory of a Git repository (monorepo structure).

Examples

Initialize in Current Directory

# Navigate to your project directory
cd myproject

# Initialize Git (if not already done)
git init

# Initialize DVC
dvc init
Initializing DVC repository...

You can now commit the changes to git.

What's next?
------------
- Check out the documentation: https://dvc.org/doc
- Get help and share ideas: https://dvc.org/chat
- Star us on GitHub: https://github.com/treeverse/dvc
dvc init creates a .dvc directory with configuration files and stages them in Git, so they are ready to commit. Internal files such as the cache are kept out of Git by .dvc/.gitignore.

Initialize Without Git (Standalone)

# Use DVC without Git version control
dvc init --no-scm
This is useful when:
  • You’re using a different version control system
  • You want to use DVC for data management without version control
  • You’re in an environment where Git isn’t available

Reinitialize (Force)

# Remove existing DVC setup and reinitialize
dvc init --force
Warning: --force deletes your local cache (.dvc/cache). Run dvc push to back up data to a remote before forcing reinitialization.
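One way to make this warning hard to skip is to wrap both commands so that reinitialization only runs after a successful push. The following is a minimal sketch, not part of DVC: the function name dvc_safe_reinit is hypothetical, and it assumes a default remote is already configured so that dvc push has somewhere to upload to.

```shell
# Hypothetical helper: back up the cache, then reinitialize.
# Assumes a default DVC remote is configured for this project.
dvc_safe_reinit() {
    # Upload cached data first; stop if the push fails for any reason.
    if ! dvc push; then
        echo "dvc push failed; not reinitializing" >&2
        return 1
    fi
    # Only reached after a successful push.
    dvc init --force
}
```

Call it as dvc_safe_reinit from the project root. Note that a plain dvc push uploads only the data referenced by the current workspace; if older revisions matter, dvc push --all-commits may be the safer choice.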

Initialize in Subdirectory (Monorepo)

# In a monorepo structure
cd myrepo/ml-project
dvc init --subdir
Useful for:
  • Monorepo projects with multiple sub-projects
  • Large repositories where only part needs DVC tracking
  • Separate data management for different components
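When a monorepo contains several sub-projects, the same command can be applied to each of them in turn. The sketch below is illustrative only: dvc_init_subprojects is a hypothetical name, the directory names in the usage comment are placeholders, and it assumes it is run from the root of the Git repository.

```shell
# Hypothetical helper: initialize DVC in several sub-projects of one Git repo.
# Usage (from the Git repository root):
#   dvc_init_subprojects ml-project another-project
dvc_init_subprojects() {
    for project in "$@"; do
        # Run in a subshell so the cd does not leak into the caller's shell.
        ( cd "$project" && dvc init --subdir ) || {
            echo "failed to initialize DVC in $project" >&2
            return 1
        }
    done
}
```

Each sub-project ends up with its own .dvc/ directory, and therefore its own configuration and cache. The loop stops at the first failure so that a missing directory does not go unnoticed.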

What Gets Created

When you run dvc init, DVC creates the following structure:
.dvc/
├── .gitignore          # Prevents tracking cache and temp files
├── config              # DVC configuration
└── cache/              # Local cache for tracked files (empty initially)
DVC also writes the following ignore rules so that Git skips local-only files:
.dvc/.gitignore
/config.local
/tmp
/cache
The .dvc/ directory should be committed to Git (except for cache/ and other ignored items).
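To confirm the layout after initialization, a small filesystem check can help. This is a sketch under assumptions: dvc_layout_ok is a hypothetical name, it tests only for the two files shown in the tree above, and it assumes a Git-backed project (with --no-scm, .dvc/.gitignore may not be present). It deliberately does not require cache/ to exist.

```shell
# Hypothetical helper: report whether a directory looks DVC-initialized.
# Checks only for .dvc/config and .dvc/.gitignore.
dvc_layout_ok() (
    dir=${1:-.}
    for f in .dvc/config .dvc/.gitignore; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            exit 1   # leaves the subshell, so the function returns 1
        fi
    done
    echo "ok: $dir"
)
```

Run dvc_layout_ok with no argument to check the current directory, or pass a path to check another project.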

Post-Initialization Steps

1. Commit DVC files

git add .dvc .gitignore
git commit -m "Initialize DVC"

2. Configure a remote (optional)

dvc remote add -d storage s3://mybucket/dvc-storage
git add .dvc/config
git commit -m "Configure DVC remote"

3. Start tracking data

dvc add data/dataset.csv
git add data/dataset.csv.dvc data/.gitignore
git commit -m "Track dataset with DVC"

Initialization Checks

Before initializing, verify:
# Check if Git is initialized
git status

# Initialize Git if needed
git init
(Skip if using --no-scm)
# Verify you're in the right directory
pwd
# Check for existing DVC initialization
ls -la .dvc
Use --force if you need to reinitialize.
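These checks can be folded into one function that suggests which form of dvc init fits a directory. The sketch below is an illustration, not a DVC feature: suggest_dvc_init is a hypothetical name, and it inspects only the filesystem (the presence of .dvc and .git), so it does not need DVC or Git to be installed.

```shell
# Hypothetical helper: print the dvc init invocation that fits a directory.
suggest_dvc_init() (
    cd "${1:-.}" || exit 1
    if [ -e .dvc ]; then
        # Reinitializing would need --force, which removes the local cache.
        echo "already initialized" >&2
        exit 1
    elif [ -e .git ]; then
        echo "dvc init"
    else
        # Walk up towards / looking for a parent Git repository.
        dir=$(pwd -P)
        while [ "$dir" != "/" ]; do
            dir=$(dirname "$dir")
            if [ -e "$dir/.git" ]; then
                echo "dvc init --subdir"
                exit 0
            fi
        done
        # No Git repository found here or above.
        echo "dvc init --no-scm"
    fi
)
```

For example, suggest_dvc_init myrepo/ml-project prints dvc init --subdir when only myrepo contains a .git entry. When no repository is found, running git init first remains an alternative to --no-scm.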

Use Cases

New ML Project

Start tracking datasets, models, and experiments from day one.

Existing Project

Add data version control to an existing codebase.

Monorepo Setup

Use --subdir to enable DVC in part of a larger repository.

No-Git Workflow

Use --no-scm for DVC-only data management without version control.

Configuration After Init

After initialization, you may want to configure:

Cache Location

# Use external drive for cache
dvc cache dir /mnt/external/dvc-cache

Analytics (Opt-out)

# Disable anonymous usage analytics
dvc config core.analytics false

Autostage

# Automatically stage (git add) files that DVC creates or updates, such as dvc.lock and .dvc files
dvc config core.autostage true

Troubleshooting

Already Initialized Error

ERROR: '.dvc' directory already exists.
Solution: Use dvc init --force to reinitialize (note: this removes local cache).

Not a Git Repository Error

ERROR: Not inside a Git repository.
Solution: Either run git init first, or use dvc init --no-scm.

Permission Denied

ERROR: Permission denied: '.dvc'
Solution: Check directory permissions or run with appropriate privileges.

Related Commands

  • dvc add - Start tracking data files with DVC
  • dvc remote add - Configure remote storage
  • dvc config - Modify DVC configuration
  • dvc destroy - Remove DVC from project (opposite of init)
