The DVC Python API allows you to access and interact with DVC-tracked data, parameters, metrics, and experiments programmatically from your Python code.

Installation

The DVC Python API is included with the main DVC package:
pip install dvc

Core Concepts

The DVC API provides several categories of functions:

Data Access

Access DVC-tracked files and their contents from any repository:
  • dvc.api.open() - Stream file contents with a context manager
  • dvc.api.read() - Read the complete contents of a file
  • dvc.api.get_url() - Get the remote storage URL for a file

Parameters & Metrics

Retrieve parameters and metrics from your experiments:
  • dvc.api.params_show() - Get parameters from parameter files such as params.yaml
  • dvc.api.metrics_show() - Get metrics from the project's metrics files

Experiments

Access and manage DVC experiments:
  • dvc.api.exp_show() - List and compare experiments
  • dvc.api.exp_save() - Save the current workspace as a new experiment

Artifacts

Work with model registry artifacts:
  • dvc.api.artifacts_show() - Get artifact path and revision

File System

Direct file system access to DVC and Git repositories:
  • dvc.api.DVCFileSystem - Unified file system interface

Quick Start

import dvc.api

# Read the entire file
data = dvc.api.read(
    'data/data.xml',
    repo='https://github.com/iterative/example-get-started'
)

Common Use Cases

Load Training Data

Stream or read DVC-tracked datasets in your training scripts
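A minimal sketch of streaming a DVC-tracked CSV row by row. The helper names iter_rows and stream_training_rows are hypothetical, and the path you pass in is assumed to point at a CSV file tracked in your project. dvc.api is imported lazily so the parsing helper also works where DVC is not installed.

```python
import csv
import io
from typing import Dict, Iterator, Optional, TextIO


def iter_rows(handle: TextIO) -> Iterator[Dict[str, str]]:
    """Yield one dict per CSV row from any text file-like object."""
    yield from csv.DictReader(handle)


def stream_training_rows(
    path: str, repo: Optional[str] = None, rev: Optional[str] = None
) -> Iterator[Dict[str, str]]:
    """Stream rows of a DVC-tracked CSV without loading the whole file."""
    import dvc.api  # lazy import: only needed when actually reading from DVC

    with dvc.api.open(path, repo=repo, rev=rev, mode="r") as f:
        yield from iter_rows(f)


# Offline demonstration of the parsing helper with an in-memory file
sample = io.StringIO("feature,label\n0.5,1\n0.1,0\n")
rows = list(iter_rows(sample))
```

In a training script you would iterate over stream_training_rows(...) with the path of your own tracked dataset instead of the in-memory sample.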

Access Parameters

Retrieve hyperparameters from any experiment or branch
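As a rough illustration, dvc.api.params_show() returns a dictionary that may be nested, and its rev argument selects a branch, tag, commit, or experiment. The sketch below flattens that dictionary into dotted keys for easier lookup; flatten and load_params are hypothetical helper names, and the sample parameters are made up.

```python
from typing import Any, Dict, Optional


def flatten(params: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested params into dotted keys, e.g. train.lr."""
    flat: Dict[str, Any] = {}
    for key, value in params.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def load_params(rev: Optional[str] = None, repo: Optional[str] = None) -> Dict[str, Any]:
    """Fetch params for a revision and return them with dotted keys."""
    import dvc.api  # lazy import: only needed when talking to a DVC repo

    return flatten(dvc.api.params_show(rev=rev, repo=repo))


# Offline demonstration with made-up parameters
flat = flatten({"train": {"lr": 0.1, "epochs": 5}, "seed": 42})
```

With a real project, load_params(rev='main') would return the same shape for whatever parameters that revision tracks.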

Fetch Metrics

Get model performance metrics programmatically
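One way to use dvc.api.metrics_show() is to compare two revisions. The sketch below assumes the metrics come back as a flat dictionary of numeric values (as with a single metrics file); metric_deltas and compare_revisions are hypothetical helpers, and the sample numbers are invented.

```python
from typing import Any, Dict, Optional


def metric_deltas(base: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, float]:
    """Return new - base for every numeric metric present in both dicts."""
    return {
        key: new[key] - base[key]
        for key in base.keys() & new.keys()
        if isinstance(base[key], (int, float)) and isinstance(new[key], (int, float))
    }


def compare_revisions(
    rev_a: str, rev_b: str, repo: Optional[str] = None
) -> Dict[str, float]:
    """Fetch metrics at two revisions and return their differences."""
    import dvc.api  # lazy import: only needed when talking to a DVC repo

    return metric_deltas(
        dvc.api.metrics_show(rev=rev_a, repo=repo),
        dvc.api.metrics_show(rev=rev_b, repo=repo),
    )


# Offline demonstration with invented metrics
deltas = metric_deltas({"acc": 0.5, "loss": 1.0}, {"acc": 0.75, "loss": 0.5, "f1": 0.6})
```

If your project has several metrics files, the returned dictionary may be nested and would need flattening before this comparison.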

Compare Experiments

Analyze and compare experiment results
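dvc.api.exp_show() returns a list of dictionaries, one per experiment row. The sketch below picks the row with the best value for one metric. The helper names, the 'accuracy' key, and the sample rows are illustrative assumptions; the actual column names depend on your own metrics and parameters.

```python
from typing import Any, Dict, List, Optional


def best_experiment(
    rows: List[Dict[str, Any]], metric: str, higher_is_better: bool = True
) -> Optional[Dict[str, Any]]:
    """Return the row with the best numeric value for metric, or None."""
    scored = [
        row
        for row in rows
        if isinstance(row.get(metric), (int, float))
        and not isinstance(row.get(metric), bool)
    ]
    if not scored:
        return None
    pick = max if higher_is_better else min
    return pick(scored, key=lambda row: row[metric])


def best_from_repo(metric: str, repo: Optional[str] = None) -> Optional[Dict[str, Any]]:
    """Query experiments from a DVC repo and return the best row."""
    import dvc.api  # lazy import: only needed when talking to a DVC repo

    return best_experiment(dvc.api.exp_show(repo=repo), metric)


# Offline demonstration with made-up rows
sample_rows = [
    {"Experiment": None, "rev": "workspace", "accuracy": 0.81},
    {"Experiment": "exp-a", "rev": "1a2b3c4", "accuracy": 0.86},
    {"Experiment": "exp-b", "rev": "5d6e7f8", "accuracy": None},
]
best = best_experiment(sample_rows, "accuracy")
```

Rows with a missing or non-numeric metric are skipped rather than treated as zero, so unfinished experiments do not win by accident.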

Working with Repositories

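Most API functions take a repo argument, which can be a local path or a Git URL:
import dvc.api

# Local: omit repo to use the current DVC project (binary mode for a pickle)
model_bytes = dvc.api.read('data/model.pkl', mode='rb')

# Remote: pass the Git URL of the repository
data = dvc.api.read('data/file.csv', repo='https://github.com/user/repo')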

Version Control

Access any Git revision (branch, tag, commit) using the rev parameter:
import dvc.api

# Get data from a specific branch
data_main = dvc.api.read('data.csv', rev='main')

# Get data from a tagged release
data_v1 = dvc.api.read('data.csv', rev='v1.0.0')

# Get data from a specific commit
data_commit = dvc.api.read('data.csv', rev='abc123def')

# Get data from an experiment
data_exp = dvc.api.read('data.csv', rev='exp-random-forest')
For local repositories, omitting rev will read from the working directory. For remote repositories, it defaults to the default branch.
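Because rev accepts any Git revision, one practical use is checking whether a file changed between two revisions. The following is a sketch only: changed_between and its reader argument are hypothetical, with the default reader delegating to dvc.api.read() in binary mode, and the demonstration uses a fake reader instead of a real repository.

```python
import hashlib
from typing import Callable


def _dvc_read(path: str, rev: str) -> bytes:
    """Default reader: fetch the file's bytes at a revision through DVC."""
    import dvc.api  # lazy import: only needed when talking to a DVC repo

    return dvc.api.read(path, rev=rev, mode="rb")


def changed_between(
    path: str,
    rev_a: str,
    rev_b: str,
    reader: Callable[[str, str], bytes] = _dvc_read,
) -> bool:
    """Return True if the file's content differs between two revisions."""
    digest_a = hashlib.sha256(reader(path, rev_a)).hexdigest()
    digest_b = hashlib.sha256(reader(path, rev_b)).hexdigest()
    return digest_a != digest_b


# Offline demonstration with a fake reader keyed by revision name
fake_contents = {"main": b"a,b\n1,2\n", "v1.0.0": b"a,b\n1,2\n", "dev": b"a,b\n9,9\n"}
fake_reader = lambda path, rev: fake_contents[rev]
same = changed_between("data.csv", "main", "v1.0.0", reader=fake_reader)
different = changed_between("data.csv", "main", "dev", reader=fake_reader)
```

This reads both versions fully into memory, so it suits small and medium files; for large files a streaming hash over dvc.api.open() would be a better fit.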

API Reference

Explore the detailed API documentation:

  • open() - Stream file contents
  • read() - Read complete file
  • get_url() - Get storage URL
  • params_show() - Show parameters
  • metrics_show() - Show metrics
  • exp_show() - Show experiments
  • artifacts_show() - Show artifacts
  • all_branches() - List Git branches
  • all_commits() - List Git commits
  • all_tags() - List Git tags
  • DVCFileSystem - File system API

Error Handling

The API raises specific exceptions that you should handle:
import dvc.api
from dvc.exceptions import (
    OutputNotFoundError,
    FileMissingError,
    PathMissingError
)

try:
    data = dvc.api.read('data/file.csv', repo='https://github.com/user/repo')
except OutputNotFoundError:
    print("File is not tracked by DVC")
except FileMissingError:
    print("File not found in repository")
except PathMissingError as e:
    print(f"Path missing: {e}")

Best Practices

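When working with large files, use dvc.api.open() instead of dvc.api.read() to stream data and keep memory usage low:
with dvc.api.open('large_file.bin', mode='rb') as f:
    # Read fixed-size chunks; iterating a binary file would split on newlines instead
    while chunk := f.read(1024 * 1024):
        process(chunk)  # process() stands in for your own handler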
If you know which remote contains your data, specify it to avoid trying the default remote:
data = dvc.api.read('data.csv', remote='myremote')
When making multiple API calls to the same repository, consider using DVCFileSystem for better performance:
from dvc.api import DVCFileSystem

fs = DVCFileSystem('https://github.com/user/repo', rev='main')
with fs.open('file1.csv') as f1:
    data1 = f1.read()
with fs.open('file2.csv') as f2:
    data2 = f2.read()
For private repositories, ensure your Git credentials are configured:
# SSH key setup
ssh-add ~/.ssh/id_rsa

# Or use a credential helper for HTTPS (note: "store" saves credentials unencrypted on disk)
git config --global credential.helper store

Next Steps

Data Access Guide

Learn about streaming and reading files

Experiments Guide

Work with experiments programmatically

CLI Reference

Explore the command-line interface

Examples

See real-world examples
