Installation
The DVC Python API is included with the main DVC package:
Core Concepts
The DVC API provides several categories of functions:
Data Access
Access DVC-tracked files and their contents from any repository:
- dvc.api.open(): Stream file contents with a context manager
- dvc.api.read(): Read complete file contents
- dvc.api.get_url(): Get the remote storage URL for a file
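As a rough illustration, the three data access functions can be combined as in the sketch below. It assumes DVC is installed. The helper name describe_tracked_file and its api parameter are hypothetical and not part of DVC; the api parameter exists only so a stand-in object can replace the dvc.api module in tests.

```python
def describe_tracked_file(path, repo=None, rev=None, api=None):
    """Collect a few facts about one DVC-tracked text file (illustrative helper)."""
    if api is None:
        import dvc.api as api  # imported lazily; requires DVC to be installed

    # get_url() returns the storage location of the tracked file.
    url = api.get_url(path, repo=repo, rev=rev)

    # read() loads the complete file contents into memory.
    text = api.read(path, repo=repo, rev=rev)

    # open() is a context manager that yields a file-like object.
    with api.open(path, repo=repo, rev=rev) as f:
        first_line = f.readline().rstrip("\n")

    return {"url": url, "chars": len(text), "first_line": first_line}
```

Calling describe_tracked_file("data/data.csv") from inside a DVC project would use the current repository; the path here is a placeholder.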
Parameters & Metrics
Retrieve parameters and metrics from your experiments:
- dvc.api.params_show(): Get parameters from tracking files
- dvc.api.metrics_show(): Get metrics from tracking files
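Both functions return dictionaries, which may be nested (for example, one level per stage or section of a parameters file). The sketch below flattens them into dotted keys for easier logging or comparison. The helper names flatten and load_params_and_metrics are hypothetical, as is the api parameter used to substitute a stand-in for dvc.api.

```python
def flatten(nested, prefix=""):
    """Turn {"train": {"lr": 0.1}} into {"train.lr": 0.1}."""
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, name))  # recurse into sub-sections
        else:
            flat[name] = value
    return flat


def load_params_and_metrics(repo=None, rev=None, api=None):
    """Return (params, metrics) as flat dictionaries (illustrative helper)."""
    if api is None:
        import dvc.api as api  # imported lazily; requires DVC to be installed
    params = api.params_show(repo=repo, rev=rev)
    metrics = api.metrics_show(repo=repo, rev=rev)
    return flatten(params), flatten(metrics)
```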
Experiments
Access and manage DVC experiments:
- dvc.api.exp_show(): List and compare experiments
- dvc.api.exp_save(): Create new experiments
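dvc.api.exp_show() returns a list of dictionaries, one per experiment. A minimal sketch of picking the best one by a metric follows. The helper names and the metric column name are assumptions for illustration; the actual column names depend on the metrics and parameters tracked in your project.

```python
def best_experiment(rows, metric, higher_is_better=True):
    """Pick the row with the best numeric value for `metric`, or None."""
    # Skip rows where the metric is missing or not numeric.
    scored = [r for r in rows if isinstance(r.get(metric), (int, float))]
    if not scored:
        return None
    pick = max if higher_is_better else min
    return pick(scored, key=lambda r: r[metric])


def load_experiments(api=None):
    """Fetch experiment rows from the current repository (illustrative helper)."""
    if api is None:
        import dvc.api as api  # imported lazily; requires DVC to be installed
    return api.exp_show()
```

A typical call would be best_experiment(load_experiments(), "accuracy"), where "accuracy" stands in for whatever metric your project records.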
Artifacts
Work with model registry artifacts:
- dvc.api.artifacts_show(): Get artifact path and revision
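The path and revision returned by dvc.api.artifacts_show() can be passed on to dvc.api.read() to load the artifact itself. The sketch below assumes the result is a dictionary with "path" and "rev" keys; the helper name load_artifact_bytes and the api parameter are hypothetical.

```python
def load_artifact_bytes(name, version=None, stage=None, repo=None, api=None):
    """Resolve a registry artifact and return its raw bytes (illustrative helper)."""
    if api is None:
        import dvc.api as api  # imported lazily; requires DVC to be installed

    # Look up where the artifact lives and at which Git revision.
    info = api.artifacts_show(name, version=version, stage=stage, repo=repo)

    # Read the file at exactly that revision, in binary mode.
    return api.read(info["path"], repo=repo, rev=info["rev"], mode="rb")
```

Pass either version or stage, not both, to select a specific registered artifact.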
File System
Direct file system access to DVC and Git repositories:
- dvc.api.DVCFileSystem: Unified file system interface
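DVCFileSystem follows the fsspec file system interface, so listing and filtering files looks much like it does for any other fsspec backend. A minimal sketch, with hypothetical helper names open_repo_fs and files_with_suffix:

```python
def open_repo_fs(repo=None, rev=None):
    """Create a file system view of a repository (illustrative helper)."""
    from dvc.api import DVCFileSystem  # requires DVC to be installed
    return DVCFileSystem(repo, rev=rev)


def files_with_suffix(fs, root, suffix):
    """List all file paths under `root` that end with `suffix`, sorted."""
    # find() walks the tree recursively; detail=False returns plain paths.
    return sorted(p for p in fs.find(root, detail=False) if p.endswith(suffix))
```

For example, files_with_suffix(open_repo_fs(), "data", ".csv") would list CSV files under a data directory in the current repository; the directory name is a placeholder.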
Quick Start
Common Use Cases
Load Training Data
Stream or read DVC-tracked datasets in your training scripts
Access Parameters
Retrieve hyperparameters from any experiment or branch
Fetch Metrics
Get model performance metrics programmatically
Compare Experiments
Analyze and compare experiment results
Working with Repositories
API functions that accept a repo argument can access both local and remote repositories:
- Current Repository
- Remote Repository
- Local Path
- SSH Repository
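The four cases above differ only in the value passed as repo. The sketch below shows one possible value for each; the URLs and the local path are placeholders chosen for illustration, and the names REPO_TARGETS and read_from are hypothetical.

```python
# Example repo values; replace the placeholders with your own locations.
REPO_TARGETS = {
    "current": None,  # the DVC project containing the working directory
    "remote": "https://github.com/iterative/example-get-started",
    "local": "/path/to/local/repo",
    "ssh": "git@github.com:iterative/example-get-started.git",
}


def read_from(target, path, reader=None):
    """Read `path` from one of the repositories named in REPO_TARGETS."""
    if reader is None:
        import dvc.api  # imported lazily; requires DVC to be installed
        reader = dvc.api.read
    return reader(path, repo=REPO_TARGETS[target])
```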
Version Control
Access any Git revision (branch, tag, commit) using the rev parameter:
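A minimal sketch of reading the same file at several revisions follows. The helper name read_at_revisions and its reader parameter are hypothetical; the revision names in the usage note are placeholders.

```python
def read_at_revisions(path, revs, repo=None, reader=None):
    """Return {rev: file contents} for each revision in `revs`."""
    if reader is None:
        import dvc.api  # imported lazily; requires DVC to be installed
        reader = dvc.api.read
    # rev can be a branch name, a tag, or a commit hash.
    return {rev: reader(path, repo=repo, rev=rev) for rev in revs}
```

For instance, read_at_revisions("params.yaml", ["main", "v1.0"]) would return the file contents on a branch named main and at a tag named v1.0, if both exist in the repository.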
For local repositories, omitting rev reads from the working directory. For remote repositories, rev defaults to the default branch.
API Reference
Explore the detailed API documentation:
open()
Stream file contents
read()
Read complete file
get_url()
Get storage URL
params_show()
Show parameters
metrics_show()
Show metrics
exp_show()
Show experiments
artifacts_show()
Show artifacts
all_branches()
List Git branches
all_commits()
List Git commits
all_tags()
List Git tags
DVCFileSystem
File system API
Error Handling
The API raises specific exceptions that you should handle:
Best Practices
Use context managers for large files
When working with large files, use dvc.api.open() instead of dvc.api.read() to stream data and keep memory usage low:
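A minimal sketch of the streaming pattern: the file is processed line by line inside the context manager, so only one line is held in memory at a time. The helper name count_lines and the opener parameter are hypothetical.

```python
def count_lines(path, repo=None, rev=None, opener=None):
    """Count the lines of a tracked text file without loading it whole."""
    if opener is None:
        import dvc.api  # imported lazily; requires DVC to be installed
        opener = dvc.api.open
    with opener(path, repo=repo, rev=rev) as f:
        # Iterating over the file object reads one line at a time.
        return sum(1 for _ in f)
```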
Specify remote for faster access
If you know which remote contains your data, specify it to avoid trying the default remote:
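The remote is selected with the remote argument, which takes the name of a remote defined in the project's DVC configuration. A minimal sketch, where the helper name head_bytes, the opener parameter, and the remote name in the usage note are all assumptions for illustration:

```python
def head_bytes(path, remote, n=1024, repo=None, opener=None):
    """Read the first `n` bytes of a tracked file from a named remote."""
    if opener is None:
        import dvc.api  # imported lazily; requires DVC to be installed
        opener = dvc.api.open
    # mode="rb" yields bytes; remote names the storage to fetch from.
    with opener(path, repo=repo, remote=remote, mode="rb") as f:
        return f.read(n)
```

A call such as head_bytes("data/images.tar", remote="my-s3-remote") would read from the remote named my-s3-remote, assuming such a remote is configured.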
Cache repository instances
When making multiple API calls to the same repository, consider using DVCFileSystem for better performance:
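The idea is to create one DVCFileSystem instance and reuse it for every read, rather than letting each top-level API call set up the repository again. A minimal sketch, with a hypothetical helper name read_many:

```python
def read_many(fs, paths, encoding="utf-8"):
    """Read several text files through one shared file system instance."""
    # read_text() comes from the fsspec interface that DVCFileSystem follows.
    return {p: fs.read_text(p, encoding=encoding) for p in paths}


# Typical use (requires DVC to be installed):
#   from dvc.api import DVCFileSystem
#   fs = DVCFileSystem()  # create once for the current repository
#   contents = read_many(fs, ["params.yaml", "dvc.yaml"])
```

The file names in the usage comment are placeholders.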
Handle authentication for private repos
For private repositories, ensure your Git credentials are configured:
Next Steps
Data Access Guide
Learn about streaming and reading files
Experiments Guide
Work with experiments programmatically
CLI Reference
Explore the command-line interface
Examples
See real-world examples