Source: dvc/api/data.py:305-330
Description
Returns the complete contents of a file tracked by DVC or Git. This is a convenience function that reads the entire file at once without requiring a context manager. For Git repositories, HEAD is used unless a rev argument is supplied. The default remote is tried unless a remote argument is supplied.
Signature
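A sketch of the call signature implied by the parameters documented below. The parameter order and the None defaults for everything other than mode are assumptions; dvc/api/data.py remains the authoritative source.

```python
def read(
    path,
    repo=None,
    rev=None,
    remote=None,
    mode="r",
    encoding=None,
    config=None,
    remote_config=None,
):
    """Illustrative stub that mirrors the documented parameters of dvc.api.read().

    This is not the real implementation; it only shows the assumed call shape.
    """
    raise NotImplementedError("illustrative stub; call dvc.api.read instead")
```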
Parameters
path: Location and filename of the target file, relative to the root of the repository.
repo: Location of the DVC or Git repository. Defaults to the current project (found by walking up from the current working directory). Can be:
- A URL to a Git repository (HTTP or SSH)
- A local file system path
- None to use the current repository
rev: Any Git revision such as a branch name, tag name, commit hash, or DVC experiment name.
- Defaults to HEAD for Git repositories
- For local repositories, uses the working directory if not specified
- Ignored if repo is not a Git repository
remote: Name of the DVC remote to use for fetching data. Defaults to the repository’s default remote. For local projects, the cache is checked before the default remote.
mode: Mode in which to open the file. Defaults to "r" (read text mode). Only reading modes are supported:
- "r": Read text mode (returns str)
- "rb": Read binary mode (returns bytes)
encoding: Text encoding to use (e.g., "utf-8", "latin-1"). Only applicable in text mode (mode="r"). Mirrors the encoding parameter in Python’s built-in open().
config: DVC config dictionary to pass to the repository.
remote_config: Remote configuration dictionary to pass to the repository.
Returns
The complete contents of the file:
- Returns str when mode="r" (text mode)
- Returns bytes when mode="rb" (binary mode)
Raises
Raised when the specified file does not exist in the repository.
Raised when the file is not tracked by DVC.
Raised when a non-read mode is specified.
Examples
Basic Text File Reading
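A minimal sketch of reading a text file in the default mode. The repository URL and file path in the comment are hypothetical, and the `read` parameter is an illustrative hook (not part of DVC) that lets the helper be tried without a repository.

```python
def _dvc_read(*args, **kwargs):
    # Imported lazily so this snippet loads even where DVC is not installed.
    import dvc.api
    return dvc.api.read(*args, **kwargs)


def read_lines(path, repo=None, read=_dvc_read):
    """Return the file's contents as a list of lines (text mode, mode="r")."""
    text = read(path, repo=repo)  # str, because the default mode is "r"
    return text.splitlines()


# Example call (hypothetical repository and path):
# lines = read_lines("data/notes.txt", repo="https://github.com/example/project")
```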
Read Configuration File
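One way to load a configuration file, sketched here for an INI file parsed with the standard library's configparser. The file name, section, and key are hypothetical; a YAML or TOML file would need a different parser.

```python
import configparser


def _dvc_read(*args, **kwargs):
    # Imported lazily so this snippet loads even where DVC is not installed.
    import dvc.api
    return dvc.api.read(*args, **kwargs)


def load_ini(path, repo=None, read=_dvc_read):
    """Read an INI file through dvc.api.read() and return a ConfigParser."""
    parser = configparser.ConfigParser()
    parser.read_string(read(path, repo=repo))
    return parser


# Example call (hypothetical path, section, and key):
# cfg = load_ini("config/settings.ini")
# epochs = cfg.getint("train", "epochs")
```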
Read JSON Metrics
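A sketch of fetching a JSON metrics file and decoding it. The path and revision in the comment are hypothetical; since read() returns a str in text mode, json.loads() can be applied to the result directly.

```python
import json


def _dvc_read(*args, **kwargs):
    # Imported lazily so this snippet loads even where DVC is not installed.
    import dvc.api
    return dvc.api.read(*args, **kwargs)


def load_metrics(path, repo=None, rev=None, read=_dvc_read):
    """Return the parsed contents of a JSON metrics file."""
    return json.loads(read(path, repo=repo, rev=rev))


# Example call (hypothetical path and branch name):
# metrics = load_metrics("metrics.json", rev="main")
```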
Binary File Reading
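A sketch of binary mode: with mode="rb" the call returns bytes, which can be passed to anything that expects raw data. Here the bytes are hashed with SHA-256 as a simple integrity check; the file path in the comment is hypothetical.

```python
import hashlib


def _dvc_read(*args, **kwargs):
    # Imported lazily so this snippet loads even where DVC is not installed.
    import dvc.api
    return dvc.api.read(*args, **kwargs)


def sha256_of(path, repo=None, read=_dvc_read):
    """Read a file as bytes and return its SHA-256 hex digest."""
    data = read(path, repo=repo, mode="rb")  # bytes, because mode is "rb"
    return hashlib.sha256(data).hexdigest()


# Example call (hypothetical path):
# digest = sha256_of("models/model.pkl")
```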
Read from Specific Tag
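A sketch of pinning a read to a Git tag through the rev argument, plus a small helper that compares the same file at two tags. The tag names and path in the comments are hypothetical.

```python
def _dvc_read(*args, **kwargs):
    # Imported lazily so this snippet loads even where DVC is not installed.
    import dvc.api
    return dvc.api.read(*args, **kwargs)


def read_at_tag(path, tag, repo=None, read=_dvc_read):
    """Return the file's contents as they were at the given tag."""
    return read(path, repo=repo, rev=tag)


def changed_between(path, tag_a, tag_b, repo=None, read=_dvc_read):
    """Return True if the file differs between the two tags."""
    return read_at_tag(path, tag_a, repo, read) != read_at_tag(path, tag_b, repo, read)


# Example calls (hypothetical tags and path):
# params_v1 = read_at_tag("params.yaml", "v1.0.0")
# changed = changed_between("params.yaml", "v1.0.0", "v2.0.0")
```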
Private Repository with SSH
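A sketch of reading from a private repository addressed by an SSH URL. The repository URL is hypothetical, and the example assumes that Git can already authenticate over SSH (for example through an SSH agent); credentials for the DVC remote itself are a separate matter.

```python
# Hypothetical SSH URL of a private Git repository.
PRIVATE_REPO = "git@github.com:example-org/private-project.git"


def _dvc_read(*args, **kwargs):
    # Imported lazily so this snippet loads even where DVC is not installed.
    import dvc.api
    return dvc.api.read(*args, **kwargs)


def read_private(path, repo=PRIVATE_REPO, read=_dvc_read):
    """Read a file from a repository reached over SSH."""
    return read(path, repo=repo)


# Example call (hypothetical path):
# labels = read_private("data/labels.txt")
```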
Read with Custom Encoding
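A sketch of passing an explicit encoding for a text file that is not UTF-8. The encoding argument applies only in text mode; the file path in the comment is hypothetical.

```python
def _dvc_read(*args, **kwargs):
    # Imported lazily so this snippet loads even where DVC is not installed.
    import dvc.api
    return dvc.api.read(*args, **kwargs)


def read_latin1(path, repo=None, read=_dvc_read):
    """Read a Latin-1 encoded text file and return it as a str."""
    return read(path, repo=repo, encoding="latin-1")


# Example call (hypothetical path):
# text = read_latin1("data/legacy_export.txt")
```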
Read NumPy Array
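A sketch of loading a .npy file: the bytes returned in binary mode are wrapped in io.BytesIO so that numpy.load() can treat them as a file. NumPy is a third-party dependency assumed to be installed, and the file path in the comment is hypothetical.

```python
import io


def _dvc_read(*args, **kwargs):
    # Imported lazily so this snippet loads even where DVC is not installed.
    import dvc.api
    return dvc.api.read(*args, **kwargs)


def load_array(path, repo=None, read=_dvc_read):
    """Read a .npy file as bytes and decode it into a NumPy array."""
    import numpy as np  # third-party; imported lazily

    raw = read(path, repo=repo, mode="rb")
    # allow_pickle=False avoids unpickling arbitrary objects from the file.
    return np.load(io.BytesIO(raw), allow_pickle=False)


# Example call (hypothetical path):
# features = load_array("data/features.npy")
```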
Read from Local Repository
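A sketch of reading from a repository on the local file system. Passing repo=None relies on the documented default of using the current project; the directory and file path in the comments are hypothetical.

```python
import os


def _dvc_read(*args, **kwargs):
    # Imported lazily so this snippet loads even where DVC is not installed.
    import dvc.api
    return dvc.api.read(*args, **kwargs)


def read_local(path, repo_dir=None, read=_dvc_read):
    """Read a file from a local repository, or from the current project if repo_dir is None."""
    repo = None if repo_dir is None else os.fspath(repo_dir)
    return read(path, repo=repo)


# Example calls (hypothetical directory and path):
# text = read_local("data/sample.csv")                        # current project
# text = read_local("data/sample.csv", "/path/to/project")    # another local repository
```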
Error Handling
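A sketch of wrapping the call in a try/except and falling back to a default value. It assumes that dvc.exceptions.DvcException is the base class of DVC's own errors and additionally catches FileNotFoundError; the `expected_errors` parameter is an illustrative hook, not part of DVC.

```python
def _dvc_read(*args, **kwargs):
    # Imported lazily so this snippet loads even where DVC is not installed.
    import dvc.api
    return dvc.api.read(*args, **kwargs)


def read_or_default(path, default=None, repo=None, read=_dvc_read, expected_errors=None):
    """Return the file contents, or `default` if an expected error occurs."""
    if expected_errors is None:
        # Assumption: DvcException covers the DVC-specific failures listed under Raises.
        from dvc.exceptions import DvcException
        expected_errors = (DvcException, FileNotFoundError)
    try:
        return read(path, repo=repo)
    except expected_errors as exc:
        print(f"Could not read {path!r}: {exc}")
        return default


# Example call (hypothetical path):
# params = read_or_default("params.yaml", default="")
```

Catching a narrow set of exception types, rather than Exception, keeps unrelated bugs visible.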
Use Cases
Configuration Loading
Load parameters, configs, or metadata files for experiments.
Small Data Files
Read datasets that fit comfortably in memory.
Model Loading
Load serialized models for inference or evaluation.
Metrics Retrieval
Fetch experiment metrics for analysis and comparison.
Comparison with dvc.api.open()
read() is a convenience wrapper around open() that reads the entire file and returns its contents.

| Feature | dvc.api.read() | dvc.api.open() |
|---|---|---|
| Usage | Simple function call | Context manager (with statement) |
| Returns | Complete file contents | File object for streaming |
| Memory | Loads entire file | Streams incrementally |
| Best for | Small files | Large files |
| Code | data = dvc.api.read('file.csv') | with dvc.api.open('file.csv') as f: ... |
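The relationship in the table can be sketched as follows: reading through open() and calling .read() on the file object yields the same result as read(). The `opener` parameter is an illustrative hook, not part of DVC, and the file path in the comment is hypothetical.

```python
def _dvc_open(*args, **kwargs):
    # Imported lazily so this snippet loads even where DVC is not installed.
    import dvc.api
    return dvc.api.open(*args, **kwargs)


def read_via_open(path, opener=_dvc_open, **kwargs):
    """Open the file as a context manager, read it in full, and close it."""
    with opener(path, **kwargs) as fd:
        return fd.read()


# Example call (hypothetical path); expected to match dvc.api.read("file.csv"):
# data = read_via_open("file.csv")
```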
Performance Considerations
- Small Files (<10MB)
- Medium Files (10-100MB)
- Large Files (>100MB)
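For files in the larger brackets, streaming through dvc.api.open() avoids holding the whole file in memory. A minimal sketch that copies a tracked file to a local path in fixed-size chunks; the chunk size, the `opener` hook, and the paths in the comment are illustrative assumptions.

```python
import shutil


def _dvc_open(*args, **kwargs):
    # Imported lazily so this snippet loads even where DVC is not installed.
    import dvc.api
    return dvc.api.open(*args, **kwargs)


def download_streaming(path, dest, opener=_dvc_open, chunk_size=1024 * 1024, **kwargs):
    """Copy a tracked file to `dest` chunk by chunk instead of reading it at once."""
    with opener(path, mode="rb", **kwargs) as src, open(dest, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
    return dest


# Example call (hypothetical paths):
# download_streaming("data/large_dataset.bin", "/tmp/large_dataset.bin")
```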
Best Practices
Use for small files only
read() is ideal for configuration files, parameters, and small datasets.
Choose correct mode
Use text mode for text files and binary mode for binary data.
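One way to apply this is to pick the mode from the file extension. The sketch below treats a small, illustrative set of suffixes as text and everything else as binary; the suffix list is an assumption and should be adjusted to the project.

```python
from pathlib import PurePosixPath

# Illustrative list of extensions treated as text; extend as needed.
TEXT_SUFFIXES = {".txt", ".csv", ".json", ".yaml", ".yml", ".md"}


def _dvc_read(*args, **kwargs):
    # Imported lazily so this snippet loads even where DVC is not installed.
    import dvc.api
    return dvc.api.read(*args, **kwargs)


def mode_for(path):
    """Return "r" for known text extensions and "rb" for anything else."""
    return "r" if PurePosixPath(path).suffix.lower() in TEXT_SUFFIXES else "rb"


def read_auto(path, read=_dvc_read, **kwargs):
    """Read a file, choosing text or binary mode from its extension."""
    return read(path, mode=mode_for(path), **kwargs)
```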
Parse returned data appropriately
Remember to parse the returned string/bytes.
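As an illustration, a CSV file read in text mode arrives as one string; wrapping it in io.StringIO lets the standard csv module parse it into rows. The file path in the comment is hypothetical.

```python
import csv
import io


def _dvc_read(*args, **kwargs):
    # Imported lazily so this snippet loads even where DVC is not installed.
    import dvc.api
    return dvc.api.read(*args, **kwargs)


def read_csv_rows(path, read=_dvc_read, **kwargs):
    """Read a CSV file and return its rows as a list of dicts keyed by the header."""
    text = read(path, **kwargs)
    return list(csv.DictReader(io.StringIO(text)))


# Example call (hypothetical path):
# rows = read_csv_rows("data/samples.csv")
```

Note that csv.DictReader leaves every value as a string, so numeric columns still need converting.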
Handle exceptions properly
Always catch potential exceptions.
Related Functions
open()
Stream files with context manager
get_url()
Get remote storage URL
DVCFileSystem
Low-level file system access