Understanding Versions
Dataset versions are created automatically when you:
- Create a new dataset - Creates version 1
- Add examples - Creates a new version with added examples
- Modify examples - Creates a new version with changes (future feature)
Each version has:
- A unique version_id
- A created_at timestamp
- Optional description and metadata
- Links to all experiments run on that version
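The versioning behavior described above can be pictured with a small in-memory model. This is only an illustrative sketch: `VersionRecord` and `ToyDataset` are hypothetical names invented here, not Phoenix classes, and the real storage and ID format may differ.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional, Tuple
import uuid


@dataclass(frozen=True)
class VersionRecord:
    """Hypothetical stand-in for one dataset version (not a Phoenix class)."""
    version_id: str
    created_at: datetime
    examples: Tuple[Dict[str, Any], ...]
    description: Optional[str] = None
    metadata: Dict[str, Any] = field(default_factory=dict)
    experiment_ids: Tuple[str, ...] = ()


class ToyDataset:
    """In-memory illustration: every change appends a new version."""

    def __init__(self, examples, description=None):
        self.versions: List[VersionRecord] = []
        # Creating the dataset yields version 1.
        self._append(tuple(examples), description)

    def _append(self, examples, description):
        self.versions.append(
            VersionRecord(
                version_id=uuid.uuid4().hex,
                created_at=datetime.now(timezone.utc),
                examples=examples,
                description=description,
            )
        )

    def add_examples(self, new_examples, description=None):
        # Earlier versions are left untouched; a new version is appended.
        latest = self.versions[-1].examples
        self._append(latest + tuple(new_examples), description)


ds = ToyDataset([{"input": "2+2", "output": "4"}], description="initial upload")
ds.add_examples([{"input": "3+3", "output": "6"}], description="added one example")
```

In this model the first version still holds one example after the second version is created, which is the property that lets experiments stay tied to the exact data they ran on.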
Listing Versions
Retrieve all versions of a dataset.
Retrieving Specific Versions
Access any historical version by its version_id.
Version Reproducibility
Experiments are permanently linked to specific dataset versions.
Exporting Datasets
Export to DataFrame
Convert datasets to pandas DataFrames for analysis or storage.
Export to JSON
Export datasets in a portable JSON format.
Import from JSON
Restore datasets from JSON exports.
Export Specific Versions
Export any version for archival or sharing.
Export with Splits
Export specific dataset splits.
Working with Version Metadata
While Phoenix automatically creates versions, you can track additional context.
Version Cleanup Strategy
While Phoenix keeps all versions, you can implement your own archival strategy.
Best Practices
Pin Versions
Reference specific version_ids in production code for reproducibility.
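One way to enforce a pin is a small guard that fails loudly when the loaded data is not the expected version. The sketch below is an assumption-laden illustration: `PINNED_VERSION_ID` holds a made-up placeholder value, `check_pinned` is a hypothetical helper, and how you obtain the loaded version's ID depends on your client code.

```python
# Hypothetical placeholder; a real value would be copied from your version listing.
PINNED_VERSION_ID = "example-version-id"


def check_pinned(loaded_version_id: str, pinned: str = PINNED_VERSION_ID) -> str:
    """Return the version id if it matches the pin, otherwise raise."""
    if loaded_version_id != pinned:
        raise ValueError(
            f"expected dataset version {pinned!r}, got {loaded_version_id!r}"
        )
    return loaded_version_id
```

Keeping the pin in one named constant means a dataset upgrade shows up as a single, reviewable change.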
Document Changes
Use a descriptive dataset_description when creating versions to track what changed.
Regular Exports
Periodically export important versions to external storage for backup.
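A backup routine along these lines could write each exported version to its own JSON file, named by version ID. This is a minimal sketch using only the standard library: `write_snapshot` and `read_snapshot` are hypothetical helpers, and `payload` is a made-up stand-in for whatever your JSON export produces.

```python
import json
import tempfile
from pathlib import Path


def write_snapshot(payload: dict, version_id: str, backup_dir: Path) -> Path:
    """Write one exported version to <backup_dir>/<version_id>.json."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / f"{version_id}.json"
    tmp = backup_dir / f"{version_id}.json.tmp"
    tmp.write_text(json.dumps(payload, indent=2, sort_keys=True), encoding="utf-8")
    # Rename only after a complete write so readers never see a partial file.
    tmp.replace(target)
    return target


def read_snapshot(path: Path) -> dict:
    """Load a snapshot written by write_snapshot."""
    return json.loads(path.read_text(encoding="utf-8"))


# Usage with illustrative data; a temporary directory stands in for real storage.
payload = {"name": "demo", "examples": [{"input": "2+2", "output": "4"}]}
with tempfile.TemporaryDirectory() as tmp_dir:
    path = write_snapshot(payload, "example-version-id", Path(tmp_dir) / "backups")
    restored = read_snapshot(path)
```

Naming files by version ID keeps one snapshot per version and makes it easy to see which versions have already been backed up.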
Version Comparison
Compare experiments across versions to understand the impact of dataset changes.
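As a rough illustration of such a comparison, the sketch below summarizes per-example scores for two versions and reports the change in the mean. The version IDs and scores are invented for the example, and `summarize` is a hypothetical helper; in practice the scores would come from your experiment results.

```python
from statistics import mean

# Made-up per-example scores from the same evaluator on two dataset versions.
scores_by_version = {
    "version-a": [0.80, 0.90, 0.70],
    "version-b": [0.85, 0.90, 0.75, 0.60],
}


def summarize(scores):
    """Return {version_id: (example_count, mean_score)}."""
    return {vid: (len(vals), mean(vals)) for vid, vals in scores.items()}


summary = summarize(scores_by_version)
# Negative delta: the mean score is lower on version-b than on version-a.
delta = summary["version-b"][1] - summary["version-a"][1]
```

Reporting the example count next to the mean helps separate a real quality change from a shift caused by the dataset itself growing or changing.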
Next Steps
Creating Datasets
Learn how to create and populate datasets
Running Experiments
Run experiments on your versioned datasets