Overview
SyftDatasetManager is the primary interface for creating, retrieving, and managing datasets in SyftBox. It handles dataset storage, permissions, and synchronization across datasites.
Constructor
Path to the SyftBox folder on the local filesystem
Email address associated with the datasite
Class Methods
from_config
Create a SyftDatasetManager from an existing SyftBoxConfig.
SyftBox configuration object
Configured dataset manager instance
Methods
create
Create a new dataset with mock and private data.
Unique identifier for the dataset. Only alphanumeric characters, underscores, and hyphens are allowed.
Path to the mock data (file or directory) that will be shared publicly
Path to the private data (file or directory) that remains local
Short summary describing the dataset
Path to a markdown README file to include in the dataset
Location identifier for datasets hosted on remote locations requiring manual syncing (e.g., ‘high-side-1234’)
Tags for categorizing and discovering the dataset
Users to share the dataset with. Can be:
- List of email addresses
- "any" to share with all users
- None (default) to share with no one
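The three accepted forms of the sharing argument can be illustrated with a small normalization helper. This is only a sketch of the idea, not the library's implementation: `normalize_users` and `SHARE_WITH_ANY` are hypothetical names introduced for illustration.

```python
from typing import List, Optional, Union

# Hypothetical name for the "share with all users" marker value.
SHARE_WITH_ANY = "any"


def normalize_users(users: Optional[Union[List[str], str]]) -> List[str]:
    """Map the three accepted forms onto a flat list of share targets."""
    if users is None:
        # Default: share with no one.
        return []
    if isinstance(users, str):
        # The only bare string accepted in this sketch is the "any" marker.
        if users != SHARE_WITH_ANY:
            raise ValueError(
                f'expected a list of emails or "{SHARE_WITH_ANY}", got {users!r}'
            )
        return [SHARE_WITH_ANY]
    # A list of email addresses is copied so the caller's list is not mutated.
    return list(users)
```

Rejecting a bare email string is a design choice made for this sketch so that a forgotten pair of list brackets fails loudly; the real method may behave differently.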
The created Dataset object with metadata and file URLs
ValueError: If dataset name contains invalid characters
FileNotFoundError: If mock_path or readme_path doesn’t exist
FileExistsError: If dataset directory already exists and is not empty
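The naming rule for create (alphanumeric characters, underscores, and hyphens only) can be checked up front before calling it. The following is a minimal sketch under the assumption that "alphanumeric" means ASCII letters and digits; `validate_dataset_name` is a hypothetical helper, not part of the library.

```python
import re

# Assumed pattern: one or more ASCII letters, digits, underscores, or hyphens.
_VALID_NAME = re.compile(r"[A-Za-z0-9_-]+")


def validate_dataset_name(name: str) -> str:
    """Return the name unchanged if it is valid, otherwise raise ValueError."""
    if not _VALID_NAME.fullmatch(name):
        raise ValueError(
            f"invalid dataset name {name!r}: only alphanumeric characters, "
            "underscores, and hyphens are allowed"
        )
    return name
```

Using `fullmatch` rather than `match` ensures the whole string is checked, so a name such as `"data set"` is rejected even though it starts with valid characters.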
get
Retrieve a dataset by name.
Name of the dataset to retrieve
Email of the datasite owner. Defaults to the current user’s email
The requested Dataset object
FileNotFoundError: If dataset doesn’t exist
get_all
Retrieve all accessible datasets with optional filtering and pagination.
Filter datasets by datasite owner email
Maximum number of datasets to return
Number of datasets to skip (for pagination)
Field name to sort by (e.g., "created_at", "name")
Sort order: ascending or descending
List of Dataset objects (as a TableList for nice display)
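How a limit, an offset, a sort field, and a sort order combine can be sketched on plain dictionaries. This is an illustration of the general pattern only: the function `paginate` and its parameter names (`limit`, `offset`, `order_by`, `sort_order`) are assumptions, and the real method's argument names and accepted values may differ.

```python
from typing import Any, Dict, List, Optional


def paginate(
    records: List[Dict[str, Any]],
    limit: Optional[int] = None,
    offset: int = 0,
    order_by: Optional[str] = None,
    sort_order: str = "asc",
) -> List[Dict[str, Any]]:
    """Sort first, then skip `offset` records, then keep at most `limit`."""
    if sort_order not in ("asc", "desc"):
        raise ValueError('sort_order must be "asc" or "desc"')
    items = list(records)  # copy so the caller's list keeps its order
    if order_by is not None:
        items.sort(key=lambda r: r[order_by], reverse=(sort_order == "desc"))
    end = None if limit is None else offset + limit
    return items[offset:end]


# Hypothetical records standing in for dataset metadata.
records = [
    {"name": "c", "created_at": 3},
    {"name": "a", "created_at": 1},
    {"name": "b", "created_at": 2},
]
second_page = paginate(records, limit=2, offset=1, order_by="name")
```

Sorting before slicing matters: applying the offset first would make each page depend on storage order rather than on the requested sort field.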
delete
Delete a dataset from the datasite.
Name of the dataset to delete
Email of the datasite owner. Defaults to current user. Must be your own datasite.
Whether to prompt for confirmation before deleting
ValueError: If attempting to delete another user’s dataset
FileNotFoundError: If dataset doesn’t exist
share_dataset
Share an existing dataset with users.
Name of the dataset to share
List of email addresses or "any" to share with all users
ValueError: If dataset doesn’t exist
Special Methods
Indexing
Access datasets by name or index.
Iteration
Iterate over all datasets.
Length
Get the total number of datasets.
Properties
The SyftBox configuration used by this manager
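For the special methods described earlier (indexing by name or position, iteration, and length), the underlying Python protocol can be sketched with a stand-in container. `DatasetCollection` is a hypothetical class that holds plain dictionaries; it is not the manager itself and makes no claim about how the manager stores datasets or which exceptions it raises.

```python
from typing import Dict, Iterator, List, Union


class DatasetCollection:
    """Hypothetical stand-in showing name/index access, iteration, and len()."""

    def __init__(self, datasets: List[Dict[str, str]]) -> None:
        self._datasets = list(datasets)

    def __getitem__(self, key: Union[int, str]) -> Dict[str, str]:
        if isinstance(key, int):
            # Positional access; raises IndexError when out of range.
            return self._datasets[key]
        for dataset in self._datasets:
            if dataset["name"] == key:
                return dataset
        raise KeyError(key)

    def __iter__(self) -> Iterator[Dict[str, str]]:
        return iter(self._datasets)

    def __len__(self) -> int:
        return len(self._datasets)


collection = DatasetCollection([{"name": "census"}, {"name": "trials"}])
first = collection[0]            # access by index
by_name = collection["trials"]   # access by name
names = [d["name"] for d in collection]
```

Dispatching on the key's type inside `__getitem__` is what lets one bracket syntax serve both lookups.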
Usage Example
Constants
Default folder name for storing datasets
Filename for dataset metadata
Constant for sharing datasets with all users