Overview
A dataset is a collection of labeled or unlabeled data that can be used for training, validation, or testing. Each dataset has:
- A unique identifier (uid)
- A human-readable name and URL-friendly slug
- A data type (e.g., image, video, point cloud)
- Optional sequence support for time-series or multi-frame data
- Visibility controls (public/private)
Creating a dataset
Create a new dataset by supplying the required metadata.
Options
Human-readable name for the dataset
URL-friendly identifier for the dataset
Type of data (e.g., ‘image’, ‘video’, ‘pointcloud’)
Whether this dataset contains sequences
Dataset visibility: ‘public’ or ‘private’
Automatically create metadata storage
Storage provider configuration
Owner username or organization slug
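The options above could be modelled and checked before a create request is sent. The following is a minimal sketch, not the library's actual API: the field names (name, slug, dataType, isSequence, visibility, owner), the slug rule, and the validateCreateInput helper are all assumptions made for illustration. Only the meaning of each field comes from the list above.

```typescript
// Hypothetical payload shape: every field name below is an assumption made
// for illustration; only the field meanings come from the options list.
type Visibility = "public" | "private";

interface CreateDatasetInput {
  name: string;          // human-readable name
  slug: string;          // URL-friendly identifier
  dataType: string;      // e.g. "image", "video", "pointcloud"
  isSequence?: boolean;  // whether the dataset contains sequences
  visibility?: Visibility;
  owner?: string;        // owner username or organization slug
}

// Assumed slug rule: lowercase letters and digits separated by single hyphens.
const SLUG_PATTERN = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

// Collects human-readable problems instead of throwing, so a caller can
// report all of them at once before sending a create request.
function validateCreateInput(input: CreateDatasetInput): string[] {
  const found: string[] = [];
  if (input.name.trim().length === 0) {
    found.push("name must not be empty");
  }
  if (!SLUG_PATTERN.test(input.slug)) {
    found.push("slug must be lowercase letters, digits, and hyphens");
  }
  if (input.dataType.trim().length === 0) {
    found.push("dataType must not be empty");
  }
  // Guards against untyped input (e.g. parsed JSON) that bypasses the type.
  if (
    input.visibility !== undefined &&
    input.visibility !== "public" &&
    input.visibility !== "private"
  ) {
    found.push("visibility must be 'public' or 'private'");
  }
  return found;
}

const draft: CreateDatasetInput = {
  name: "Street Scenes",
  slug: "street-scenes",
  dataType: "image",
  visibility: "private",
};
const draftProblems = validateCreateInput(draft); // empty when the draft is valid
```

Returning a list of problems rather than throwing on the first one lets a form or CLI show every issue in a single pass.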
Listing datasets
Retrieve datasets with optional filters. Results are paginated using cursor-based pagination; use the nextCursor field to fetch subsequent pages.
Filter options
- dataType - Filter by data type
- name - Filter by dataset name
- status - Filter by status
- visibility - Filter by visibility level
- limit - Number of results per page (default: 20)
- cursor - Pagination cursor
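Cursor-based pagination with these filters can be sketched as follows. This is an illustration under stated assumptions rather than the actual client: the filter names and nextCursor come from the text above, but the Page shape (an items array), the PageFetcher callback, and the toQueryString and listAll helpers are hypothetical. The fetcher is passed in so the loop stays independent of any particular HTTP client.

```typescript
// Filter names mirror the list above; their types are assumptions.
interface ListFilters {
  dataType?: string;
  name?: string;
  status?: string;
  visibility?: string;
  limit?: number;
  cursor?: string;
}

// Assumed page shape: a batch of results plus an optional nextCursor.
interface Page<T> {
  items: T[];
  nextCursor?: string | null;
}

// Hypothetical callback that performs one request and returns one page.
type PageFetcher<T> = (filters: ListFilters) => Promise<Page<T>>;

// Serializes only the filters that are set, in a stable key order.
function toQueryString(filters: ListFilters): string {
  const keys: (keyof ListFilters)[] = [
    "dataType", "name", "status", "visibility", "limit", "cursor",
  ];
  const parts: string[] = [];
  for (const key of keys) {
    const value = filters[key];
    if (value !== undefined) {
      parts.push(`${key}=${encodeURIComponent(String(value))}`);
    }
  }
  return parts.join("&");
}

// Follows nextCursor until a page arrives without one, collecting all items.
async function listAll<T>(
  fetchPage: PageFetcher<T>,
  filters: ListFilters = {},
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | undefined = filters.cursor;
  do {
    const page: Page<T> = await fetchPage({ ...filters, cursor });
    all.push(...page.items);
    // A missing, null, or empty nextCursor is treated as the last page.
    cursor = page.nextCursor || undefined;
  } while (cursor !== undefined);
  return all;
}
```

For large datasets it may be preferable to process each page as it arrives instead of accumulating everything in memory.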
Getting a dataset
Retrieve a single dataset by its UID.
Working with dataset items
Dataset items represent individual data points within a dataset.
Item properties
Each DatasetItem includes:
- uid - Unique identifier
- key - Item key within the dataset
- url - Primary data URL
- thumbnails - Preview image URLs
- metadata - Custom metadata object
- annotations - Annotation data
- exportSnippet - Export format data
- createdAt / updatedAt - Timestamps
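The item properties above can be sketched as a TypeScript interface, together with a small helper that filters items by their custom metadata. The property names come from the list; the concrete types (for example, thumbnails as an array of strings and timestamps as ISO 8601 strings) and the filterByMetadata helper are assumptions, not the published definitions.

```typescript
// Property names follow the list above; the types are assumptions, since
// the source does not state them.
interface DatasetItem {
  uid: string;
  key: string;
  url: string;
  thumbnails: string[];
  metadata: Record<string, unknown>;
  annotations: unknown;
  exportSnippet: unknown;
  createdAt: string; // assumed ISO 8601 timestamp
  updatedAt: string; // assumed ISO 8601 timestamp
}

// Keeps the items whose metadata strictly equals every key/value pair in
// `wanted`. Only primitive values compare meaningfully with ===.
function filterByMetadata(
  items: DatasetItem[],
  wanted: Record<string, unknown>,
): DatasetItem[] {
  const keys = Object.keys(wanted);
  return items.filter((item) =>
    keys.every((k) => item.metadata[k] === wanted[k]),
  );
}
```

For example, filterByMetadata(items, { split: "train" }) would keep only the items whose metadata records a split of "train", assuming that key was stored with each item.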
Working with sequences
Sequences are collections of ordered frames, useful for video or time-series data.
- List sequences
- Get sequence details
Sequence properties
Each DatasetSequence includes:
- uid - Unique identifier
- key - Sequence key
- numberOfFrames - Frame count
- views - Camera or sensor views
- frames - Frame data array
- metrics - Computed metrics
- cropData - Cropping information
- predefinedLabels - Label configuration
- allowLidarCalibration - LiDAR calibration support
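A sequence can be sketched in the same way. The property names below come from the list above, while the types (such as allowLidarCalibration being a boolean) and the summarizeSequences helper are assumptions for illustration only.

```typescript
// Property names follow the list above; the types are assumptions.
interface DatasetSequence {
  uid: string;
  key: string;
  numberOfFrames: number;
  views: unknown[];
  frames: unknown[];
  metrics: unknown;
  cropData: unknown;
  predefinedLabels: unknown;
  allowLidarCalibration: boolean;
}

interface SequenceSummary {
  sequences: number;          // how many sequences were inspected
  totalFrames: number;        // sum of numberOfFrames across them
  lidarCalibratable: number;  // how many allow LiDAR calibration
}

// Aggregates a few counts from a list of sequences, relying on the declared
// numberOfFrames rather than the length of the frames array.
function summarizeSequences(sequences: DatasetSequence[]): SequenceSummary {
  let totalFrames = 0;
  let lidarCalibratable = 0;
  for (const seq of sequences) {
    totalFrames += seq.numberOfFrames;
    if (seq.allowLidarCalibration) {
      lidarCalibratable += 1;
    }
  }
  return { sequences: sequences.length, totalFrames, lidarCalibratable };
}
```

Using numberOfFrames keeps the summary usable even if a listing response omits the full frames array, which is a possibility this sketch allows for but the source does not confirm.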
Response types
All dataset operations use TypeScript interfaces such as DatasetItem and DatasetSequence.
Best practices
- Use descriptive slugs: Choose URL-friendly slugs that clearly identify your dataset’s purpose.
- Set visibility appropriately: Use ‘private’ for sensitive data and ‘public’ for shared datasets.
- Leverage metadata: Store custom metadata with items for filtering and organization.
- Paginate results: Always handle pagination when listing items or sequences.