The Datasets resource provides methods to create, retrieve, and list datasets, and to list and retrieve the items and sequences they contain.

Methods

list

List all datasets with optional filtering.
client.datasets.list(
    data_type=None,
    name=None,
    status=None,
    visibility=None,
    limit=None,
    cursor=None
) -> CursorPage[Dataset]
data_type
str | None
Filter datasets by data type
name
str | None
Filter datasets by name
status
str | None
Filter datasets by status
visibility
str | None
Filter datasets by visibility (e.g., “private”, “public”)
limit
int | None
Maximum number of datasets to return per page
cursor
str | None
Pagination cursor for fetching the next page of results
CursorPage[Dataset]
A paginated list of datasets. Use .items to access the dataset list, .next_cursor for pagination, and .has_more to check if more results exist.

Example

from avala import Avala

client = Avala(api_key="your-api-key")

# List all datasets
page = client.datasets.list()
for dataset in page.items:
    print(dataset.name, dataset.uid)

# Filter by data type and visibility
page = client.datasets.list(
    data_type="image",
    visibility="private",
    limit=10
)
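
The list call above returns a single page. To walk every page, the cursor, .next_cursor, and .has_more fields described above can be combined into a loop. The following is a minimal sketch, not part of the SDK: iter_all is a hypothetical helper name, and it assumes only that the list method accepts a cursor keyword and returns an object with .items, .next_cursor, and .has_more.

```python
from typing import Any, Callable, Iterator


def iter_all(list_method: Callable[..., Any], **params: Any) -> Iterator[Any]:
    """Yield every item from a cursor-paginated list method, page by page.

    Hypothetical helper: `list_method` is any callable that takes `cursor=`
    and returns a page exposing `.items`, `.next_cursor`, and `.has_more`.
    """
    cursor = None
    while True:
        page = list_method(cursor=cursor, **params)
        yield from page.items
        # Stop when the page reports no further results or offers no cursor.
        if not page.has_more or page.next_cursor is None:
            break
        cursor = page.next_cursor
```

With a real client this would be called as iter_all(client.datasets.list, data_type="image", limit=100). The same helper should work for list_items and list_sequences by passing owner and slug as keyword arguments, since they share the cursor and limit parameters.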

get

Retrieve a specific dataset by its UID.
client.datasets.get(uid: str) -> Dataset
uid
str
required
The unique identifier of the dataset
Dataset
The dataset object containing uid, name, slug, item_count, data_type, and timestamps.

Example

dataset = client.datasets.get("dataset-uid-123")
print(f"Dataset: {dataset.name}")
print(f"Items: {dataset.item_count}")

create

Create a new dataset.
client.datasets.create(
    name: str,
    slug: str,
    data_type: str,
    is_sequence=False,
    visibility="private",
    create_metadata=True,
    provider_config=None,
    owner_name=None
) -> Dataset
name
str
required
The human-readable name of the dataset
slug
str
required
The URL-friendly identifier for the dataset
data_type
str
required
The type of data stored in the dataset (e.g., “image”, “video”, “point_cloud”)
is_sequence
bool
default:"False"
Whether the dataset contains sequential data
visibility
str
default:"private"
The visibility level of the dataset (“private” or “public”)
create_metadata
bool
default:"True"
Whether to automatically create metadata for dataset items
provider_config
dict[str, Any] | None
Configuration for the storage provider
owner_name
str | None
The name of the dataset owner (defaults to the authenticated user)
Dataset
The newly created dataset object.

Example

dataset = client.datasets.create(
    name="My Image Dataset",
    slug="my-image-dataset",
    data_type="image",
    visibility="private"
)
print(f"Created dataset: {dataset.uid}")

list_items

List all items in a specific dataset.
client.datasets.list_items(
    owner: str,
    slug: str,
    limit=None,
    cursor=None
) -> CursorPage[DatasetItem]
owner
str
required
The owner of the dataset
slug
str
required
The slug of the dataset
limit
int | None
Maximum number of items to return per page
cursor
str | None
Pagination cursor for fetching the next page of results
CursorPage[DatasetItem]
A paginated list of dataset items.

Example

page = client.datasets.list_items(
    owner="username",
    slug="my-dataset",
    limit=50
)

for item in page.items:
    print(item.uid, item.url)

get_item

Retrieve a specific item from a dataset.
client.datasets.get_item(
    owner: str,
    slug: str,
    item_uid: str
) -> DatasetItem
owner
str
required
The owner of the dataset
slug
str
required
The slug of the dataset
item_uid
str
required
The unique identifier of the item
DatasetItem
The dataset item object containing uid, url, metadata, annotations, and related fields.

Example

item = client.datasets.get_item(
    owner="username",
    slug="my-dataset",
    item_uid="item-uid-123"
)
print(item.metadata)
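
Several DatasetItem fields (url, metadata, annotations, thumbnails) are typed as optional, so item.metadata may be None. The sketch below shows one way to read an item defensively. summarize_item is a hypothetical helper, not an SDK function, and it relies only on the attribute names listed under Response Types.

```python
from typing import Any


def summarize_item(item: Any) -> dict[str, Any]:
    """Return a small, None-safe summary of a DatasetItem-like object."""
    # Optional fields fall back to empty containers so later code can
    # iterate or call len() without checking for None each time.
    metadata = item.metadata or {}
    annotations = item.annotations or {}
    thumbnails = item.thumbnails or []
    return {
        "uid": item.uid,
        "has_url": item.url is not None,
        "metadata_keys": sorted(metadata),
        "has_annotations": bool(annotations),
        "thumbnail_count": len(thumbnails),
    }
```

For example, summarize_item(client.datasets.get_item(owner="username", slug="my-dataset", item_uid="item-uid-123")) would return a plain dictionary that is safe to log or serialize.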

list_sequences

List all sequences in a dataset.
client.datasets.list_sequences(
    owner: str,
    slug: str,
    limit=None,
    cursor=None
) -> CursorPage[DatasetSequence]
owner
str
required
The owner of the dataset
slug
str
required
The slug of the dataset
limit
int | None
Maximum number of sequences to return per page
cursor
str | None
Pagination cursor for fetching the next page of results
CursorPage[DatasetSequence]
A paginated list of dataset sequences.
Sequences are only available for datasets created with is_sequence=True.

Example

page = client.datasets.list_sequences(
    owner="username",
    slug="my-video-dataset"
)

for sequence in page.items:
    print(sequence.uid, sequence.number_of_frames)
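
Because sequences exist only on datasets created with is_sequence=True, and number_of_frames is typed int | None, a workflow usually needs both steps handled explicitly. The following is a rough sketch using only the create and list_sequences calls documented on this page. The helper names are hypothetical, and "video" as the data type for a sequence dataset is an assumption.

```python
from typing import Any


def create_sequence_dataset(client: Any, name: str, slug: str) -> Any:
    """Create a dataset with is_sequence=True so sequences are available.

    Assumption: "video" is a suitable data_type for sequential data.
    """
    return client.datasets.create(
        name=name,
        slug=slug,
        data_type="video",
        is_sequence=True,
    )


def total_frames(client: Any, owner: str, slug: str) -> int:
    """Sum number_of_frames over the first page of sequences (None counts as 0)."""
    page = client.datasets.list_sequences(owner=owner, slug=slug)
    return sum(sequence.number_of_frames or 0 for sequence in page.items)
```

Note that total_frames reads only the first page. For datasets with more sequences than one page holds, the call would need to be repeated with cursor set to the page's next_cursor.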

get_sequence

Retrieve a specific sequence from a dataset.
client.datasets.get_sequence(
    owner: str,
    slug: str,
    sequence_uid: str
) -> DatasetSequence
owner
str
required
The owner of the dataset
slug
str
required
The slug of the dataset
sequence_uid
str
required
The unique identifier of the sequence
DatasetSequence
The dataset sequence object containing uid, frames, views, crop_data, and related fields.

Example

sequence = client.datasets.get_sequence(
    owner="username",
    slug="my-video-dataset",
    sequence_uid="sequence-uid-123"
)
print(f"Frames: {sequence.number_of_frames}")

Async Usage

All methods are also available as coroutines on the AsyncAvala client, via client.datasets:
import asyncio
from avala import AsyncAvala

async def main():
    client = AsyncAvala(api_key="your-api-key")
    
    # List datasets
    page = await client.datasets.list()
    
    # Create dataset
    dataset = await client.datasets.create(
        name="Async Dataset",
        slug="async-dataset",
        data_type="image"
    )
    
    # Get dataset
    dataset = await client.datasets.get("dataset-uid-123")

asyncio.run(main())
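
The example above awaits each call in turn. One reason to use the async client is to issue independent requests concurrently. The sketch below does this with asyncio.gather. fetch_datasets is a hypothetical helper, and it assumes only that client.datasets.get is awaitable on the async client, as shown above.

```python
import asyncio
from typing import Any, Sequence


async def fetch_datasets(client: Any, uids: Sequence[str]) -> list[Any]:
    """Fetch several datasets concurrently; results keep the order of `uids`."""
    # gather schedules all get() coroutines at once and returns their
    # results as a list in the same order the coroutines were passed in.
    return await asyncio.gather(*(client.datasets.get(uid) for uid in uids))
```

Inside main() this could be used as datasets = await fetch_datasets(client, ["dataset-uid-123", "dataset-uid-456"]), where the second UID is a placeholder. If any single request raises, gather propagates that exception by default.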

Response Types

Dataset

uid
str
Unique identifier for the dataset
name
str
Human-readable name of the dataset
slug
str
URL-friendly identifier
item_count
int
default:"0"
Total number of items in the dataset
data_type
str | None
Type of data in the dataset
created_at
datetime | None
Timestamp when the dataset was created
updated_at
datetime | None
Timestamp when the dataset was last updated

DatasetItem

uid
str
Unique identifier for the item
url
str | None
URL to access the item data
metadata
dict[str, Any] | None
Custom metadata associated with the item
annotations
dict[str, Any] | None
Annotation data for the item
thumbnails
list[str] | None
List of thumbnail URLs

DatasetSequence

uid
str
Unique identifier for the sequence
number_of_frames
int | None
Total number of frames in the sequence
frames
list[dict[str, Any]] | None
List of frame data
views
list[dict[str, Any]] | None
Different views or perspectives of the sequence

CursorPage

items
list[T]
List of items in the current page
next_cursor
str | None
Cursor for fetching the next page (None if no more pages)
previous_cursor
str | None
Cursor for fetching the previous page
has_more
bool
Property indicating if more results are available
