Dataset API
The dataset module provides lazy, chainable iterables for loading and transforming evaluation data.
Import
dataset(source)
Create a dataset from various sources.
Signature
Parameters
source: T[]
Inline array:
source: string
File path (.json, .jsonl, .csv):
source: AsyncIterable<T>
Custom async iterable:
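As a rough illustration of how a single entry point can accept more than one kind of source, the sketch below normalizes an inline array or a custom async iterable into one async stream. The names `toAsyncIterable`, `countTo`, and `collect` are hypothetical and are not part of the dataset module. File-path loading is left out.

```typescript
// Hypothetical helper, not the library's implementation: turn an inline
// array or an async iterable into a single async generator.
async function* toAsyncIterable<T>(source: T[] | AsyncIterable<T>): AsyncGenerator<T> {
  if (Array.isArray(source)) {
    for (const item of source) yield item;
  } else {
    for await (const item of source) yield item;
  }
}

// A custom async iterable: an async generator that counts from 1 to n.
async function* countTo(n: number): AsyncGenerator<number> {
  for (let i = 1; i <= n; i++) yield i;
}

// Drain any async iterable into a plain array.
async function collect<T>(items: AsyncIterable<T>): Promise<T[]> {
  const out: T[] = [];
  for await (const item of items) out.push(item);
  return out;
}
```

Under these assumptions, `collect(toAsyncIterable([1, 2, 3]))` and `collect(toAsyncIterable(countTo(3)))` resolve to the same array.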
Dataset<T>
The Dataset class implements AsyncIterable<T> and provides chainable transform methods.
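To make "lazy and chainable" concrete, here is a minimal sketch of a class that implements `AsyncIterable<T>` and returns a new instance from each transform. `MiniDataset` is a hypothetical stand-in written for this example. It is not the library's `Dataset` class, and the real method signatures may differ.

```typescript
// Hypothetical stand-in for illustration only; not the library's Dataset class.
class MiniDataset<T> implements AsyncIterable<T> {
  // A factory rather than a live iterator, so the dataset can be iterated more than once.
  private readonly open: () => AsyncGenerator<T>;

  constructor(open: () => AsyncGenerator<T>) {
    this.open = open;
  }

  static from<T>(items: T[]): MiniDataset<T> {
    return new MiniDataset<T>(async function* () {
      for (const item of items) yield item;
    });
  }

  [Symbol.asyncIterator](): AsyncGenerator<T> {
    return this.open();
  }

  // map and filter only wrap the parent; nothing runs until iteration starts.
  map<U>(fn: (item: T) => U): MiniDataset<U> {
    const parent = this;
    return new MiniDataset<U>(async function* () {
      for await (const item of parent) yield fn(item);
    });
  }

  filter(fn: (item: T) => boolean): MiniDataset<T> {
    const parent = this;
    return new MiniDataset<T>(async function* () {
      for await (const item of parent) {
        if (fn(item)) yield item;
      }
    });
  }

  async toArray(): Promise<T[]> {
    const out: T[] = [];
    for await (const item of this) out.push(item);
    return out;
  }
}

// Record which items reach the map step, to make the laziness visible.
const calls: number[] = [];
const pipeline = MiniDataset.from([1, 2, 3, 4, 5, 6])
  .filter((n) => n % 2 === 0)
  .map((n) => {
    calls.push(n);
    return n * 10;
  });
// `calls` is still empty here: building the chain did no work.
```

Awaiting `pipeline.toArray()` then resolves to `[20, 40, 60]`, and only the even items ever reach the `map` callback.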
map<U>(fn)
Transform each item:
filter(fn)
Exclude items that don’t match a predicate:
limit(n)
Cap the dataset at n items:
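A minimal sketch of the idea, written as a standalone async generator rather than the library's method. The names `limitTo`, `naturals`, and `firstThree` are hypothetical. The point it illustrates is that a lazy limit can stop pulling from the source once `n` items have been yielded, so even an unbounded source is safe to cap.

```typescript
// Hypothetical helper: yield at most `n` items, then stop pulling from the source.
async function* limitTo<T>(items: AsyncIterable<T>, n: number): AsyncGenerator<T> {
  if (n <= 0) return;
  let seen = 0;
  for await (const item of items) {
    yield item;
    seen += 1;
    if (seen >= n) return; // stop before requesting another item
  }
}

// An unbounded source that counts how many items it has produced.
let produced = 0;
async function* naturals(): AsyncGenerator<number> {
  for (let i = 0; ; i++) {
    produced += 1;
    yield i;
  }
}

// Resolves to [0, 1, 2]; the source is asked for exactly three items.
async function firstThree(): Promise<number[]> {
  const out: number[] = [];
  for await (const n of limitTo(naturals(), 3)) out.push(n);
  return out;
}
```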
shuffle()
Randomize the order of items:
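How the library shuffles is not stated here. One common approach, sketched below under that caveat, is to buffer every item and apply a Fisher-Yates shuffle. The helper name `shuffled` and its optional `random` parameter are hypothetical. A uniform shuffle cannot emit its first item until it has seen the whole source, so this step is not streaming.

```typescript
// Hypothetical helper: buffer the source, then shuffle in place (Fisher-Yates).
async function shuffled<T>(
  items: AsyncIterable<T>,
  random: () => number = Math.random, // injectable for repeatable runs
): Promise<T[]> {
  const buffer: T[] = [];
  for await (const item of items) buffer.push(item);

  // Walk from the end, swapping each slot with a random slot at or before it.
  for (let i = buffer.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = buffer[i];
    buffer[i] = buffer[j];
    buffer[j] = tmp;
  }
  return buffer;
}
```

Passing a seeded generator as `random` would make the order reproducible across runs, which can matter when comparing evaluation results.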
sample(n)
Pick n random items:
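The sampling strategy is not described in this section. As one possible illustration, the sketch below uses reservoir sampling (Algorithm R), which picks `n` items uniformly at random in a single pass without knowing the source length in advance. The helper name `sampleItems` and its `random` parameter are hypothetical, and the library may use a different technique.

```typescript
// Hypothetical helper: reservoir sampling (Algorithm R) over an async source.
async function sampleItems<T>(
  items: AsyncIterable<T>,
  n: number,
  random: () => number = Math.random,
): Promise<T[]> {
  const reservoir: T[] = [];
  if (n <= 0) return reservoir;

  let seen = 0;
  for await (const item of items) {
    seen += 1;
    if (reservoir.length < n) {
      // Fill the reservoir with the first n items.
      reservoir.push(item);
    } else {
      // Keep the new item with probability n / seen, evicting a random slot.
      const j = Math.floor(random() * seen);
      if (j < n) reservoir[j] = item;
    }
  }
  return reservoir;
}
```

If the source holds fewer than `n` items, this sketch simply returns all of them.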
pick(indexes)
Select specific items by index:
toArray()
Consume the dataset into a plain array:
Hugging Face Datasets
hf(options)
Load datasets from Hugging Face:
Record Selection
parseRecordSelection(spec)
Parse a record selection string:
0-10: Range from 0 to 10 (inclusive)
5: Single index
0-10,15,20-25: Multiple ranges and indexes
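The grammar above can be sketched as a small parser. This is an illustration only: the function name `parseSelection`, the sorted `number[]` return type, and the error behavior are assumptions, and the library's `parseRecordSelection` may return a different structure.

```typescript
// Hypothetical parser for specs such as "0-10,15,20-25".
// Returns the selected indexes, deduplicated and sorted ascending.
function parseSelection(spec: string): number[] {
  const indexes = new Set<number>();
  for (const part of spec.split(",")) {
    const token = part.trim();
    // Either a single index ("5") or an inclusive range ("0-10").
    const match = /^(\d+)(?:-(\d+))?$/.exec(token);
    if (match === null) throw new Error(`Invalid selection: "${token}"`);

    const start = Number(match[1]);
    const end = match[2] === undefined ? start : Number(match[2]);
    if (end < start) throw new Error(`Descending range: "${token}"`);

    for (let i = start; i <= end; i++) indexes.add(i);
  }
  return Array.from(indexes).sort((a, b) => a - b);
}
```

For example, `parseSelection("0-2,5,7-8")` returns `[0, 1, 2, 5, 7, 8]`.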
filterRecordsByIndex(iterable, indexes)
Filter an iterable by index set:
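A minimal sketch of streaming selection by position, assuming the index set is a `Set<number>` of zero-based positions. The helper name `filterByIndex` is hypothetical. The sketch stops reading once the highest requested index has been passed, so the rest of the source is never consumed.

```typescript
// Hypothetical helper: yield only the items whose position is in `indexes`.
async function* filterByIndex<T>(
  items: AsyncIterable<T>,
  indexes: Set<number>,
): AsyncGenerator<T> {
  if (indexes.size === 0) return;
  const last = Math.max(...Array.from(indexes));

  let position = 0;
  for await (const item of items) {
    if (indexes.has(position)) yield item;
    if (position >= last) return; // nothing further can match
    position += 1;
  }
}

// Example source and a helper that drains the filtered stream.
async function* letters(): AsyncGenerator<string> {
  yield* ["a", "b", "c", "d", "e", "f"];
}

async function selectedLetters(): Promise<string[]> {
  const out: string[] = [];
  for await (const s of filterByIndex(letters(), new Set([1, 3, 4]))) out.push(s);
  return out; // resolves to ["b", "d", "e"]
}
```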
pickFromArray(array, indexes)
Pick specific items from an array:
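For an in-memory array, selection can use direct index access. The sketch below is an assumption about the behavior, not the library's code: the name `pickItems` is hypothetical, results follow the order of `indexes`, and out-of-range indexes are skipped rather than raising an error.

```typescript
// Hypothetical helper: return array[i] for each valid index, in the order given.
function pickItems<T>(array: T[], indexes: number[]): T[] {
  const out: T[] = [];
  for (const i of indexes) {
    // Skip anything that is not a valid position in the array.
    if (Number.isInteger(i) && i >= 0 && i < array.length) out.push(array[i]);
  }
  return out;
}
```

For example, `pickItems(["a", "b", "c", "d"], [3, 0, 9])` returns `["d", "a"]`, because index 9 is out of range.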
Types
TransformFn<T, U>
PredicateFn<T>
Examples
Loading from JSON
Chaining Transforms
Custom Async Iterable
Record Selection
Next Steps
Scorers
Learn about scoring functions
Quickstart
Run your first evaluation