
Overview

The built-in datasets are deterministic synthetic data inspired by common ML benchmarks. Every loader returns the same values on every call.

Dataset Structure

All loader functions return a Dataset object:
type Dataset = {
  data: Tensor;          // Feature matrix
  target: Tensor;        // Target values
  featureNames: string[];  // Feature names
  targetNames?: string[];  // Class/target names
  description: string;   // Dataset description
  images?: Tensor;       // Optional image data (for vision datasets)
}
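The fields above imply a few shape relationships that can be checked before training. The following is a minimal sketch, not part of the library: `HasShape`, `DatasetLike`, and `checkDatasetShapes` are hypothetical names, only the `shape` property of `Tensor` is assumed, and the rule that `featureNames` has one entry per column of `data` is an assumption.

```typescript
// Hypothetical stand-in for Tensor: only the `shape` property is assumed.
interface HasShape {
  shape: readonly number[];
}

// Structural subset of the Dataset type, so the check needs no library import.
interface DatasetLike {
  data: HasShape;
  target: HasShape;
  featureNames: string[];
  targetNames?: string[];
  description: string;
  images?: HasShape;
}

// Returns a list of problems; an empty list means the shapes agree.
function checkDatasetShapes(ds: DatasetLike): string[] {
  const problems: string[] = [];
  if (ds.data.shape.length !== 2) {
    problems.push(`data should be 2-D, got rank ${ds.data.shape.length}`);
    return problems;
  }
  const nSamples = ds.data.shape[0];
  const nFeatures = ds.data.shape[1];
  if (ds.target.shape[0] !== nSamples) {
    problems.push(`target has ${ds.target.shape[0]} rows, data has ${nSamples}`);
  }
  // Assumption: one feature name per column of data.
  if (ds.featureNames.length !== nFeatures) {
    problems.push(`featureNames has ${ds.featureNames.length} entries, data has ${nFeatures} columns`);
  }
  if (ds.images !== undefined && ds.images.shape[0] !== nSamples) {
    problems.push(`images has ${ds.images.shape[0]} entries, data has ${nSamples} rows`);
  }
  return problems;
}

// Hand-written stand-in with Iris-like shapes, not real loader output.
const irisLikeShapes: DatasetLike = {
  data: { shape: [150, 4] },
  target: { shape: [150] },
  featureNames: ["sepal length", "sepal width", "petal length", "petal width"],
  description: "hand-written stand-in",
};
console.log(checkDatasetShapes(irisLikeShapes).length); // 0
```

Because the check is structural, an object returned by any loader should be accepted as long as its tensors expose `shape` as an array of numbers.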

Classification Datasets

loadIris

Load the synthetic Iris dataset.
function loadIris(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [150, 4]
  • target: shape [150] (int32)
  • 3 classes: setosa, versicolor, virginica
  • 4 features: sepal length, sepal width, petal length, petal width
Example:
import { loadIris } from 'deepbox/datasets';

const { data, target, featureNames, targetNames } = loadIris();
console.log(data.shape);  // [150, 4]
console.log(target.shape);  // [150]
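
Class ids in `target` usually need to be turned back into readable names. The sketch below works on plain arrays because this page does not show how to read values out of a `Tensor`. `labelNames` is a hypothetical helper, and it assumes that class ids are 0-based indices into `targetNames`, listed in the order given above.

```typescript
// Assumption: class ids are 0-based indices into targetNames.
function labelNames(classIds: readonly number[], targetNames: readonly string[]): string[] {
  return classIds.map((id) => {
    const label = targetNames[id];
    if (label === undefined) {
      throw new RangeError(`class id ${id} has no entry in targetNames`);
    }
    return label;
  });
}

// Hand-written ids for illustration, not values read from loadIris().
const irisClassNames = ["setosa", "versicolor", "virginica"];
console.log(labelNames([0, 2, 1], irisClassNames).join(", "));
// setosa, virginica, versicolor
```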

loadDigits

Load the synthetic Digits dataset.
function loadDigits(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [1797, 64]
  • target: shape [1797] (int32)
  • images: shape [1797, 8, 8]
  • 10 classes: digits 0-9
  • 64 features: 8×8 pixel values (0-15)
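
The `data` and `images` fields describe the same pixels in two layouts: one row of 64 values per sample, or one 8×8 grid per sample. The sketch below shows the relationship on plain arrays. `rowToImage` is a hypothetical helper, and row-major ordering (flat index = row × 8 + column) is an assumption that this page does not confirm.

```typescript
const DIGIT_SIDE = 8;

// Splits one flat 64-value row into an 8x8 grid.
// Assumption: pixels are stored row-major, so flat index = r * 8 + c.
function rowToImage(flatRow: readonly number[]): number[][] {
  if (flatRow.length !== DIGIT_SIDE * DIGIT_SIDE) {
    throw new RangeError(`expected ${DIGIT_SIDE * DIGIT_SIDE} values, got ${flatRow.length}`);
  }
  const grid: number[][] = [];
  for (let r = 0; r < DIGIT_SIDE; r++) {
    grid.push(flatRow.slice(r * DIGIT_SIDE, (r + 1) * DIGIT_SIDE));
  }
  return grid;
}

// Hand-written row with values in the documented 0-15 range, not real loader output.
const fakeDigitRow = Array.from({ length: 64 }, (_, i) => i % 16);
const fakeDigitImage = rowToImage(fakeDigitRow);
console.log(fakeDigitImage.length, fakeDigitImage[0]?.length); // 8 8
```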

loadBreastCancer

Load the synthetic Breast Cancer dataset.
function loadBreastCancer(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [569, 30]
  • target: shape [569] (int32)
  • 2 classes: malignant, benign
  • 30 features: tumor measurements

loadFlowersExtended

Load the synthetic Flowers Extended dataset.
function loadFlowersExtended(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [180, 6]
  • target: shape [180] (int32)
  • 4 species
  • 6 features

loadLeafShapes

Load the synthetic Leaf Shapes dataset.
function loadLeafShapes(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [150, 8]
  • target: shape [150] (int32)
  • 5 plant species
  • 8 geometric features

loadFruitQuality

Load the synthetic Fruit Quality dataset.
function loadFruitQuality(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [150, 5]
  • target: shape [150] (int32)
  • 3 fruit classes
  • 5 features

loadSeedMorphology

Load the synthetic Seed Morphology dataset.
function loadSeedMorphology(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [150, 4]
  • target: shape [150] (int32)
  • 3 seed types
  • 4 features

loadMoonsMulti

Load the synthetic Moons-Multi dataset.
function loadMoonsMulti(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [150, 2]
  • target: shape [150] (int32)
  • 3 interleaving moon classes
  • 2D features

loadConcentricRings

Load the synthetic Concentric Rings dataset.
function loadConcentricRings(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [150, 2]
  • target: shape [150] (int32)
  • 3 concentric circle classes
  • 2D features

loadSpiralArms

Load the synthetic Spiral Arms dataset.
function loadSpiralArms(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [150, 2]
  • target: shape [150] (int32)
  • 3 spiral classes
  • 2D features

loadGaussianIslands

Load the synthetic Gaussian Islands dataset.
function loadGaussianIslands(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [200, 3]
  • target: shape [200] (int32)
  • 4 separated Gaussian clusters
  • 3D features

loadStudentPerformance

Load the synthetic Student Performance dataset.
function loadStudentPerformance(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [150, 3]
  • target: shape [150] (int32)
  • 3 outcome classes: fail, pass, excellent
  • 3 integer features

loadTrafficConditions

Load the synthetic Traffic Conditions dataset.
function loadTrafficConditions(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [150, 3]
  • target: shape [150] (int32)
  • 3 traffic level classes
  • 3 features

loadSensorStates

Load the synthetic Sensor States dataset.
function loadSensorStates(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [180, 6]
  • target: shape [180] (int32)
  • 3 operating modes
  • 6 sensor readings

loadPerfectlySeparable

Load the synthetic Perfectly Separable dataset.
function loadPerfectlySeparable(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [100, 4]
  • target: shape [100] (int32)
  • 2 linearly separable classes
  • 4 features

Regression Datasets

loadDiabetes

Load the synthetic Diabetes regression dataset.
function loadDiabetes(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [442, 10]
  • target: shape [442]
  • Continuous target
  • 10 features

loadLinnerud

Load the synthetic Linnerud multi-output regression dataset.
function loadLinnerud(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [20, 3]
  • target: shape [20, 3]
  • 3 exercise features
  • 3 physiological targets
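
Single-output loaders such as loadDiabetes report a 1-D target shape ([442]), while multi-output loaders such as loadLinnerud report a 2-D one ([20, 3]). A minimal sketch of telling the two apart from the shape alone follows; `outputCount` is a hypothetical helper, not a library function.

```typescript
// Returns how many target values each sample has, based only on target.shape.
function outputCount(targetShape: readonly number[]): number {
  if (targetShape.length === 1) {
    return 1; // e.g. [442]: one value per sample
  }
  if (targetShape.length === 2) {
    return targetShape[1] as number; // e.g. [20, 3]: three values per sample
  }
  throw new RangeError(`unsupported target rank ${targetShape.length}`);
}

console.log(outputCount([442]));   // 1
console.log(outputCount([20, 3])); // 3
```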

loadPlantGrowth

Load the synthetic Plant Growth regression dataset.
function loadPlantGrowth(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [200, 3]
  • target: shape [200]
  • Target: height (cm)
  • 3 features: sunlight, water, soil quality

loadHousingMini

Load the synthetic Housing-Mini regression dataset.
function loadHousingMini(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [200, 4]
  • target: shape [200]
  • Target: price (thousands)
  • 4 features: size, rooms, age, distance

loadEnergyEfficiency

Load the synthetic Energy Efficiency regression dataset.
function loadEnergyEfficiency(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [200, 3]
  • target: shape [200]
  • Target: energy usage (kWh)
  • 3 features

loadCropYield

Load the synthetic Crop Yield regression dataset.
function loadCropYield(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [200, 3]
  • target: shape [200]
  • Target: yield (tons/ha)
  • 3 features: rainfall, fertilizer, temperature

loadFitnessScores

Load the synthetic Fitness Scores multi-output regression dataset.
function loadFitnessScores(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [100, 3]
  • target: shape [100, 3]
  • 3 exercise features
  • 3 fitness targets: strength, endurance, flexibility

loadWeatherOutcomes

Load the synthetic Weather Outcomes multi-output regression dataset.
function loadWeatherOutcomes(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [150, 3]
  • target: shape [150, 2]
  • 3 features
  • 2 targets: rain probability, wind speed

Clustering Datasets

loadCustomerSegments

Load the synthetic Customer Segments clustering dataset.
function loadCustomerSegments(): Dataset
Returns
Dataset
Dataset with:
  • data: shape [200, 3]
  • target: shape [200] (int32)
  • 4 natural clusters
  • 3 features: age, income, spending score
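
A quick sanity check on a clustering dataset is to count how many samples carry each label. The sketch below works on a plain array of labels because this page does not show how to read values out of a `Tensor`. `clusterSizes` is a hypothetical helper, and it assumes `target` holds one integer cluster id per row.

```typescript
// Counts how many samples carry each cluster id.
function clusterSizes(labels: readonly number[]): Map<number, number> {
  const sizes = new Map<number, number>();
  for (const label of labels) {
    sizes.set(label, (sizes.get(label) ?? 0) + 1);
  }
  return sizes;
}

// Hand-written labels for illustration, not values read from loadCustomerSegments().
const demoClusterSizes = clusterSizes([0, 1, 1, 3, 3, 3]);
console.log(demoClusterSizes.size);   // 3
console.log(demoClusterSizes.get(3)); // 3
```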
