Build Your First ML Model
This guide walks you through building a complete machine learning pipeline with Deepbox, from data preparation to model evaluation.
Create a new project
Set up a new TypeScript project and install Deepbox:
mkdir deepbox-quickstart
cd deepbox-quickstart
npm init -y
npm install deepbox tsx typescript
Create a tsconfig.json file:
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "ESNext",
    "moduleResolution": "bundler",
    "esModuleInterop": true,
    "strict": true
  }
}
Create your first script
Create a file named index.ts with the following code:
import { tensor, add, mean } from "deepbox/ndarray";
import { DataFrame } from "deepbox/dataframe";
import { LinearRegression } from "deepbox/ml";
import { trainTestSplit } from "deepbox/preprocess";
import { r2Score, mse, mae } from "deepbox/metrics";

console.log("Welcome to Deepbox!\n");

// 1. Tensor Operations
console.log("1. Tensor Operations:");
const a = tensor([1, 2, 3, 4, 5]);
const b = tensor([10, 20, 30, 40, 50]);
const c = add(a, b);
console.log("a + b =", c.toString());
console.log("mean(a) =", mean(a).toString());
console.log();

// 2. DataFrame Operations
console.log("2. DataFrame Operations:");
const df = new DataFrame({
  name: ["Alice", "Bob", "Charlie"],
  age: [25, 30, 35],
  score: [85, 90, 78],
});
console.log(df.toString());
console.log();

// 3. Machine Learning Pipeline
console.log("3. Machine Learning Pipeline:");

// Generate synthetic data: y = 2x + 3 + noise
const X_data: number[][] = [];
const y_data: number[] = [];
for (let i = 0; i < 100; i++) {
  const x = i / 10;
  const y = 2 * x + 3 + (Math.random() - 0.5) * 2;
  X_data.push([x]);
  y_data.push(y);
}
const X = tensor(X_data);
const y = tensor(y_data);

// Split data: 80% training, 20% testing
const [X_train, X_test, y_train, y_test] = trainTestSplit(X, y, {
  testSize: 0.2,
  randomState: 42,
});

// Train model
const model = new LinearRegression();
model.fit(X_train, y_train);

// Make predictions
const y_pred = model.predict(X_test);

// Evaluate
const r2 = r2Score(y_test, y_pred);
const mseValue = mse(y_test, y_pred);
const maeValue = mae(y_test, y_pred);

console.log("Model trained successfully!");
console.log("Coefficients:", model.coef?.toString());
console.log("Intercept:", model.intercept);
console.log("\nModel Performance:");
console.log("R² Score:", r2.toFixed(4));
console.log("MSE:", mseValue.toFixed(4));
console.log("MAE:", maeValue.toFixed(4));
Run your script
Execute your script using tsx:
npx tsx index.ts
You should see output similar to:
Welcome to Deepbox!
1. Tensor Operations:
a + b = tensor([11, 22, 33, 44, 55])
mean(a) = 3
2. DataFrame Operations:
   name     age  score
0  Alice    25   85
1  Bob      30   90
2  Charlie  35   78
3. Machine Learning Pipeline:
Model trained successfully!
Coefficients: tensor([2.0123])
Intercept: 2.9876
Model Performance:
R² Score: 0.9987
MSE: 0.3421
MAE: 0.4532
Understanding the Code
Let’s break down what each section does:
Tensor Operations
Tensors are the fundamental data structure in Deepbox:
import { tensor, add, mean } from "deepbox/ndarray";
const a = tensor([1, 2, 3, 4, 5]);
const b = tensor([10, 20, 30, 40, 50]);
const c = add(a, b); // Element-wise addition
Deepbox supports 90+ tensor operations, including arithmetic, trigonometric, and logical operations.
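To make "element-wise" concrete, the sketch below reproduces the addition and mean from the quickstart on plain number arrays. It is plain TypeScript and does not use Deepbox; the helper names addElementwise and meanOf are hypothetical and exist only for this illustration.

```typescript
// Plain-TypeScript illustration of element-wise addition and mean.
// addElementwise and meanOf are hypothetical helpers, not Deepbox APIs.
function addElementwise(a: number[], b: number[]): number[] {
  if (a.length !== b.length) {
    throw new Error("Arrays must have the same length");
  }
  // Pair up elements by index and add each pair.
  return a.map((value, i) => value + b[i]);
}

function meanOf(values: number[]): number {
  if (values.length === 0) {
    throw new Error("Cannot take the mean of an empty array");
  }
  return values.reduce((acc, v) => acc + v, 0) / values.length;
}

const left = [1, 2, 3, 4, 5];
const right = [10, 20, 30, 40, 50];
console.log(addElementwise(left, right)); // [ 11, 22, 33, 44, 55 ]
console.log(meanOf(left)); // 3
```

The printed values match the "a + b" and "mean(a)" lines in the sample output above.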
DataFrame Operations
DataFrames provide a familiar interface for tabular data:
import { DataFrame } from "deepbox/dataframe";
const df = new DataFrame({
  name: ["Alice", "Bob", "Charlie"],
  age: [25, 30, 35],
  score: [85, 90, 78],
});
// DataFrames support filtering, grouping, joining, and more
const filtered = df.filter((row) => row.age > 25);
const grouped = df.groupBy("age").agg({ score: "mean" });
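For readers new to DataFrames, the following sketch shows what a row filter and a group-wise mean compute, using plain arrays of objects. It is not Deepbox code; the Row type and the groupMean helper are hypothetical names introduced for illustration.

```typescript
// Plain-TypeScript illustration of row filtering and a group-wise mean.
// The Row type and groupMean helper are hypothetical, not Deepbox APIs.
type Row = { name: string; age: number; score: number };

const rows: Row[] = [
  { name: "Alice", age: 25, score: 85 },
  { name: "Bob", age: 30, score: 90 },
  { name: "Charlie", age: 35, score: 78 },
];

// Filtering keeps only the rows for which the predicate returns true.
const olderThan25 = rows.filter((row) => row.age > 25);

// Grouping collects rows that share a key, then reduces each group to one value.
function groupMean(data: Row[], key: "age", value: "score"): Map<number, number> {
  const sums = new Map<number, { total: number; count: number }>();
  for (const row of data) {
    const entry = sums.get(row[key]) ?? { total: 0, count: 0 };
    entry.total += row[value];
    entry.count += 1;
    sums.set(row[key], entry);
  }
  const means = new Map<number, number>();
  sums.forEach((entry, groupKey) => {
    means.set(groupKey, entry.total / entry.count);
  });
  return means;
}

console.log(olderThan25.map((row) => row.name)); // [ 'Bob', 'Charlie' ]
console.log(groupMean(rows, "age", "score"));
```

Every age in this small table is unique, so each group holds a single row and its mean equals that row's score.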
Machine Learning Pipeline
Building ML models uses a familiar, scikit-learn-inspired API:
import { LinearRegression } from "deepbox/ml";
import { trainTestSplit } from "deepbox/preprocess";
import { r2Score } from "deepbox/metrics";
// Split data
const [X_train, X_test, y_train, y_test] = trainTestSplit(X, y, {
  testSize: 0.2,
  randomState: 42,
});
// Train model
const model = new LinearRegression();
model.fit(X_train, y_train);
// Predict and evaluate
const predictions = model.predict(X_test);
const score = r2Score(y_test, predictions);
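As a rough picture of what fitting and scoring involve, here is a closed-form least-squares fit for a single feature, together with the standard definitions of R², MSE, and MAE. This is a plain-TypeScript sketch and makes no claim about how Deepbox implements LinearRegression or its metrics; fitLine and regressionMetrics are hypothetical helpers.

```typescript
// Closed-form simple linear regression plus R², MSE, and MAE in plain TypeScript.
// fitLine and regressionMetrics are hypothetical helpers for illustration only.
function fitLine(xs: number[], ys: number[]): { slope: number; intercept: number } {
  const n = xs.length;
  const meanX = xs.reduce((s, v) => s + v, 0) / n;
  const meanY = ys.reduce((s, v) => s + v, 0) / n;
  let covXY = 0;
  let varX = 0;
  for (let i = 0; i < n; i++) {
    covXY += (xs[i] - meanX) * (ys[i] - meanY);
    varX += (xs[i] - meanX) ** 2;
  }
  const slope = covXY / varX;
  return { slope, intercept: meanY - slope * meanX };
}

function regressionMetrics(yTrue: number[], yPred: number[]) {
  const n = yTrue.length;
  const meanY = yTrue.reduce((s, v) => s + v, 0) / n;
  let ssRes = 0; // residual sum of squares
  let ssTot = 0; // total sum of squares around the mean
  let absErr = 0; // sum of absolute errors
  for (let i = 0; i < n; i++) {
    ssRes += (yTrue[i] - yPred[i]) ** 2;
    ssTot += (yTrue[i] - meanY) ** 2;
    absErr += Math.abs(yTrue[i] - yPred[i]);
  }
  return { r2: 1 - ssRes / ssTot, mse: ssRes / n, mae: absErr / n };
}

// Noise-free data on the line y = 2x + 3, so the fit should recover it.
const xs = [0, 1, 2, 3, 4];
const ys = xs.map((x) => 2 * x + 3);
const line = fitLine(xs, ys);
const preds = xs.map((x) => line.slope * x + line.intercept);
console.log(line); // slope ≈ 2, intercept ≈ 3
console.log(regressionMetrics(ys, preds)); // r2 ≈ 1, mse ≈ 0, mae ≈ 0
```

With noisy data, as in the quickstart script, the recovered slope and intercept land near 2 and 3 rather than exactly on them, and the error metrics are greater than zero.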
Next Examples
Automatic Differentiation
Deepbox supports automatic differentiation for gradient-based optimization:
import { parameter } from "deepbox/ndarray";
// Create a parameter (trainable tensor)
const x = parameter([[1, 2], [3, 4]]);
const w = parameter([[0.5], [0.5]]);
// Forward pass
const y = x.matmul(w).sum();
// Backward pass - compute gradients
y.backward();
console.log("x.grad:", x.grad?.toString());
console.log("w.grad:", w.grad?.toString());
Use parameter() to create tensors that track gradients. Call .backward() to compute all gradients automatically.
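One way to sanity-check gradients like these is a finite-difference approximation. The sketch below evaluates y = sum(x · w) for the same x and w on plain nested arrays and estimates each partial derivative numerically. It is independent of Deepbox; if the library follows the usual autograd rules, x.grad and w.grad should agree with these numbers, though that is an assumption rather than something verified here.

```typescript
// Central finite differences for y = sum(x · w), in plain TypeScript.
// This is a numerical check, not how an autograd engine computes gradients.
type Matrix = number[][];

function sumOfMatmul(x: Matrix, w: Matrix): number {
  let total = 0;
  for (let i = 0; i < x.length; i++) {
    for (let k = 0; k < w[0].length; k++) {
      for (let j = 0; j < w.length; j++) {
        total += x[i][j] * w[j][k];
      }
    }
  }
  return total;
}

// Perturb one entry at a time and measure how the scalar output changes.
function numericalGrad(f: (m: Matrix) => number, m: Matrix, eps = 1e-6): Matrix {
  return m.map((row, i) =>
    row.map((_, j) => {
      const plus = m.map((r) => r.slice());
      const minus = m.map((r) => r.slice());
      plus[i][j] += eps;
      minus[i][j] -= eps;
      return (f(plus) - f(minus)) / (2 * eps);
    })
  );
}

const xVals: Matrix = [[1, 2], [3, 4]];
const wVals: Matrix = [[0.5], [0.5]];

const gradX = numericalGrad((m) => sumOfMatmul(m, wVals), xVals);
const gradW = numericalGrad((m) => sumOfMatmul(xVals, m), wVals);
console.log(gradX); // each entry ≈ 0.5, the matching entry of w
console.log(gradW); // ≈ [[4], [6]], the column sums of x
```

Analytically, the derivative of y with respect to x[i][j] is w[j], and the derivative with respect to w[j] is the sum of column j of x, which is what the numerical estimates approximate.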
Neural Network Training
Build and train neural networks with PyTorch-inspired modules:
import { Sequential, Linear, ReLU, Dropout, mseLoss } from "deepbox/nn";
import { Adam } from "deepbox/optim";
// Define network architecture
const model = new Sequential(
  new Linear(10, 64),
  new ReLU(),
  new Dropout(0.2),
  new Linear(64, 32),
  new ReLU(),
  new Linear(32, 1)
);
// Create optimizer
const optimizer = new Adam(model.parameters(), { lr: 0.001 });
// Training loop (xTrain and yTrain are assumed to be tensors you have already
// prepared, with 10 input features and 1 target column to match the network)
for (let epoch = 0; epoch < 100; epoch++) {
  // Forward pass
  const output = model.forward(xTrain);
  const loss = mseLoss(output, yTrain);

  // Backward pass
  optimizer.zeroGrad();
  loss.backward();
  optimizer.step();

  if (epoch % 10 === 0) {
    console.log(`Epoch ${epoch}, Loss: ${loss.item()}`);
  }
}
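The loop above follows a fixed rhythm: reset gradients, compute them, then update the parameters. The sketch below shows that rhythm with nothing hidden, using a one-feature linear model, hand-derived MSE gradients, and plain gradient descent with a fixed step size. Note that this is deliberately not Adam and not a neural network; all names are hypothetical, and the data is a made-up noise-free line.

```typescript
// Plain gradient descent on a one-feature linear model, in plain TypeScript.
// It mirrors the zero-gradients / backward / step rhythm of a training loop,
// but uses hand-derived gradients and a fixed step size instead of Adam.
const inputs = [0, 1, 2, 3, 4];
const targets = inputs.map((x) => 2 * x + 3); // noise-free line y = 2x + 3

let weight = 0;
let bias = 0;
const learningRate = 0.05;
const lossHistory: number[] = [];

for (let epoch = 0; epoch < 2000; epoch++) {
  // "Zero grad": start each step from fresh accumulators.
  let gradWeight = 0;
  let gradBias = 0;
  let loss = 0;

  // Forward pass and "backward pass" (analytic gradients of the MSE).
  for (let i = 0; i < inputs.length; i++) {
    const error = weight * inputs[i] + bias - targets[i];
    loss += error ** 2 / inputs.length;
    gradWeight += (2 * error * inputs[i]) / inputs.length;
    gradBias += (2 * error) / inputs.length;
  }

  // "Step": move each parameter against its gradient.
  weight -= learningRate * gradWeight;
  bias -= learningRate * gradBias;
  lossHistory.push(loss);
}

console.log(weight.toFixed(3), bias.toFixed(3)); // prints "2.000 3.000"
```

Skipping the reset step would let gradients from earlier iterations leak into later updates, which is why frameworks with accumulating gradients ask you to zero them on every iteration.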
Complete ML Pipeline
Combine preprocessing, model training, and evaluation:
import { trainTestSplit, StandardScaler } from "deepbox/preprocess";
import { RandomForestClassifier } from "deepbox/ml";
import { accuracy, f1Score, confusionMatrix } from "deepbox/metrics";
// Split data
const [X_train, X_test, y_train, y_test] = trainTestSplit(X, y, {
  testSize: 0.2,
  randomState: 42,
});
// Scale features
const scaler = new StandardScaler();
scaler.fit(X_train);
const X_train_scaled = scaler.transform(X_train);
const X_test_scaled = scaler.transform(X_test);
// Train model
const model = new RandomForestClassifier({
  nEstimators: 100,
  maxDepth: 10,
});
model.fit(X_train_scaled, y_train);
// Predict and evaluate
const y_pred = model.predict(X_test_scaled);
console.log("Accuracy:", accuracy(y_test, y_pred));
console.log("F1 Score:", f1Score(y_test, y_pred));
console.log("Confusion Matrix:", confusionMatrix(y_test, y_pred));
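Two details of this pipeline are easy to miss: the scaler's statistics come from the training rows only, and accuracy, F1, and the confusion matrix all derive from the same four counts. The sketch below illustrates both in plain TypeScript for a binary problem with labels 0 and 1. The helpers fitScaler, applyScaler, and binaryMetrics are hypothetical, the scaler assumes population standard deviation, and none of this is taken from Deepbox's source.

```typescript
// Train-only standardization and binary classification metrics in plain TypeScript.
// fitScaler, applyScaler, and binaryMetrics are hypothetical helpers, not Deepbox APIs.
type Scaler = { mean: number[]; std: number[] };

function fitScaler(rows: number[][]): Scaler {
  const mean: number[] = [];
  const std: number[] = [];
  for (let j = 0; j < rows[0].length; j++) {
    const column = rows.map((row) => row[j]);
    const m = column.reduce((s, v) => s + v, 0) / column.length;
    const variance = column.reduce((s, v) => s + (v - m) ** 2, 0) / column.length;
    mean.push(m);
    std.push(Math.sqrt(variance) || 1); // avoid dividing by zero for constant columns
  }
  return { mean, std };
}

function applyScaler(rows: number[][], scaler: Scaler): number[][] {
  return rows.map((row) => row.map((v, j) => (v - scaler.mean[j]) / scaler.std[j]));
}

// Assumes labels are 0 or 1 and that at least one positive is predicted and present.
function binaryMetrics(yTrue: number[], yPred: number[]) {
  let tp = 0, fp = 0, fn = 0, tn = 0;
  for (let i = 0; i < yTrue.length; i++) {
    if (yTrue[i] === 1 && yPred[i] === 1) tp++;
    else if (yTrue[i] === 0 && yPred[i] === 1) fp++;
    else if (yTrue[i] === 1 && yPred[i] === 0) fn++;
    else tn++;
  }
  const precision = tp / (tp + fp);
  const recall = tp / (tp + fn);
  return {
    confusion: [[tn, fp], [fn, tp]], // rows: true class, columns: predicted class
    accuracy: (tp + tn) / yTrue.length,
    f1: (2 * precision * recall) / (precision + recall),
  };
}

const trainRows = [[1, 10], [2, 20], [3, 30]];
const testRows = [[4, 40]];
const scaler = fitScaler(trainRows); // statistics come from the training rows only
console.log(applyScaler(testRows, scaler)); // each value ≈ 2.449
console.log(binaryMetrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])); // accuracy 0.6, f1 ≈ 0.667
```

Fitting the scaler on the full dataset would let information from the test rows influence preprocessing, which tends to make evaluation scores look better than they should.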
Explore More
Core Concepts: Learn about tensors, autograd, and broadcasting.
Tensor Basics Example: Deep dive into tensor creation and manipulation.
ML Models: Explore all available machine learning models.
Neural Networks: Build custom neural network architectures.
Common Operations Cheat Sheet
Tensor Creation
import { tensor, zeros, ones, eye, arange, linspace } from "deepbox/ndarray";
const a = tensor([1, 2, 3]); // From array
const b = zeros([3, 3]); // 3x3 zeros
const c = ones([2, 4]); // 2x4 ones
const d = eye(5); // 5x5 identity
const e = arange(0, 10, 2); // [0, 2, 4, 6, 8]
const f = linspace(0, 1, 5); // 5 values from 0 to 1
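The two range helpers differ in how they treat the end value. The sketch below spells out the semantics suggested by the comments above, assuming NumPy-style behavior: arange steps up to but not including the stop value, while linspace includes both endpoints. The function names arangeSketch and linspaceSketch are hypothetical, and the arange sketch handles positive steps only.

```typescript
// Plain-TypeScript sketch of arange and linspace semantics.
// Assumes NumPy-style behavior: arange excludes the stop value, linspace includes it.
function arangeSketch(start: number, stop: number, step: number): number[] {
  if (step <= 0) {
    throw new Error("This sketch only supports a positive step");
  }
  const out: number[] = [];
  for (let v = start; v < stop; v += step) {
    out.push(v);
  }
  return out;
}

function linspaceSketch(start: number, stop: number, count: number): number[] {
  if (count === 1) return [start];
  const step = (stop - start) / (count - 1);
  const out: number[] = [];
  for (let i = 0; i < count; i++) {
    out.push(start + i * step);
  }
  return out;
}

console.log(arangeSketch(0, 10, 2)); // [ 0, 2, 4, 6, 8 ]
console.log(linspaceSketch(0, 1, 5)); // [ 0, 0.25, 0.5, 0.75, 1 ]
```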
Tensor Operations
import { add, mul, matmul, transpose, mean, sum } from "deepbox/ndarray";
const c = add(a, b); // Element-wise addition
const d = mul(a, b); // Element-wise multiplication
const e = matmul(a, b); // Matrix multiplication
const f = transpose(a); // Transpose
const g = mean(a); // Mean
const h = sum(a, { axis: 0 }); // Sum along axis 0
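The axis option is a common source of confusion, so here is what summing along each axis of a 2D array produces, assuming NumPy-style axis numbering (axis 0 runs down the rows, axis 1 runs across the columns). The helper sumAlongAxis is hypothetical and operates on plain nested arrays rather than Deepbox tensors.

```typescript
// Plain-TypeScript sketch of summing a 2D array along an axis.
// Assumes NumPy-style axes; sumAlongAxis is a hypothetical helper.
function sumAlongAxis(matrix: number[][], axis: 0 | 1): number[] {
  if (axis === 1) {
    // Collapse the columns: one total per row.
    return matrix.map((row) => row.reduce((s, v) => s + v, 0));
  }
  // Collapse the rows: one total per column.
  const totals: number[] = matrix[0].map(() => 0);
  for (const row of matrix) {
    row.forEach((v, j) => {
      totals[j] += v;
    });
  }
  return totals;
}

const grid = [
  [1, 2, 3],
  [4, 5, 6],
];
console.log(sumAlongAxis(grid, 0)); // [ 5, 7, 9 ]
console.log(sumAlongAxis(grid, 1)); // [ 6, 15 ]
```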
DataFrames
import { DataFrame } from "deepbox/dataframe";
const df = new DataFrame({ col1: [1, 2], col2: [3, 4] });
df.filter((row) => row.col1 > 1) // Filter rows
df.select(["col1"]) // Select columns
df.groupBy("col1").agg({ col2: "mean" }) // Group and aggregate
df.sort("col1", false) // Sort descending
df.join(other, "col1") // Join with another DataFrame
Machine Learning
import { LinearRegression } from "deepbox/ml";
import { trainTestSplit } from "deepbox/preprocess";
// Split data
const [X_train, X_test, y_train, y_test] = trainTestSplit(X, y);
// Train
const model = new LinearRegression();
model.fit(X_train, y_train);
// Predict
const predictions = model.predict(X_test);
// Evaluate
const score = model.score(X_test, y_test);
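Earlier examples in this guide pass randomState: 42 to trainTestSplit. The sketch below shows why a fixed seed makes a split reproducible: the shuffle is driven by a deterministic generator, so the same seed yields the same partition. It is plain TypeScript with hypothetical helpers (makeRng, seededSplit) and a small linear congruential generator; Deepbox may shuffle differently, so do not expect identical partitions.

```typescript
// Seeded shuffle-and-split in plain TypeScript.
// seededSplit is a hypothetical helper; Deepbox's trainTestSplit may shuffle differently.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    // Linear congruential generator (Numerical Recipes constants).
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

function seededSplit<T>(items: T[], testSize: number, seed: number): { train: T[]; test: T[] } {
  const shuffled = items.slice();
  const rng = makeRng(seed);
  // Fisher-Yates shuffle driven by the seeded generator.
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    const tmp = shuffled[i];
    shuffled[i] = shuffled[j];
    shuffled[j] = tmp;
  }
  const nTest = Math.round(items.length * testSize);
  return { test: shuffled.slice(0, nTest), train: shuffled.slice(nTest) };
}

const indices = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
const first = seededSplit(indices, 0.2, 42);
const second = seededSplit(indices, 0.2, 42);
console.log(first.test.length, first.train.length); // 2 8
console.log(JSON.stringify(first) === JSON.stringify(second)); // true
```

To split features and targets together, shuffle row indices as shown and use the same index order to pick rows from both X and y.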
What’s Next?
Explore the API Reference for detailed documentation of all available functions and classes.