
Overview

Classification metrics evaluate the performance of classification models by comparing predicted labels to true labels.

Functions

accuracy

Calculates the accuracy classification score.
function accuracy(yTrue: Tensor, yPred: Tensor): number
yTrue (Tensor, required): Ground truth (correct) target values
yPred (Tensor, required): Estimated targets as returned by a classifier
Returns (number): Accuracy score in range [0, 1], where 1 is perfect accuracy
Accuracy is the fraction of predictions that match the true labels. Time complexity is O(n), where n is the number of samples.
Example:
import { accuracy, tensor } from 'deepbox/core';

const yTrue = tensor([0, 1, 1, 0, 1]);
const yPred = tensor([0, 1, 0, 0, 1]);
const acc = accuracy(yTrue, yPred); // 0.8 (4 out of 5 correct)

precision

Calculates the precision classification score.
function precision(yTrue: Tensor, yPred: Tensor): number
function precision(
  yTrue: Tensor,
  yPred: Tensor,
  average: "binary" | "micro" | "macro" | "weighted"
): number
function precision(yTrue: Tensor, yPred: Tensor, average: null): number[]
yTrue (Tensor, required): Ground truth (correct) target values
yPred (Tensor, required): Estimated targets as returned by a classifier
average (string | null): Averaging strategy:
  • 'binary': Calculate metrics for the positive class only (default)
  • 'micro': Calculate metrics globally by counting total TP, FP, FN
  • 'macro': Calculate metrics for each class, return unweighted mean
  • 'weighted': Calculate metrics for each class, return weighted mean by support
  • null: Return array of scores for each class
Returns (number | number[]): Precision score(s) in range [0, 1]
Precision is the ratio of true positives to all positive predictions.
Formula: precision = TP / (TP + FP)
Example:
import { precision, tensor } from 'deepbox/core';

const yTrue = tensor([0, 1, 1, 0, 1]);
const yPred = tensor([0, 1, 0, 0, 1]);
const prec = precision(yTrue, yPred); // 1.0 (2 TP, 0 FP)
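
To make the averaging strategies concrete, here is a minimal sketch on plain number arrays. It is illustrative only: it does not use the library's Tensor type, and the helper name perClassPrecision is hypothetical.

```typescript
// Per-class precision plus the macro / micro / weighted averaging strategies,
// sketched on plain arrays.
function perClassPrecision(yTrue: number[], yPred: number[]): number[] {
  const classes = Array.from(new Set(yTrue.concat(yPred))).sort((a, b) => a - b);
  return classes.map((c) => {
    const tp = yPred.filter((p, i) => p === c && yTrue[i] === c).length;
    const predicted = yPred.filter((p) => p === c).length;
    return predicted === 0 ? 0 : tp / predicted;
  });
}

const yTrue = [0, 1, 1, 0, 1];
const yPred = [0, 1, 0, 0, 1];

const perClass = perClassPrecision(yTrue, yPred); // [2/3, 1] — what average: null returns
const macro = perClass.reduce((a, b) => a + b, 0) / perClass.length; // 5/6
// 'micro' pools TP and FP over all classes; for single-label problems this
// equals plain accuracy.
const micro = yPred.filter((p, i) => p === yTrue[i]).length / yPred.length; // 0.8
// 'weighted' is the mean of per-class scores weighted by class support.
const support = [0, 1].map((c) => yTrue.filter((t) => t === c).length); // [2, 3]
const weighted = perClass.reduce((acc, p, i) => acc + p * (support[i] / yTrue.length), 0); // 13/15
```

Note how 'binary' (the default shown above) reports only the positive class's score, while the other strategies summarize all classes.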

recall

Calculates the recall classification score (sensitivity, true positive rate).
function recall(yTrue: Tensor, yPred: Tensor): number
function recall(
  yTrue: Tensor,
  yPred: Tensor,
  average: "binary" | "micro" | "macro" | "weighted"
): number
function recall(yTrue: Tensor, yPred: Tensor, average: null): number[]
yTrue (Tensor, required): Ground truth (correct) target values
yPred (Tensor, required): Estimated targets as returned by a classifier
average (string | null): Averaging strategy (see precision for options)
Returns (number | number[]): Recall score(s) in range [0, 1]
Recall is the ratio of true positives to all actual positive samples.
Formula: recall = TP / (TP + FN)
Example:
import { recall, tensor } from 'deepbox/core';

const yTrue = tensor([0, 1, 1, 0, 1]);
const yPred = tensor([0, 1, 0, 0, 1]);
const rec = recall(yTrue, yPred); // 0.667 (2 out of 3 positives found)

f1Score

Calculates the F1 score (harmonic mean of precision and recall).
function f1Score(yTrue: Tensor, yPred: Tensor): number
function f1Score(
  yTrue: Tensor,
  yPred: Tensor,
  average: "binary" | "micro" | "macro" | "weighted"
): number
function f1Score(yTrue: Tensor, yPred: Tensor, average: null): number[]
yTrue (Tensor, required): Ground truth (correct) target values
yPred (Tensor, required): Estimated targets as returned by a classifier
average (string | null): Averaging strategy (see precision for options)
Returns (number | number[]): F1 score(s) in range [0, 1]
Formula: F1 = 2 * (precision * recall) / (precision + recall)
Example:
import { f1Score, tensor } from 'deepbox/core';

const yTrue = tensor([0, 1, 1, 0, 1]);
const yPred = tensor([0, 1, 0, 0, 1]);
const f1 = f1Score(yTrue, yPred); // 0.8

fbetaScore

Calculates the F-beta score.
function fbetaScore(yTrue: Tensor, yPred: Tensor, beta: number): number
function fbetaScore(
  yTrue: Tensor,
  yPred: Tensor,
  beta: number,
  average: "binary" | "micro" | "macro" | "weighted"
): number
function fbetaScore(yTrue: Tensor, yPred: Tensor, beta: number, average: null): number[]
yTrue (Tensor, required): Ground truth (correct) target values
yPred (Tensor, required): Estimated targets as returned by a classifier
beta (number, required): Weight of recall vs. precision (beta > 1 favors recall, beta < 1 favors precision)
average (string | null): Averaging strategy (see precision for options)
Returns (number | number[]): F-beta score(s) in range [0, 1]
Formula: F-beta = (1 + beta^2) * (precision * recall) / (beta^2 * precision + recall)
Example:
import { fbetaScore, tensor } from 'deepbox/core';

const yTrue = tensor([0, 1, 1, 0, 1]);
const yPred = tensor([0, 1, 0, 0, 1]);
const fb2 = fbetaScore(yTrue, yPred, 2); // Favors recall

confusionMatrix

Computes the confusion matrix to evaluate classification accuracy.
function confusionMatrix(yTrue: Tensor, yPred: Tensor): Tensor
yTrue (Tensor, required): Ground truth (correct) target values
yPred (Tensor, required): Estimated targets as returned by a classifier
Returns (Tensor): Confusion matrix as a 2D tensor of shape [n_classes, n_classes], where entry [i, j] counts samples with true class i predicted as class j
Example:
import { confusionMatrix, tensor } from 'deepbox/core';

const yTrue = tensor([0, 1, 1, 0, 1]);
const yPred = tensor([0, 1, 0, 0, 1]);
const cm = confusionMatrix(yTrue, yPred);
// [[2, 0],
//  [1, 2]]

classificationReport

Generates a text classification report showing main classification metrics.
function classificationReport(yTrue: Tensor, yPred: Tensor): string
yTrue (Tensor, required): Ground truth (correct) binary target values (0 or 1)
yPred (Tensor, required): Estimated binary targets as returned by a classifier (0 or 1)
Returns (string): Formatted report with per-class and aggregate classification metrics
Example:
import { classificationReport, tensor } from 'deepbox/core';

const yTrue = tensor([0, 1, 1, 0, 1]);
const yPred = tensor([0, 1, 0, 0, 1]);
console.log(classificationReport(yTrue, yPred));

rocCurve

Computes Receiver Operating Characteristic (ROC) curve for binary classification.
function rocCurve(yTrue: Tensor, yScore: Tensor): [Tensor, Tensor, Tensor]
yTrue (Tensor, required): Ground truth binary labels (must be 0 or 1)
yScore (Tensor, required): Target scores (higher score = more likely positive class)
Returns ([Tensor, Tensor, Tensor]): Tuple of [fpr, tpr, thresholds] tensors
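
Since no example is shown here, the following hedged sketch illustrates how an ROC curve is typically built, on plain arrays rather than Tensors. It emits one point per sample; a full implementation would collapse tied scores into a single threshold.

```typescript
// Sweep a threshold from high to low over the scores, recording one
// (FPR, TPR) point per step, starting from (0, 0).
function rocPoints(yTrue: number[], yScore: number[]) {
  const order = yScore.map((_, i) => i).sort((a, b) => yScore[b] - yScore[a]);
  const pos = yTrue.filter((y) => y === 1).length;
  const neg = yTrue.length - pos;
  const fpr = [0];
  const tpr = [0];
  const thresholds = [Infinity];
  let tp = 0;
  let fp = 0;
  for (const i of order) {
    if (yTrue[i] === 1) tp++;
    else fp++;
    fpr.push(fp / neg);
    tpr.push(tp / pos);
    thresholds.push(yScore[i]);
  }
  return { fpr, tpr, thresholds };
}

// Same data as the rocAucScore example:
const { fpr, tpr } = rocPoints([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]);
// fpr: [0, 0, 0.5, 0.5, 1]
// tpr: [0, 0.5, 0.5, 1, 1]
```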

rocAucScore

Area Under ROC Curve (AUC-ROC).
function rocAucScore(yTrue: Tensor, yScore: Tensor): number
yTrue (Tensor, required): Ground truth binary labels (must be 0 or 1)
yScore (Tensor, required): Target scores (higher score = more likely positive class)
Returns (number): AUC score in range [0, 1], or 0.5 if the ROC curve cannot be computed
Example:
import { rocAucScore, tensor } from 'deepbox/metrics';

const yTrue = tensor([0, 0, 1, 1]);
const yScore = tensor([0.1, 0.4, 0.35, 0.8]);
const auc = rocAucScore(yTrue, yScore); // ~0.75

precisionRecallCurve

Computes precision-recall pairs for different probability thresholds.
function precisionRecallCurve(yTrue: Tensor, yScore: Tensor): [Tensor, Tensor, Tensor]
yTrue (Tensor, required): Ground truth binary labels (0 or 1)
yScore (Tensor, required): Target scores (higher score = more likely positive class)
Returns ([Tensor, Tensor, Tensor]): Tuple of [precision, recall, thresholds] tensors
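
A hedged sketch of the underlying computation, on plain arrays rather than Tensors (one point per sample; a full implementation would merge tied scores):

```typescript
// Same threshold sweep as for the ROC curve, but recording
// (precision, recall) at each step instead of (FPR, TPR).
function prPoints(yTrue: number[], yScore: number[]) {
  const order = yScore.map((_, i) => i).sort((a, b) => yScore[b] - yScore[a]);
  const pos = yTrue.filter((y) => y === 1).length;
  const precision: number[] = [];
  const recall: number[] = [];
  const thresholds: number[] = [];
  let tp = 0;
  let fp = 0;
  for (const i of order) {
    if (yTrue[i] === 1) tp++;
    else fp++;
    precision.push(tp / (tp + fp));
    recall.push(tp / pos);
    thresholds.push(yScore[i]);
  }
  return { precision, recall, thresholds };
}

const pr = prPoints([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]);
// pr.precision: [1, 0.5, 0.667, 0.5]
// pr.recall:    [0.5, 0.5, 1, 1]
```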

averagePrecisionScore

Computes the average precision (AP) from prediction scores.
function averagePrecisionScore(yTrue: Tensor, yScore: Tensor): number
yTrue (Tensor, required): Ground truth binary labels (0 or 1)
yScore (Tensor, required): Target scores (higher score = more likely positive class)
Returns (number): Average precision score in range [0, 1]
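
Average precision summarizes the precision-recall curve as AP = sum_n (R_n - R_(n-1)) * P_n. A minimal sketch under that definition, on plain arrays (a full implementation would group tied scores into one threshold):

```typescript
// Accumulate precision weighted by the step in recall at each threshold.
function averagePrecisionSketch(yTrue: number[], yScore: number[]): number {
  const order = yScore.map((_, i) => i).sort((a, b) => yScore[b] - yScore[a]);
  const pos = yTrue.filter((y) => y === 1).length;
  let tp = 0;
  let fp = 0;
  let prevRecall = 0;
  let ap = 0;
  for (const i of order) {
    if (yTrue[i] === 1) tp++;
    else fp++;
    const prec = tp / (tp + fp);
    const rec = tp / pos;
    ap += (rec - prevRecall) * prec;
    prevRecall = rec;
  }
  return ap;
}

const ap = averagePrecisionSketch([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]); // 5/6 ≈ 0.833
```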

logLoss

Log loss (logistic loss, cross-entropy loss).
function logLoss(yTrue: Tensor, yPred: Tensor): number
yTrue (Tensor, required): Ground truth binary labels (0 or 1)
yPred (Tensor, required): Predicted probabilities (must be in range [0, 1])
Returns (number): Log loss value (lower is better; 0 is perfect)
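
Under the usual definition, logloss = -(1/n) * sum(y * ln(p) + (1 - y) * ln(1 - p)). A minimal sketch on plain arrays (the clipping constant eps is an assumption; implementations differ):

```typescript
function logLossSketch(yTrue: number[], yProb: number[]): number {
  const eps = 1e-15; // keep ln() finite at p = 0 or p = 1
  let sum = 0;
  for (let i = 0; i < yTrue.length; i++) {
    const p = Math.min(Math.max(yProb[i], eps), 1 - eps);
    sum += yTrue[i] === 1 ? Math.log(p) : Math.log(1 - p);
  }
  return -sum / yTrue.length;
}

const loss = logLossSketch([0, 1], [0.1, 0.9]); // -ln(0.9) ≈ 0.105
```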

hammingLoss

Computes the fraction of labels that are incorrectly predicted.
function hammingLoss(yTrue: Tensor, yPred: Tensor): number
yTrue (Tensor, required): Ground truth target values
yPred (Tensor, required): Estimated targets as returned by a classifier
Returns (number): Hamming loss in range [0, 1]
Example:
import { hammingLoss, tensor } from 'deepbox/metrics';

const yTrue = tensor([0, 1, 1, 0, 1]);
const yPred = tensor([0, 1, 0, 0, 1]);
const loss = hammingLoss(yTrue, yPred); // 0.2

jaccardScore

Computes the Jaccard similarity coefficient (Intersection over Union).
function jaccardScore(yTrue: Tensor, yPred: Tensor): number
yTrue (Tensor, required): Ground truth binary labels (0 or 1)
yPred (Tensor, required): Predicted binary labels (0 or 1)
Returns (number): Jaccard score in range [0, 1]
Formula: jaccard = TP / (TP + FP + FN)
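
A minimal sketch of that formula on plain binary arrays (illustrative, not the library implementation):

```typescript
function jaccardSketch(yTrue: number[], yPred: number[]): number {
  let tp = 0;
  let fp = 0;
  let fn = 0;
  for (let i = 0; i < yTrue.length; i++) {
    if (yPred[i] === 1 && yTrue[i] === 1) tp++;
    else if (yPred[i] === 1 && yTrue[i] === 0) fp++;
    else if (yPred[i] === 0 && yTrue[i] === 1) fn++;
  }
  return tp / (tp + fp + fn);
}

const j = jaccardSketch([0, 1, 1, 0, 1], [0, 1, 0, 0, 1]); // 2 / (2 + 0 + 1) = 2/3
```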

matthewsCorrcoef

Matthews correlation coefficient (MCC).
function matthewsCorrcoef(yTrue: Tensor, yPred: Tensor): number
yTrue (Tensor, required): Ground truth binary labels (0 or 1)
yPred (Tensor, required): Predicted binary labels (0 or 1)
Returns (number): MCC score in range [-1, 1]
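
The standard formula is MCC = (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN)). A minimal sketch on plain binary arrays (returning 0 for a zero denominator is a common convention, assumed here):

```typescript
function mccSketch(yTrue: number[], yPred: number[]): number {
  let tp = 0;
  let tn = 0;
  let fp = 0;
  let fn = 0;
  for (let i = 0; i < yTrue.length; i++) {
    if (yPred[i] === 1 && yTrue[i] === 1) tp++;
    else if (yPred[i] === 1 && yTrue[i] === 0) fp++;
    else if (yPred[i] === 0 && yTrue[i] === 1) fn++;
    else tn++;
  }
  const denom = Math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn));
  return denom === 0 ? 0 : (tp * tn - fp * fn) / denom;
}

const mcc = mccSketch([0, 1, 1, 0, 1], [0, 1, 0, 0, 1]); // (2*2 - 0*1) / 6 = 2/3
```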

cohenKappaScore

Computes Cohen’s kappa, a statistic that measures inter-annotator agreement.
function cohenKappaScore(yTrue: Tensor, yPred: Tensor): number
yTrue (Tensor, required): Ground truth labels
yPred (Tensor, required): Predicted labels
Returns (number): Kappa score in range [-1, 1]
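
Cohen's kappa is defined as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from the marginal label distributions. A minimal sketch on plain arrays (illustrative, not the library implementation):

```typescript
function kappaSketch(yTrue: number[], yPred: number[]): number {
  const n = yTrue.length;
  const classes = Array.from(new Set(yTrue.concat(yPred)));
  // Observed agreement: fraction of identical labels.
  const po = yPred.filter((p, i) => p === yTrue[i]).length / n;
  // Chance agreement: sum over classes of the product of the two marginals.
  const pe = classes.reduce((acc, c) => {
    const t = yTrue.filter((y) => y === c).length / n;
    const p = yPred.filter((y) => y === c).length / n;
    return acc + t * p;
  }, 0);
  return (po - pe) / (1 - pe);
}

const kappa = kappaSketch([0, 1, 1, 0, 1], [0, 1, 0, 0, 1]); // (0.8 - 0.48) / 0.52 ≈ 0.615
```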
