
Overview

Classification metrics evaluate the performance of classification models by comparing predicted labels to true labels.

Functions

accuracy

Calculates the accuracy classification score.
function accuracy(yTrue: Tensor, yPred: Tensor): number
yTrue (Tensor, required): Ground truth (correct) target values
yPred (Tensor, required): Estimated targets as returned by a classifier
Returns (number): Accuracy score in range [0, 1], where 1 is perfect accuracy
Accuracy is the fraction of predictions that match the true labels. Time complexity is O(n), where n is the number of samples.
Example:
import { accuracy, tensor } from 'deepbox/core';

const yTrue = tensor([0, 1, 1, 0, 1]);
const yPred = tensor([0, 1, 0, 0, 1]);
const acc = accuracy(yTrue, yPred); // 0.8 (4 out of 5 correct)

precision

Calculates the precision classification score.
function precision(yTrue: Tensor, yPred: Tensor): number
function precision(
  yTrue: Tensor,
  yPred: Tensor,
  average: "binary" | "micro" | "macro" | "weighted"
): number
function precision(yTrue: Tensor, yPred: Tensor, average: null): number[]
yTrue (Tensor, required): Ground truth (correct) target values
yPred (Tensor, required): Estimated targets as returned by a classifier
average (string | null): Averaging strategy:
  • 'binary': Calculate metrics for the positive class only (default)
  • 'micro': Calculate metrics globally by counting total TP, FP, FN
  • 'macro': Calculate metrics for each class, return unweighted mean
  • 'weighted': Calculate metrics for each class, return weighted mean by support
  • null: Return array of scores for each class
Returns (number | number[]): Precision score(s) in range [0, 1]
Precision is the ratio of true positives to all positive predictions.
Formula: precision = TP / (TP + FP)
Example:
import { precision, tensor } from 'deepbox/core';

const yTrue = tensor([0, 1, 1, 0, 1]);
const yPred = tensor([0, 1, 0, 0, 1]);
const prec = precision(yTrue, yPred); // 1.0 (2 TP, 0 FP)
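
To make the averaging strategies concrete, here is a minimal sketch on plain number arrays. It is illustrative only: it does not use the library's Tensor type, and the helper name perClassPrecision is hypothetical.

```typescript
// Per-class precision plus the macro / micro / weighted averaging strategies,
// sketched on plain arrays.
function perClassPrecision(yTrue: number[], yPred: number[]): number[] {
  const classes = Array.from(new Set(yTrue.concat(yPred))).sort((a, b) => a - b);
  return classes.map((c) => {
    const tp = yPred.filter((p, i) => p === c && yTrue[i] === c).length;
    const predicted = yPred.filter((p) => p === c).length;
    return predicted === 0 ? 0 : tp / predicted;
  });
}

const yTrue = [0, 1, 1, 0, 1];
const yPred = [0, 1, 0, 0, 1];

const perClass = perClassPrecision(yTrue, yPred); // [2/3, 1] — what average: null returns
const macro = perClass.reduce((a, b) => a + b, 0) / perClass.length; // 5/6
// 'micro' pools TP and FP over all classes; for single-label problems this
// equals plain accuracy.
const micro = yPred.filter((p, i) => p === yTrue[i]).length / yPred.length; // 0.8
// 'weighted' is the mean of per-class scores weighted by class support.
const support = [0, 1].map((c) => yTrue.filter((t) => t === c).length); // [2, 3]
const weighted = perClass.reduce((acc, p, i) => acc + p * (support[i] / yTrue.length), 0); // 13/15
```

Note how 'binary' (the default shown above) reports only the positive class's score, while the other strategies summarize all classes.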

recall

Calculates the recall classification score (sensitivity, true positive rate).
function recall(yTrue: Tensor, yPred: Tensor): number
function recall(
  yTrue: Tensor,
  yPred: Tensor,
  average: "binary" | "micro" | "macro" | "weighted"
): number
function recall(yTrue: Tensor, yPred: Tensor, average: null): number[]
yTrue (Tensor, required): Ground truth (correct) target values
yPred (Tensor, required): Estimated targets as returned by a classifier
average (string | null): Averaging strategy (see precision for options)
Returns (number | number[]): Recall score(s) in range [0, 1]
Recall is the ratio of true positives to all actual positive samples.
Formula: recall = TP / (TP + FN)
Example:
import { recall, tensor } from 'deepbox/core';

const yTrue = tensor([0, 1, 1, 0, 1]);
const yPred = tensor([0, 1, 0, 0, 1]);
const rec = recall(yTrue, yPred); // 0.667 (2 out of 3 positives found)

f1Score

Calculates the F1 score (harmonic mean of precision and recall).
function f1Score(yTrue: Tensor, yPred: Tensor): number
function f1Score(
  yTrue: Tensor,
  yPred: Tensor,
  average: "binary" | "micro" | "macro" | "weighted"
): number
function f1Score(yTrue: Tensor, yPred: Tensor, average: null): number[]
yTrue (Tensor, required): Ground truth (correct) target values
yPred (Tensor, required): Estimated targets as returned by a classifier
average (string | null): Averaging strategy (see precision for options)
Returns (number | number[]): F1 score(s) in range [0, 1]
Formula: F1 = 2 * (precision * recall) / (precision + recall)
Example:
import { f1Score, tensor } from 'deepbox/core';

const yTrue = tensor([0, 1, 1, 0, 1]);
const yPred = tensor([0, 1, 0, 0, 1]);
const f1 = f1Score(yTrue, yPred); // 0.8

fbetaScore

Calculates the F-beta score.
function fbetaScore(yTrue: Tensor, yPred: Tensor, beta: number): number
function fbetaScore(
  yTrue: Tensor,
  yPred: Tensor,
  beta: number,
  average: "binary" | "micro" | "macro" | "weighted"
): number
function fbetaScore(yTrue: Tensor, yPred: Tensor, beta: number, average: null): number[]
yTrue (Tensor, required): Ground truth (correct) target values
yPred (Tensor, required): Estimated targets as returned by a classifier
beta (number, required): Weight of recall vs. precision (beta > 1 favors recall, beta < 1 favors precision)
average (string | null): Averaging strategy (see precision for options)
Returns (number | number[]): F-beta score(s) in range [0, 1]
Formula: F-beta = (1 + beta^2) * (precision * recall) / (beta^2 * precision + recall)
Example:
import { fbetaScore, tensor } from 'deepbox/core';

const yTrue = tensor([0, 1, 1, 0, 1]);
const yPred = tensor([0, 1, 0, 0, 1]);
const fb2 = fbetaScore(yTrue, yPred, 2); // Favors recall

confusionMatrix

Computes the confusion matrix to evaluate classification accuracy.
function confusionMatrix(yTrue: Tensor, yPred: Tensor): Tensor
yTrue (Tensor, required): Ground truth (correct) target values
yPred (Tensor, required): Estimated targets as returned by a classifier
Returns (Tensor): Confusion matrix as a 2D tensor of shape [n_classes, n_classes], where entry [i, j] counts samples with true class i predicted as class j
Example:
import { confusionMatrix, tensor } from 'deepbox/core';

const yTrue = tensor([0, 1, 1, 0, 1]);
const yPred = tensor([0, 1, 0, 0, 1]);
const cm = confusionMatrix(yTrue, yPred);
// [[2, 0],
//  [1, 2]]

classificationReport

Generates a text classification report showing main classification metrics.
function classificationReport(yTrue: Tensor, yPred: Tensor): string
yTrue (Tensor, required): Ground truth (correct) binary target values (0 or 1)
yPred (Tensor, required): Estimated binary targets as returned by a classifier (0 or 1)
Returns (string): Formatted report with per-class and aggregate classification metrics
Example:
import { classificationReport, tensor } from 'deepbox/core';

const yTrue = tensor([0, 1, 1, 0, 1]);
const yPred = tensor([0, 1, 0, 0, 1]);
console.log(classificationReport(yTrue, yPred));

rocCurve

Computes Receiver Operating Characteristic (ROC) curve for binary classification.
function rocCurve(yTrue: Tensor, yScore: Tensor): [Tensor, Tensor, Tensor]
yTrue (Tensor, required): Ground truth binary labels (must be 0 or 1)
yScore (Tensor, required): Target scores (higher score = more likely positive class)
Returns ([Tensor, Tensor, Tensor]): Tuple of [fpr, tpr, thresholds] tensors
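
Since no example is shown here, the following hedged sketch illustrates how an ROC curve is typically built, on plain arrays rather than Tensors. It emits one point per sample; a full implementation would collapse tied scores into a single threshold.

```typescript
// Sweep a threshold from high to low over the scores, recording one
// (FPR, TPR) point per step, starting from (0, 0).
function rocPoints(yTrue: number[], yScore: number[]) {
  const order = yScore.map((_, i) => i).sort((a, b) => yScore[b] - yScore[a]);
  const pos = yTrue.filter((y) => y === 1).length;
  const neg = yTrue.length - pos;
  const fpr = [0];
  const tpr = [0];
  const thresholds = [Infinity];
  let tp = 0;
  let fp = 0;
  for (const i of order) {
    if (yTrue[i] === 1) tp++;
    else fp++;
    fpr.push(fp / neg);
    tpr.push(tp / pos);
    thresholds.push(yScore[i]);
  }
  return { fpr, tpr, thresholds };
}

// Same data as the rocAucScore example:
const { fpr, tpr } = rocPoints([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]);
// fpr: [0, 0, 0.5, 0.5, 1]
// tpr: [0, 0.5, 0.5, 1, 1]
```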

rocAucScore

Area Under ROC Curve (AUC-ROC).
function rocAucScore(yTrue: Tensor, yScore: Tensor): number
yTrue (Tensor, required): Ground truth binary labels (must be 0 or 1)
yScore (Tensor, required): Target scores (higher score = more likely positive class)
Returns (number): AUC score in range [0, 1], or 0.5 if the ROC curve cannot be computed
Example:
import { rocAucScore, tensor } from 'deepbox/metrics';

const yTrue = tensor([0, 0, 1, 1]);
const yScore = tensor([0.1, 0.4, 0.35, 0.8]);
const auc = rocAucScore(yTrue, yScore); // ~0.75

precisionRecallCurve

Computes precision-recall pairs for different probability thresholds.
function precisionRecallCurve(yTrue: Tensor, yScore: Tensor): [Tensor, Tensor, Tensor]
yTrue (Tensor, required): Ground truth binary labels (0 or 1)
yScore (Tensor, required): Target scores (higher score = more likely positive class)
Returns ([Tensor, Tensor, Tensor]): Tuple of [precision, recall, thresholds] tensors
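
A hedged sketch of the underlying computation, on plain arrays rather than Tensors (one point per sample; a full implementation would merge tied scores):

```typescript
// Same threshold sweep as for the ROC curve, but recording
// (precision, recall) at each step instead of (FPR, TPR).
function prPoints(yTrue: number[], yScore: number[]) {
  const order = yScore.map((_, i) => i).sort((a, b) => yScore[b] - yScore[a]);
  const pos = yTrue.filter((y) => y === 1).length;
  const precision: number[] = [];
  const recall: number[] = [];
  const thresholds: number[] = [];
  let tp = 0;
  let fp = 0;
  for (const i of order) {
    if (yTrue[i] === 1) tp++;
    else fp++;
    precision.push(tp / (tp + fp));
    recall.push(tp / pos);
    thresholds.push(yScore[i]);
  }
  return { precision, recall, thresholds };
}

const pr = prPoints([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]);
// pr.precision: [1, 0.5, 0.667, 0.5]
// pr.recall:    [0.5, 0.5, 1, 1]
```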

averagePrecisionScore

Computes the average precision (AP) from prediction scores.
function averagePrecisionScore(yTrue: Tensor, yScore: Tensor): number
yTrue (Tensor, required): Ground truth binary labels (0 or 1)
yScore (Tensor, required): Target scores (higher score = more likely positive class)
Returns (number): Average precision score in range [0, 1]
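
Average precision summarizes the precision-recall curve as AP = sum_n (R_n - R_(n-1)) * P_n. A minimal sketch under that definition, on plain arrays (a full implementation would group tied scores into one threshold):

```typescript
// Accumulate precision weighted by the step in recall at each threshold.
function averagePrecisionSketch(yTrue: number[], yScore: number[]): number {
  const order = yScore.map((_, i) => i).sort((a, b) => yScore[b] - yScore[a]);
  const pos = yTrue.filter((y) => y === 1).length;
  let tp = 0;
  let fp = 0;
  let prevRecall = 0;
  let ap = 0;
  for (const i of order) {
    if (yTrue[i] === 1) tp++;
    else fp++;
    const prec = tp / (tp + fp);
    const rec = tp / pos;
    ap += (rec - prevRecall) * prec;
    prevRecall = rec;
  }
  return ap;
}

const ap = averagePrecisionSketch([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]); // 5/6 ≈ 0.833
```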

logLoss

Log loss (logistic loss, cross-entropy loss).
function logLoss(yTrue: Tensor, yPred: Tensor): number
yTrue (Tensor, required): Ground truth binary labels (0 or 1)
yPred (Tensor, required): Predicted probabilities (must be in range [0, 1])
Returns (number): Log loss value (lower is better; 0 is perfect)
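
Under the usual definition, logloss = -(1/n) * sum(y * ln(p) + (1 - y) * ln(1 - p)). A minimal sketch on plain arrays (the clipping constant eps is an assumption; implementations differ):

```typescript
function logLossSketch(yTrue: number[], yProb: number[]): number {
  const eps = 1e-15; // keep ln() finite at p = 0 or p = 1
  let sum = 0;
  for (let i = 0; i < yTrue.length; i++) {
    const p = Math.min(Math.max(yProb[i], eps), 1 - eps);
    sum += yTrue[i] === 1 ? Math.log(p) : Math.log(1 - p);
  }
  return -sum / yTrue.length;
}

const loss = logLossSketch([0, 1], [0.1, 0.9]); // -ln(0.9) ≈ 0.105
```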

hammingLoss

Computes the fraction of labels that are incorrectly predicted.
function hammingLoss(yTrue: Tensor, yPred: Tensor): number
yTrue (Tensor, required): Ground truth target values
yPred (Tensor, required): Estimated targets as returned by a classifier
Returns (number): Hamming loss in range [0, 1]
Example:
import { hammingLoss, tensor } from 'deepbox/metrics';

const yTrue = tensor([0, 1, 1, 0, 1]);
const yPred = tensor([0, 1, 0, 0, 1]);
const loss = hammingLoss(yTrue, yPred); // 0.2

jaccardScore

Computes the Jaccard similarity coefficient (Intersection over Union).
function jaccardScore(yTrue: Tensor, yPred: Tensor): number
yTrue (Tensor, required): Ground truth binary labels (0 or 1)
yPred (Tensor, required): Predicted binary labels (0 or 1)
Returns (number): Jaccard score in range [0, 1]
Formula: jaccard = TP / (TP + FP + FN)
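
A minimal sketch of that formula on plain binary arrays (illustrative, not the library implementation):

```typescript
function jaccardSketch(yTrue: number[], yPred: number[]): number {
  let tp = 0;
  let fp = 0;
  let fn = 0;
  for (let i = 0; i < yTrue.length; i++) {
    if (yPred[i] === 1 && yTrue[i] === 1) tp++;
    else if (yPred[i] === 1 && yTrue[i] === 0) fp++;
    else if (yPred[i] === 0 && yTrue[i] === 1) fn++;
  }
  return tp / (tp + fp + fn);
}

const j = jaccardSketch([0, 1, 1, 0, 1], [0, 1, 0, 0, 1]); // 2 / (2 + 0 + 1) = 2/3
```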

matthewsCorrcoef

Matthews correlation coefficient (MCC).
function matthewsCorrcoef(yTrue: Tensor, yPred: Tensor): number
yTrue (Tensor, required): Ground truth binary labels (0 or 1)
yPred (Tensor, required): Predicted binary labels (0 or 1)
Returns (number): MCC score in range [-1, 1]
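
The standard formula is MCC = (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN)). A minimal sketch on plain binary arrays (returning 0 for a zero denominator is a common convention, assumed here):

```typescript
function mccSketch(yTrue: number[], yPred: number[]): number {
  let tp = 0;
  let tn = 0;
  let fp = 0;
  let fn = 0;
  for (let i = 0; i < yTrue.length; i++) {
    if (yPred[i] === 1 && yTrue[i] === 1) tp++;
    else if (yPred[i] === 1 && yTrue[i] === 0) fp++;
    else if (yPred[i] === 0 && yTrue[i] === 1) fn++;
    else tn++;
  }
  const denom = Math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn));
  return denom === 0 ? 0 : (tp * tn - fp * fn) / denom;
}

const mcc = mccSketch([0, 1, 1, 0, 1], [0, 1, 0, 0, 1]); // (2*2 - 0*1) / 6 = 2/3
```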

cohenKappaScore

Computes Cohen’s kappa, a statistic that measures inter-annotator agreement.
function cohenKappaScore(yTrue: Tensor, yPred: Tensor): number
yTrue (Tensor, required): Ground truth labels
yPred (Tensor, required): Predicted labels
Returns (number): Kappa score in range [-1, 1]
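
Cohen's kappa is defined as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from the marginal label distributions. A minimal sketch on plain arrays (illustrative, not the library implementation):

```typescript
function kappaSketch(yTrue: number[], yPred: number[]): number {
  const n = yTrue.length;
  const classes = Array.from(new Set(yTrue.concat(yPred)));
  // Observed agreement: fraction of identical labels.
  const po = yPred.filter((p, i) => p === yTrue[i]).length / n;
  // Chance agreement: sum over classes of the product of the two marginals.
  const pe = classes.reduce((acc, c) => {
    const t = yTrue.filter((y) => y === c).length / n;
    const p = yPred.filter((y) => y === c).length / n;
    return acc + t * p;
  }, 0);
  return (po - pe) / (1 - pe);
}

const kappa = kappaSketch([0, 1, 1, 0, 1], [0, 1, 0, 0, 1]); // (0.8 - 0.48) / 0.52 ≈ 0.615
```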
