
Overview

Clustering metrics evaluate the quality of cluster assignments produced by unsupervised learning algorithms.

Functions

silhouetteScore

Computes the mean Silhouette Coefficient over all samples.
function silhouetteScore(
  X: Tensor,
  labels: Tensor,
  metric?: "euclidean" | "precomputed",
  options?: { sampleSize?: number; randomState?: number }
): number
X (Tensor, required): Feature matrix of shape [n_samples, n_features], or a precomputed distance matrix
labels (Tensor, required): Cluster labels for each sample
metric (string, optional): Distance metric, either 'euclidean' (default) or 'precomputed'
options.sampleSize (number, optional): Number of samples to use for approximation (required when n > 2000)
options.randomState (number, optional): Seed for reproducible sampling
Returns
number: Mean silhouette coefficient in range [-1, 1]
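The Silhouette Coefficient measures how similar a sample is to its own cluster compared to other clusters. For a single sample it is (b - a) / max(a, b), where a is the mean distance to the other samples in its own cluster and b is the mean distance to the samples in the nearest other cluster. Values range from -1 to 1, and higher values indicate better-defined clusters. Time complexity: O(n²), where n is the number of samples, because all pairwise distances are needed.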
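To make the definition concrete, here is a minimal sketch of the per-sample coefficient with Euclidean distance. It is not the library's implementation: `silhouetteSketch` is a hypothetical helper that takes plain arrays instead of Tensor, assumes at least two clusters, and assigns 0 to a sample that is alone in its cluster.

```typescript
// Hypothetical helper for illustration; not part of the documented API.
// Takes plain arrays instead of Tensor and assumes at least two clusters.
function silhouetteSketch(points: number[][], labels: number[]): number[] {
  // Euclidean distance between two points of equal dimension.
  const dist = (p: number[], q: number[]): number =>
    Math.sqrt(p.reduce((acc, v, d) => acc + (v - q[d]) * (v - q[d]), 0));
  const clusterIds = Array.from(new Set(labels));

  return points.map((p, i) => {
    let a = 0; // mean distance to the other members of the sample's own cluster
    let b = Infinity; // smallest mean distance to any other cluster
    let ownClusterHasOthers = false;
    for (const c of clusterIds) {
      let sum = 0;
      let count = 0;
      for (let j = 0; j < points.length; j++) {
        if (j !== i && labels[j] === c) {
          sum += dist(p, points[j]);
          count++;
        }
      }
      if (c === labels[i]) {
        ownClusterHasOthers = count > 0;
        a = count > 0 ? sum / count : 0;
      } else {
        b = Math.min(b, sum / count);
      }
    }
    // A sample alone in its cluster gets 0 (a common convention, assumed here).
    return ownClusterHasOthers ? (b - a) / Math.max(a, b) : 0;
  });
}
```

Averaging the returned values gives the quantity that silhouetteScore reports. For the one-dimensional points 0, 1, 10, 11 with labels 0, 0, 1, 1, every coefficient is close to 0.9, which reflects two tight, well-separated clusters.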

silhouetteSamples

Computes the Silhouette Coefficient for each sample.
function silhouetteSamples(
  X: Tensor,
  labels: Tensor,
  metric?: "euclidean" | "precomputed"
): Tensor
X (Tensor, required): Feature matrix of shape [n_samples, n_features], or a precomputed distance matrix
labels (Tensor, required): Cluster labels for each sample
metric (string, optional): Distance metric, either 'euclidean' (default) or 'precomputed'
Returns
Tensor: Silhouette coefficient in range [-1, 1] for each sample

daviesBouldinScore

Computes the Davies-Bouldin index.
function daviesBouldinScore(X: Tensor, labels: Tensor): number
X (Tensor, required): Feature matrix of shape [n_samples, n_features]
labels (Tensor, required): Cluster labels for each sample
Returns
number: Davies-Bouldin index (lower is better, 0 is minimum)
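The Davies-Bouldin index averages, over all clusters, each cluster's similarity to its most similar other cluster. Similarity between two clusters is the sum of their within-cluster scatters divided by the distance between their centroids. Lower values indicate better clustering.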
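The computation can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: `daviesBouldinSketch` is a hypothetical helper that works on plain arrays, uses Euclidean distance, measures scatter as the mean distance of a cluster's points to its centroid, and assumes at least two clusters with distinct centroids.

```typescript
// Hypothetical helper for illustration; not part of the documented API.
function daviesBouldinSketch(points: number[][], labels: number[]): number {
  const dist = (p: number[], q: number[]): number =>
    Math.sqrt(p.reduce((acc, v, d) => acc + (v - q[d]) * (v - q[d]), 0));
  const clusterIds = Array.from(new Set(labels));

  // Centroid of each cluster (per-dimension mean of its members).
  const centroids = clusterIds.map((c) => {
    const members = points.filter((_, i) => labels[i] === c);
    return members[0].map(
      (_, d) => members.reduce((acc, p) => acc + p[d], 0) / members.length
    );
  });

  // Scatter of each cluster: mean distance from its members to its centroid.
  const scatter = clusterIds.map((c, k) => {
    const members = points.filter((_, i) => labels[i] === c);
    return members.reduce((acc, p) => acc + dist(p, centroids[k]), 0) / members.length;
  });

  // For each cluster, keep the worst (largest) similarity ratio, then average.
  let total = 0;
  for (let i = 0; i < clusterIds.length; i++) {
    let worst = 0;
    for (let j = 0; j < clusterIds.length; j++) {
      if (j === i) continue;
      const ratio = (scatter[i] + scatter[j]) / dist(centroids[i], centroids[j]);
      worst = Math.max(worst, ratio);
    }
    total += worst;
  }
  return total / clusterIds.length;
}
```

For the points 0, 2, 10, 12 with labels 0, 0, 1, 1, both clusters have scatter 1 and the centroids are 10 apart, so the index is (1 + 1) / 10 = 0.2.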

calinskiHarabaszScore

Computes the Calinski-Harabasz index (Variance Ratio Criterion).
function calinskiHarabaszScore(X: Tensor, labels: Tensor): number
X (Tensor, required): Feature matrix of shape [n_samples, n_features]
labels (Tensor, required): Cluster labels for each sample
Returns
number: Calinski-Harabasz index (higher is better)
The score is the ratio of between-cluster dispersion to within-cluster dispersion. Higher values indicate better-defined clusters.
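A minimal sketch of the ratio follows. It assumes the standard form of the Variance Ratio Criterion, in which between-cluster dispersion is divided by k - 1 and within-cluster dispersion by n - k (k clusters, n samples); the source does not state whether the library applies this normalization. `calinskiHarabaszSketch` is a hypothetical helper on plain arrays and requires 1 < k < n.

```typescript
// Hypothetical helper for illustration; not part of the documented API.
function calinskiHarabaszSketch(points: number[][], labels: number[]): number {
  const n = points.length;
  const sqDist = (p: number[], q: number[]): number =>
    p.reduce((acc, v, d) => acc + (v - q[d]) * (v - q[d]), 0);
  const mean = (rows: number[][]): number[] =>
    rows[0].map((_, d) => rows.reduce((acc, r) => acc + r[d], 0) / rows.length);

  const clusterIds = Array.from(new Set(labels));
  const overall = mean(points);

  let between = 0; // sum over clusters of size * squared distance to the overall mean
  let within = 0; // sum of squared distances from points to their cluster centroid
  for (const c of clusterIds) {
    const members = points.filter((_, i) => labels[i] === c);
    const centroid = mean(members);
    between += members.length * sqDist(centroid, overall);
    for (const p of members) within += sqDist(p, centroid);
  }

  const k = clusterIds.length;
  // Assumed normalization by degrees of freedom (standard definition).
  return (between / (k - 1)) / (within / (n - k));
}
```

For the points 0, 2, 10, 12 with labels 0, 0, 1, 1, the between-cluster dispersion is 100 and the within-cluster dispersion is 4, giving (100 / 1) / (4 / 2) = 50.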

adjustedRandScore

Computes the Adjusted Rand Index (ARI).
function adjustedRandScore(labelsTrue: Tensor, labelsPred: Tensor): number
labelsTrue (Tensor, required): Ground truth cluster labels
labelsPred (Tensor, required): Predicted cluster labels
Returns
number: Adjusted Rand Index in range [-0.5, 1]
The ARI measures the similarity between two clusterings, adjusted for chance. Values: 1 = perfect agreement, 0 = random labeling, less than 0 = worse than random.
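The chance adjustment can be sketched with pair counts from the contingency table of the two labelings. This is an illustrative sketch, not the library's implementation: `adjustedRandSketch` is a hypothetical helper on plain arrays, and returning 1 when the denominator vanishes (for example, when both labelings put every sample in one cluster) is an assumed convention.

```typescript
// Hypothetical helper for illustration; not part of the documented API.
function adjustedRandSketch(labelsTrue: number[], labelsPred: number[]): number {
  const n = labelsTrue.length;
  // Number of unordered pairs that can be drawn from m items.
  const pairs = (m: number): number => (m * (m - 1)) / 2;
  // Count occurrences of each key, then sum pairs(count) over all keys.
  const sumPairs = (keys: string[]): number => {
    const counts = new Map<string, number>();
    for (const key of keys) counts.set(key, (counts.get(key) || 0) + 1);
    let total = 0;
    counts.forEach((count) => {
      total += pairs(count);
    });
    return total;
  };

  // Pairs grouped together in both labelings (contingency-table cells).
  const together = sumPairs(labelsTrue.map((t, i) => `${t}|${labelsPred[i]}`));
  const sameTrue = sumPairs(labelsTrue.map((t) => String(t)));
  const samePred = sumPairs(labelsPred.map((p) => String(p)));

  const expected = (sameTrue * samePred) / pairs(n); // agreement expected by chance
  const maximum = (sameTrue + samePred) / 2; // agreement of a perfect match
  if (maximum === expected) return 1; // degenerate case, assumed convention
  return (together - expected) / (maximum - expected);
}
```

Because only co-membership matters, label names are irrelevant: comparing [0, 0, 1, 1] with [1, 1, 0, 0] gives 1. Comparing [0, 0, 1, 1] with [0, 1, 0, 1], where no pair is grouped together in both labelings, gives approximately -0.5.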

adjustedMutualInfoScore

Computes the Adjusted Mutual Information (AMI) between two clusterings.
function adjustedMutualInfoScore(
  labelsTrue: Tensor,
  labelsPred: Tensor,
  averageMethod?: "min" | "geometric" | "arithmetic" | "max"
): number
labelsTrue (Tensor, required): Ground truth cluster labels
labelsPred (Tensor, required): Predicted cluster labels
averageMethod (string, optional): Method to compute the normalizer: 'min', 'geometric', 'arithmetic' (default), or 'max'
Returns
number: Adjusted Mutual Information score, typically in range [0, 1]

normalizedMutualInfoScore

Computes the Normalized Mutual Information (NMI) between two clusterings.
function normalizedMutualInfoScore(
  labelsTrue: Tensor,
  labelsPred: Tensor,
  averageMethod?: "min" | "geometric" | "arithmetic" | "max"
): number
labelsTrue (Tensor, required): Ground truth cluster labels
labelsPred (Tensor, required): Predicted cluster labels
averageMethod (string, optional): Method to compute the normalizer: 'min', 'geometric', 'arithmetic' (default), or 'max'
Returns
number: Normalized Mutual Information score in range [0, 1]
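To show how averageMethod affects the result, the sketch below computes mutual information from label entropies and divides it by the chosen combination of the two entropies. It is an illustration only: `normalizedMutualInfoSketch` and the `AverageMethod` alias are hypothetical names, inputs are plain arrays, natural logarithms are used, and the handling of zero-entropy labelings is an assumed convention.

```typescript
// Hypothetical names for illustration; not part of the documented API.
type AverageMethod = "min" | "geometric" | "arithmetic" | "max";

function normalizedMutualInfoSketch(
  labelsTrue: number[],
  labelsPred: number[],
  averageMethod: AverageMethod = "arithmetic"
): number {
  const n = labelsTrue.length;
  // Shannon entropy (natural log) of the empirical distribution of the keys.
  const entropy = (keys: string[]): number => {
    const counts = new Map<string, number>();
    for (const key of keys) counts.set(key, (counts.get(key) || 0) + 1);
    let h = 0;
    counts.forEach((count) => {
      h -= (count / n) * Math.log(count / n);
    });
    return h;
  };

  const hTrue = entropy(labelsTrue.map((t) => String(t)));
  const hPred = entropy(labelsPred.map((p) => String(p)));
  const hJoint = entropy(labelsTrue.map((t, i) => `${t}|${labelsPred[i]}`));
  // Both labelings are a single cluster: treated as perfect agreement (assumed).
  if (hTrue === 0 && hPred === 0) return 1;

  const mutualInfo = hTrue + hPred - hJoint;
  const normalizers: Record<AverageMethod, number> = {
    min: Math.min(hTrue, hPred),
    geometric: Math.sqrt(hTrue * hPred),
    arithmetic: (hTrue + hPred) / 2,
    max: Math.max(hTrue, hPred),
  };
  const normalizer = normalizers[averageMethod];
  return normalizer > 0 ? mutualInfo / normalizer : 0;
}
```

For labelsTrue [0, 0, 1, 1] and labelsPred [0, 0, 1, 2], the score is about 0.8 with 'arithmetic', 1 with 'min', and about 0.667 with 'max', so the choice of normalizer can change the reported agreement considerably.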

fowlkesMallowsScore

Computes the Fowlkes-Mallows Index (FMI).
function fowlkesMallowsScore(labelsTrue: Tensor, labelsPred: Tensor): number
labelsTrue (Tensor, required): Ground truth cluster labels
labelsPred (Tensor, required): Predicted cluster labels
Returns
number: Fowlkes-Mallows score in range [0, 1]
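The FMI is the geometric mean of pairwise precision and recall: FMI = TP / sqrt((TP + FP) * (TP + FN)). Here TP counts pairs of samples grouped together in both labelings, FP counts pairs grouped together only in labelsPred, and FN counts pairs grouped together only in labelsTrue.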
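Pairwise precision and recall can be computed by visiting every pair of samples once. The sketch below is illustrative, not the library's implementation: `fowlkesMallowsSketch` is a hypothetical helper on plain arrays, and returning 0 when no pair is grouped together in both labelings is an assumed convention.

```typescript
// Hypothetical helper for illustration; not part of the documented API.
function fowlkesMallowsSketch(labelsTrue: number[], labelsPred: number[]): number {
  let tp = 0; // pairs together in both labelings
  let fp = 0; // pairs together only in the predicted labeling
  let fn = 0; // pairs together only in the true labeling
  for (let i = 0; i < labelsTrue.length; i++) {
    for (let j = i + 1; j < labelsTrue.length; j++) {
      const sameTrue = labelsTrue[i] === labelsTrue[j];
      const samePred = labelsPred[i] === labelsPred[j];
      if (sameTrue && samePred) tp++;
      else if (samePred) fp++;
      else if (sameTrue) fn++;
    }
  }
  if (tp === 0) return 0; // avoids 0 / 0, assumed convention
  const precision = tp / (tp + fp);
  const recall = tp / (tp + fn);
  return Math.sqrt(precision * recall);
}
```

The double loop is O(n²) and is meant to mirror the definition; the same counts can be derived from a contingency table when n is large.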

homogeneityScore

Computes the homogeneity score of a clustering.
function homogeneityScore(labelsTrue: Tensor, labelsPred: Tensor): number
labelsTrue (Tensor, required): Ground truth class labels
labelsPred (Tensor, required): Predicted cluster labels
Returns
number: Homogeneity score in range [0, 1]
A clustering satisfies homogeneity if all of its clusters contain only data points which are members of a single class.

completenessScore

Computes the completeness score of a clustering.
function completenessScore(labelsTrue: Tensor, labelsPred: Tensor): number
labelsTrue (Tensor, required): Ground truth class labels
labelsPred (Tensor, required): Predicted cluster labels
Returns
number: Completeness score in range [0, 1]
A clustering satisfies completeness if all data points that are members of a given class are assigned to the same cluster.

vMeasureScore

Computes the V-measure score of a clustering.
function vMeasureScore(labelsTrue: Tensor, labelsPred: Tensor, beta?: number): number
labelsTrue (Tensor, required): Ground truth class labels
labelsPred (Tensor, required): Predicted cluster labels
beta (number, optional): Weight of homogeneity vs completeness (default: 1.0)
Returns
number: V-measure score in range [0, 1]
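V-measure is the weighted harmonic mean of homogeneity (h) and completeness (c): v = (1 + beta) * h * c / (beta * h + c). With the default beta = 1 this is the plain harmonic mean. When beta > 1, completeness is weighted more; when beta < 1, homogeneity is weighted more.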
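The three related scores can be sketched together from label entropies, using homogeneity = 1 - H(class | cluster) / H(class) and completeness = 1 - H(cluster | class) / H(cluster). This is a sketch under stated assumptions, not the library's code: `vMeasureSketch` is a hypothetical helper on plain arrays, it uses natural logarithms, and it treats a zero-entropy labeling as scoring 1.

```typescript
// Hypothetical helper for illustration; not part of the documented API.
function vMeasureSketch(
  labelsTrue: number[],
  labelsPred: number[],
  beta: number = 1.0
): { homogeneity: number; completeness: number; vMeasure: number } {
  const n = labelsTrue.length;
  // Shannon entropy (natural log) of the empirical distribution of the keys.
  const entropy = (keys: string[]): number => {
    const counts = new Map<string, number>();
    for (const key of keys) counts.set(key, (counts.get(key) || 0) + 1);
    let h = 0;
    counts.forEach((count) => {
      h -= (count / n) * Math.log(count / n);
    });
    return h;
  };

  const hClass = entropy(labelsTrue.map((t) => String(t)));
  const hCluster = entropy(labelsPred.map((p) => String(p)));
  const hJoint = entropy(labelsTrue.map((t, i) => `${t}|${labelsPred[i]}`));

  // Conditional entropies via H(A | B) = H(A, B) - H(B).
  const homogeneity = hClass === 0 ? 1 : 1 - (hJoint - hCluster) / hClass;
  const completeness = hCluster === 0 ? 1 : 1 - (hJoint - hClass) / hCluster;

  const denominator = beta * homogeneity + completeness;
  const vMeasure =
    denominator === 0 ? 0 : ((1 + beta) * homogeneity * completeness) / denominator;
  return { homogeneity, completeness, vMeasure };
}
```

For labelsTrue [0, 0, 1, 1] and labelsPred [0, 0, 1, 2], every cluster contains a single class, so homogeneity is 1, but class 1 is split across two clusters, so completeness is about 0.667. The V-measure is 0.8 with beta = 1 and drops to 0.75 with beta = 2, since the lower completeness then counts for more.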
