Crisp C-Means (also called Hard C-Means or K-Means) is a clustering algorithm that assigns each data point to exactly one cluster. It’s called “crisp” because membership is binary: a point either belongs to a cluster (membership = 1) or it doesn’t (membership = 0).
The algorithm iteratively refines cluster assignments by alternating between assigning points to their nearest centroid and updating centroids based on their assigned points.
The membership matrix U is a C × N binary matrix where:Uij={10if Dij=mink(Dkj)otherwiseEach column sums to 1 (each point belongs to exactly one cluster).
The distance matrix computes Euclidean distances between all points and centroids:
/** * Calculates the distance matrix between each point and each centroid. * * @param points - Array of points. * @param centroids - Array of centroids. * @returns A matrix of distances where each element [i][j] is the distance * between the i-th centroid and the j-th point. */export const getDistanceMatrix = (points: Point[], centroids: Point[]) => { if (points.length === 0 || centroids.length === 0) { return []; } const distanceMatrix = []; for (let i = 0; i < centroids.length; i++) { const distancesRow = []; for (let j = 0; j < points.length; j++) { const newDistance = euclidianDistance(centroids[i], points[j]); distancesRow.push(newDistance); } distanceMatrix.push(distancesRow); } return distanceMatrix;}
/** * Generates the membership matrix from the distance matrix. * * @param distanceMatrix - Matrix of distances between points and centroids. * @returns A membership matrix where each element [i][j] is 1 if the j-th point * belongs to the i-th centroid, otherwise 0. */export const getMembershipMatrix = (distanceMatrix: number[][]) => { if (distanceMatrix.length === 0) { return []; } const membershipMatrix = [...zeroMatrix(distanceMatrix)]; // For each point (column) for (let j = 0; j < distanceMatrix[0].length; j++) { let min = distanceMatrix[0][j]; let minPosition = 0; // Find the nearest centroid (row) for (let i = 0; i < distanceMatrix.length; i++) { if (distanceMatrix[i][j] < min) { min = distanceMatrix[i][j]; minPosition = i; } } // Assign point to nearest centroid membershipMatrix[minPosition][j] = 1; } return membershipMatrix;}
The algorithm starts with a zero matrix and sets exactly one entry to 1 in each column, ensuring each point belongs to only one cluster.
Centroids are recalculated as the mean position of their assigned points:
/** * Calculates new centroids based on the membership matrix. * * @param points - Array of points. * @param membershipMatrix - Membership matrix. * @returns A new array of centroids calculated as the mean of all points * belonging to each centroid. */export const getNewCentroids = (points: Point[], membershipMatrix: number[][]) => { const newCentroids = []; for (let i = 0; i < membershipMatrix.length; i++) { let sumaX = 0; let sumaY = 0; let cardinalidad = 0; // Sum coordinates of all points in this cluster for (let j = 0; j < membershipMatrix[0].length; j++) { if (membershipMatrix[i][j] == 1) { sumaX += points[j].x; sumaY += points[j].y; cardinalidad++; } } // Calculate mean position const x = sumaX / cardinalidad; const y = sumaY / cardinalidad; if (cardinalidad !== 0) { newCentroids.push({ x, y }); } } return newCentroids;}
If a centroid has no assigned points, it’s removed from the clustering. This can happen if a centroid is initialized in a sparse region of the data space.