Your First Clustering Iteration
This guide walks you through adding data points, defining centroids, and running the clustering algorithm to see the results update in real time.
Add Data Points
Start by adding some data points to cluster. You have two options:
Option 1: Manual Entry
- In the Add Point section (left side), enter X and Y coordinates
- Click the Add Point button
- Repeat to add multiple points
Option 2: Random Generation
Click the Random Point button to automatically generate a random point with coordinates between -100 and 100.
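Random generation of a point in the -100 to 100 range could look roughly like the sketch below. The `Point` shape, the `randomPoint` name, and the uniform distribution are assumptions made for illustration; the app's own implementation may differ.

```typescript
// Hypothetical point shape; the app's actual type may differ.
interface Point {
  x: number;
  y: number;
}

// Returns a point with each coordinate drawn uniformly from [min, max).
// The rng parameter is injectable so the function can be tested deterministically.
function randomPoint(
  min = -100,
  max = 100,
  rng: () => number = Math.random
): Point {
  const draw = (): number => min + rng() * (max - min);
  return { x: draw(), y: draw() };
}

// Format to 4 decimal places, matching the precision shown in the Points Table.
const sample = randomPoint();
console.log(sample.x.toFixed(4), sample.y.toFixed(4));
```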
Define Initial Centroids
Now add centroids that will serve as the initial cluster centers:
- In the Add Centroid section (right side), enter X and Y coordinates
- Click the Add Centroid button
- Add 2-3 centroids to create distinct clusters
The number of centroids determines the number of clusters (k). For your first try, use 2-3 centroids.
Observe the Initial State
After adding points and centroids, you’ll see:
- Points Table: Lists all your data points with coordinates (4 decimal places)
- Centroids Table: Shows your initial centroid positions
- Scatter Plot: Visualizes points and centroids on a Chart.js graph
- Cost Function: Displays as 0.0000 before the first iteration
- Distance Matrix: Empty until you iterate
- Membership Matrix: Empty until you iterate
Run the First Iteration
Click the Iterate button to execute one step of the fuzzy C-Means algorithm. Behind the scenes, the useCMeans hook performs these calculations:
- Distance Matrix: Calculates the Euclidean distance between each point and each centroid
- Membership Matrix: Computes fuzzy membership values (soft assignment)
- New Centroids: Recalculates centroid positions based on weighted membership
- Cost Function: Sums the weighted squared distances
The default algorithm is fuzzy C-Means with fuzzification parameter m=2. This is set in src/App.tsx:16.
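The four calculations listed above can be sketched as a single function. This follows the standard fuzzy C-Means formulation and is not the app's actual useCMeans code: the `fuzzyCMeansStep` name, the `Vec` and `IterationResult` types, the error message, and the handling of a point that sits exactly on a centroid are all assumptions. The cost here is computed from the distances to the centroids the memberships were derived from, which is also an assumption.

```typescript
type Vec = [number, number];

interface IterationResult {
  distances: number[][];   // distances[i][j]: point i to centroid j
  memberships: number[][]; // memberships[i][j]: degree of point i in cluster j
  centroids: Vec[];        // updated centroid positions
  cost: number;            // sum of u_ij^m * d_ij^2
}

// One fuzzy C-Means step. The fuzzification parameter m must be greater than 1.
function fuzzyCMeansStep(points: Vec[], centroids: Vec[], m = 2): IterationResult {
  if (points.length === 0 || centroids.length === 0) {
    throw new Error("Add at least one point and one centroid before iterating");
  }

  // 1. Distance matrix: Euclidean distance from every point to every centroid.
  const distances = points.map(([px, py]) =>
    centroids.map(([cx, cy]) => Math.hypot(px - cx, py - cy))
  );

  // 2. Membership matrix: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1)).
  const exponent = 2 / (m - 1);
  const memberships = distances.map((row) => {
    const zeros = row.filter((d) => d === 0).length;
    if (zeros > 0) {
      // The point coincides with one or more centroids: split membership among them.
      return row.map((d) => (d === 0 ? 1 / zeros : 0));
    }
    return row.map(
      (dij) => 1 / row.reduce((sum, dik) => sum + Math.pow(dij / dik, exponent), 0)
    );
  });

  // 3. New centroids: mean of the points weighted by u_ij^m.
  const newCentroids = centroids.map((old, j): Vec => {
    let wSum = 0;
    let xSum = 0;
    let ySum = 0;
    points.forEach(([px, py], i) => {
      const w = Math.pow(memberships[i][j], m);
      wSum += w;
      xSum += w * px;
      ySum += w * py;
    });
    // Keep the old position if no point has any membership in this cluster.
    return wSum === 0 ? old : [xSum / wSum, ySum / wSum];
  });

  // 4. Cost: weighted squared distances summed over all points and clusters.
  let cost = 0;
  memberships.forEach((row, i) =>
    row.forEach((u, j) => {
      cost += Math.pow(u, m) * distances[i][j] * distances[i][j];
    })
  );

  return { distances, memberships, centroids: newCentroids, cost };
}

const step = fuzzyCMeansStep([[0, 0], [1, 0], [9, 0], [10, 0]], [[2, 0], [8, 0]]);
console.log(step.cost.toFixed(4));
```

Each row of the membership matrix sums to 1, so a point's memberships can be read as its share in each cluster.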
Analyze the Results
After the first iteration, examine the updated interface:
Distance Matrix
Shows the Euclidean distance between each point and each centroid.
Membership Matrix
For fuzzy C-Means, values are continuous (0 to 1), representing partial membership.
Updated Scatter Plot
Points are now color-coded by their dominant cluster membership, and centroids have moved to new positions.
Cost Function
The cost function shows the total clustering error.
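For the color-coding in the updated scatter plot, a point's dominant cluster is the column holding the largest value in its membership row. A minimal sketch follows; the `dominantClusters` name and the tie-breaking rule (lowest index wins) are assumptions, not taken from the app.

```typescript
// Returns, for each point, the index of the cluster with the highest membership.
// On a tie, indexOf picks the lowest cluster index.
function dominantClusters(memberships: number[][]): number[] {
  return memberships.map((row) => row.indexOf(Math.max(...row)));
}

// Illustrative membership rows, not output from the app.
const labels = dominantClusters([
  [0.9, 0.1],
  [0.3, 0.7],
]);
console.log(labels); // [ 0, 1 ]
```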
Continue Iterating
Keep clicking Iterate to run additional iterations. Watch for:
- Centroid movement: Centroids shift toward the center of their clusters
- Decreasing cost: The cost function should decrease with each iteration
- Stabilization: Eventually, centroids stop moving and the cost plateaus (convergence)
Understanding the Algorithm Flow
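The useCMeans hook orchestrates the entire clustering process. On each click of Iterate, it computes the distance matrix, the membership matrix, the new centroids, and the cost function, in that order.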
Crisp vs. Fuzzy Comparison
Crisp C-Means (Hard Clustering)
- Membership: Binary (0 or 1) - each point belongs to exactly one cluster
- Centroid Update: Simple mean of assigned points
- Use Case: When clusters are well-separated and non-overlapping
Fuzzy C-Means (Soft Clustering)
- Membership: Continuous (0 to 1) - points can partially belong to multiple clusters
- Centroid Update: Weighted mean using membership degrees raised to power m
- Use Case: When clusters overlap or boundaries are unclear
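To make the contrast concrete, here is a minimal sketch of the crisp variant: binary membership from the nearest centroid, then a simple mean of the assigned points. The function names and the choice to leave an empty cluster's centroid where it was are assumptions for illustration.

```typescript
type Vec2 = [number, number];

// Crisp membership: 1 for the nearest centroid, 0 for all others.
function crispMemberships(points: Vec2[], centroids: Vec2[]): number[][] {
  return points.map(([px, py]) => {
    const d = centroids.map(([cx, cy]) => Math.hypot(px - cx, py - cy));
    const nearest = d.indexOf(Math.min(...d));
    return d.map((_, j) => (j === nearest ? 1 : 0));
  });
}

// Crisp centroid update: simple mean of the points assigned to each cluster.
// A cluster with no assigned points keeps its previous centroid.
function crispCentroids(points: Vec2[], memberships: number[][], previous: Vec2[]): Vec2[] {
  return previous.map((old, j): Vec2 => {
    const assigned = points.filter((_, i) => memberships[i][j] === 1);
    if (assigned.length === 0) return old;
    const sx = assigned.reduce((s, p) => s + p[0], 0);
    const sy = assigned.reduce((s, p) => s + p[1], 0);
    return [sx / assigned.length, sy / assigned.length];
  });
}

const crispPoints: Vec2[] = [[0, 0], [2, 0], [10, 0]];
const crispStart: Vec2[] = [[1, 0], [9, 0]];
const crispU = crispMemberships(crispPoints, crispStart);
console.log(crispU, crispCentroids(crispPoints, crispU, crispStart));
```

In the fuzzy variant the same update uses every point for every centroid, weighted by its membership raised to the power m, which is why overlapping clusters pull on each other's centroids.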
Common Patterns and Tips
Achieving Good Convergence
- Initial Centroids: Place them in different regions of your data space
- Sufficient Points: Use at least 3-4 points per cluster
- Iterations: Run until the change in the cost function between consecutive iterations is minimal (< 0.01)
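The stopping rule in the last tip can be expressed as a small check on two consecutive cost values. This is a sketch: `hasConverged` is a hypothetical helper rather than part of the app, and the default tolerance simply mirrors the 0.01 threshold suggested above.

```typescript
// True when the cost changed by less than the tolerance between two iterations.
function hasConverged(previousCost: number, currentCost: number, tolerance = 0.01): boolean {
  return Math.abs(previousCost - currentCost) < tolerance;
}

// Illustrative cost values, not output from the app.
console.log(hasConverged(120, 45));    // false: the cost is still dropping sharply
console.log(hasConverged(45, 44.995)); // true: the change is below 0.01
```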
Monitoring Convergence
Watch the cost function displayed at the top of the interface.
Handling Edge Cases
If you see an error notification, it means you need to add at least one point and one centroid before clicking Iterate.
Next Steps
Now that you understand the basics:
- Explore the detailed algorithm implementations in the API Reference
- Learn about component architecture in the Components section
- Customize the fuzzification parameter or add algorithm switching in the Advanced Usage guide