The agglomerate module provides functions for performing hierarchical clustering on mesh data using Ward’s method, and for aggregating features across cluster labels.

Functions

multicut_ward

Computes labels from Ward hierarchical clustering at multiple distance thresholds without recomputing the tree at each threshold.
multicut_ward(X, connectivity=None, distance_thresholds=None)
X
array-like
Feature matrix of shape (n_samples, n_features)
connectivity
sparse matrix, optional
Connectivity matrix defining the structure of the data. Only samples that are neighbors in this structure can be merged during clustering.
distance_thresholds
list of float
List of distance thresholds at which to cut the dendrogram
Returns: np.ndarray - Array of shape (n_samples, n_thresholds) containing cluster labels for each threshold
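
To illustrate the idea of cutting one tree at several thresholds, here is a minimal sketch that does not use meshmash itself. The helper `cut_merges_at_thresholds`, the hand-written merge list, and the choice to apply only merges strictly below each threshold are all assumptions made for illustration; the library's internals may differ.

```python
import numpy as np


def cut_merges_at_thresholds(n_samples, merges, thresholds):
    """Hypothetical helper: label samples at several cut heights in one pass.

    merges: list of (i, j, distance) tuples sorted by increasing distance,
    where i and j are any sample indices inside the two clusters being merged.
    """
    parent = list(range(n_samples))

    def find(i):
        # Follow parent pointers to the cluster root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    labels = np.empty((n_samples, len(thresholds)), dtype=int)
    next_merge = 0
    # Visit thresholds from smallest to largest so earlier merges are reused.
    for col in np.argsort(thresholds):
        # Assumption: merges strictly below the threshold are applied.
        while next_merge < len(merges) and merges[next_merge][2] < thresholds[col]:
            i, j, _ = merges[next_merge]
            parent[find(i)] = find(j)
            next_merge += 1
        roots = np.array([find(i) for i in range(n_samples)])
        # Relabel roots as consecutive integers 0..k-1.
        labels[:, col] = np.unique(roots, return_inverse=True)[1]
    return labels


# Hand-written merge tree for four samples (illustrative values only).
merges = [(0, 1, 0.3), (2, 3, 0.6), (0, 2, 1.5)]
labels = cut_merges_at_thresholds(4, merges, [0.5, 1.0, 2.0])
print(labels.shape)  # (4, 3)
```

Each column holds the labels for one threshold, and the merge list is traversed only once no matter how many thresholds are requested.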

agglomerate_mesh

Performs hierarchical clustering on mesh vertices using their features, handling NaN values appropriately.
agglomerate_mesh(mesh, features, distance_thresholds=None) -> np.ndarray
mesh
tuple
Mesh tuple of (vertices, faces)
features
np.ndarray
Feature array of shape (n_vertices, n_features)
distance_thresholds
list, int, float, or None
Distance threshold(s) for cutting the dendrogram. If None, returns identity labels.
Returns: np.ndarray - Cluster labels array of shape (n_vertices, n_thresholds), or None if all features are NaN
Example:
import numpy as np
from meshmash import agglomerate_mesh

# Assume mesh is (vertices, faces); the random values below are placeholders, not a valid surface
vertices = np.random.rand(100, 3)
faces = np.random.randint(0, 100, (50, 3))
mesh = (vertices, faces)

# Features for each vertex
features = np.random.rand(100, 10)

# Cluster at multiple thresholds
labels = agglomerate_mesh(mesh, features, distance_thresholds=[0.5, 1.0, 2.0])
print(labels.shape)  # (100, 3)
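
The NaN handling mentioned above could look roughly like the following sketch. It is not the meshmash implementation: the helper `label_valid_rows`, the stand-in `toy_cluster` function, and the `-1` fill value for rows containing NaN are all assumptions. Only the "return None when every row is NaN" behavior comes from the description above.

```python
import numpy as np


def label_valid_rows(features, cluster_fn, fill_value=-1):
    """Hypothetical helper: cluster only the rows that contain no NaN."""
    features = np.asarray(features, dtype=float)
    valid = ~np.isnan(features).any(axis=1)
    if not valid.any():
        # Nothing to cluster when every row has a NaN.
        return None
    # Assumption: rows with NaN receive a sentinel label.
    labels = np.full(len(features), fill_value, dtype=int)
    labels[valid] = cluster_fn(features[valid])
    return labels


def toy_cluster(X):
    # Stand-in for a real clustering step: split on the first column.
    return (X[:, 0] > 0.5).astype(int)


features = np.array([[0.1, 0.2], [np.nan, 0.3], [0.9, 0.8], [0.7, np.nan]])
labels = label_valid_rows(features, toy_cluster)
all_nan = label_valid_rows(np.full((3, 2), np.nan), toy_cluster)
```

Masking before clustering keeps NaN values out of the distance computations while still returning one label per input row.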

agglomerate_split_mesh

Agglomerates features on a split mesh using a MeshStitcher.
agglomerate_split_mesh(
    splitter: MeshStitcher,
    features: np.ndarray,
    distance_thresholds: Union[list, int, float]
)
splitter
MeshStitcher
MeshStitcher object that handles mesh splitting and stitching
features
np.ndarray
Feature array for the mesh vertices
distance_thresholds
list, int, or float
Distance threshold(s) for clustering
Returns: np.ndarray - Cluster labels, either 1D (single threshold) or 2D (multiple thresholds)
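
The 1D versus 2D return shape can be pictured with the sketch below. The helpers `as_threshold_list` and `match_output_shape` are hypothetical and the label arrays are zero-filled placeholders; this only shows one way a scalar threshold might map to a 1D result.

```python
import numpy as np


def as_threshold_list(distance_thresholds):
    """Hypothetical helper: wrap a scalar threshold in a list and remember that it was scalar."""
    if np.isscalar(distance_thresholds):
        return [float(distance_thresholds)], True
    return [float(t) for t in distance_thresholds], False


def match_output_shape(labels_2d, was_scalar):
    """Hypothetical helper: drop the threshold axis when a single scalar was given."""
    return labels_2d[:, 0] if was_scalar else labels_2d


# Placeholder label arrays for six vertices.
thresholds, was_scalar = as_threshold_list(1.5)
single = match_output_shape(np.zeros((6, len(thresholds)), dtype=int), was_scalar)

thresholds_multi, was_scalar_multi = as_threshold_list([0.5, 1.0])
multi = match_output_shape(np.zeros((6, len(thresholds_multi)), dtype=int), was_scalar_multi)

print(single.shape, multi.shape)  # (6,) (6, 2)
```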

aggregate_features

Aggregates features according to cluster labels using various aggregation functions.
aggregate_features(features, labels, weights=None, func="mean") -> pd.DataFrame
features
np.ndarray or pd.DataFrame
Feature matrix to aggregate
labels
np.ndarray or None
Cluster labels for each row. If None, returns features unchanged.
weights
np.ndarray, optional
Weights for weighted averaging (only used when func="mean")
func
str
default:"mean"
Aggregation function to apply (e.g., "mean", "sum", "max")
Returns: pd.DataFrame - Aggregated features indexed by cluster label
Example:
import numpy as np
from meshmash import aggregate_features

features = np.random.rand(100, 5)
labels = np.random.randint(0, 10, 100)

# Mean aggregation
agg_features = aggregate_features(features, labels, func="mean")

# Weighted mean aggregation
weights = np.random.rand(100)
weighted_agg = aggregate_features(features, labels, weights=weights, func="mean")
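
For intuition about what a weighted mean per cluster computes, here is a NumPy-only sketch. The helper `weighted_group_mean` is hypothetical and is not claimed to match how `aggregate_features` is implemented. It evaluates sum(w * x) / sum(w) within each label.

```python
import numpy as np


def weighted_group_mean(features, labels, weights):
    """Hypothetical helper: per-label weighted mean of each feature column."""
    features = np.asarray(features, dtype=float)
    weights = np.asarray(weights, dtype=float)
    cluster_ids, inverse = np.unique(labels, return_inverse=True)
    # Total weight in each cluster.
    weight_sums = np.bincount(inverse, weights=weights, minlength=len(cluster_ids))
    means = np.empty((len(cluster_ids), features.shape[1]))
    for j in range(features.shape[1]):
        # Weighted sum of column j in each cluster, divided by the total weight.
        weighted = np.bincount(inverse, weights=weights * features[:, j],
                               minlength=len(cluster_ids))
        means[:, j] = weighted / weight_sums
    return cluster_ids, means


features = np.array([[1.0], [3.0], [10.0]])
labels = np.array([0, 0, 1])
weights = np.array([1.0, 3.0, 2.0])
cluster_ids, means = weighted_group_mean(features, labels, weights)
print(means.ravel())  # [ 2.5 10. ]
```

Cluster 0 gives (1*1 + 3*3) / (1 + 3) = 2.5, and cluster 1 contains a single row, so its mean is that row's value.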

blow_up_features

Expands aggregated features back to per-vertex features using cluster labels.
blow_up_features(agg_features_df: pd.DataFrame, labels: np.ndarray) -> pd.DataFrame
agg_features_df
pd.DataFrame
Aggregated features indexed by cluster label
labels
np.ndarray
Cluster labels indicating which aggregated feature each vertex should receive
Returns: pd.DataFrame - Expanded features with one row per vertex
Example:
import pandas as pd
import numpy as np
from meshmash import blow_up_features

# Aggregated features for 5 clusters
agg_features = pd.DataFrame(np.random.rand(5, 10), index=[0, 1, 2, 3, 4])

# Labels for 100 vertices
labels = np.random.randint(0, 5, 100)

# Expand back to per-vertex features
vertex_features = blow_up_features(agg_features, labels)
print(vertex_features.shape)  # (100, 10)
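
Conceptually, the expansion is a row lookup: each vertex takes the row of its cluster. The NumPy sketch below shows that lookup with plain arrays in place of DataFrames. The variable names are illustrative, and it assumes `cluster_ids` is sorted and that every label appears in it.

```python
import numpy as np

# Aggregated values for three clusters (one row per cluster id).
cluster_ids = np.array([0, 1, 2])
agg_values = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Cluster label of each of four vertices.
labels = np.array([2, 0, 0, 1])

# Find the row position of each label, then gather those rows.
rows = np.searchsorted(cluster_ids, labels)
vertex_values = agg_values[rows]
print(vertex_values.shape)  # (4, 2)
```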

fix_split_labels

Fixes cluster labels across split meshes by creating unique labels for each submesh-label combination.
fix_split_labels(agg_labels, submesh_mapping)
agg_labels
np.ndarray
Array of cluster labels, potentially with multiple columns for different thresholds
submesh_mapping
np.ndarray
Array mapping each vertex to its submesh index
Returns: np.ndarray - Fixed cluster labels with globally unique values
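
One way to picture the relabeling: labels computed independently on each submesh restart from 0, so the pair (submesh index, local label) is what identifies a cluster. The sketch below assigns one integer per distinct pair. The helper `make_labels_unique` is hypothetical and may number clusters differently from `fix_split_labels`.

```python
import numpy as np


def make_labels_unique(agg_labels, submesh_mapping):
    """Hypothetical helper: one global id per (submesh, local label) pair."""
    agg_labels = np.asarray(agg_labels)
    submesh_mapping = np.asarray(submesh_mapping)
    # Treat a 1D input as a single threshold column.
    columns = agg_labels.reshape(len(agg_labels), -1)
    fixed = np.empty_like(columns)
    for k in range(columns.shape[1]):
        pairs = np.column_stack([submesh_mapping, columns[:, k]])
        # Each distinct row of `pairs` gets its own integer id.
        _, inverse = np.unique(pairs, axis=0, return_inverse=True)
        fixed[:, k] = inverse.ravel()
    return fixed.reshape(agg_labels.shape)


# Two submeshes that both use local labels 0 and 1.
local_labels = np.array([0, 1, 0, 1])
submesh_mapping = np.array([0, 0, 1, 1])
fixed = make_labels_unique(local_labels, submesh_mapping)
print(fixed)  # [0 1 2 3]
```

Vertices that shared a local label only by coincidence, because they sit in different submeshes, end up in different global clusters.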

fix_split_labels_and_features

Fixes both labels and features for split meshes, ensuring consistency between them.
fix_split_labels_and_features(agg_labels, submesh_mapping, features_by_submesh)
agg_labels
np.ndarray
1D array of cluster labels
submesh_mapping
np.ndarray
Array mapping vertices to submesh indices
features_by_submesh
list of pd.DataFrame
List of feature DataFrames, one per submesh
Returns: tuple - (fixed_labels, concatenated_features_df)
