Coreset sampling is a critical component of PatchCore: it reduces the memory bank from potentially millions of patch features to a manageable subset while preserving representation quality. This enables efficient storage and fast inference without sacrificing detection performance.
Why Coreset Sampling?
Without subsampling, the memory requirements would be prohibitive:
Example: for a dataset with 200 training images at 224×224 resolution:

- Patches per image: 56 × 56 = 3,136
- Total patches: 200 × 3,136 = 627,200
- With 1024-dim float32 features: ~2.4 GB of memory

After 10% coreset sampling:

- Sampled patches: ~62,720
- Memory: ~240 MB (a 10× reduction)
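The figures above can be verified with a quick back-of-envelope calculation, assuming float32 features (4 bytes per value):

```python
# Sanity check of the memory estimates above (float32 = 4 bytes per value).
patches_per_image = 56 * 56                       # 3,136 patches per image
total_patches = 200 * patches_per_image           # 627,200 patches total
full_bytes = total_patches * 1024 * 4             # 1024-dim float32 features
coreset_bytes = int(total_patches * 0.10) * 1024 * 4

print(f"full bank:   {full_bytes / 2**30:.2f} GiB")   # ~2.4 GiB
print(f"10% coreset: {coreset_bytes / 2**20:.0f} MiB")
```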
Sampling Strategies
PatchCore provides three sampling strategies:
- IdentitySampler: no sampling; uses all features
- RandomSampler: random subset selection
- GreedyCoresetSampler: intelligent diversity-based selection
IdentitySampler
The simplest sampler that returns all features unchanged:
```python
class IdentitySampler:
    def run(
        self, features: Union[torch.Tensor, np.ndarray]
    ) -> Union[torch.Tensor, np.ndarray]:
        return features
```
Use IdentitySampler for small datasets or when memory is not a constraint.
RandomSampler
Randomly selects a percentage of features:
```python
class RandomSampler(BaseSampler):
    def __init__(self, percentage: float):
        super().__init__(percentage)

    def run(
        self, features: Union[torch.Tensor, np.ndarray]
    ) -> Union[torch.Tensor, np.ndarray]:
        """Randomly samples input feature collection.

        Args:
            features: [N x D]
        """
        num_random_samples = int(len(features) * self.percentage)
        subset_indices = np.random.choice(
            len(features), num_random_samples, replace=False
        )
        subset_indices = np.array(subset_indices)
        return features[subset_indices]
```
Random sampling is fast but may miss important feature diversity. Use it as a baseline comparison.
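For intuition, the random baseline boils down to a single `np.random.choice` call; a minimal NumPy-only sketch on dummy features (shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 64))   # dummy [N x D] feature bank

# Same logic as RandomSampler.run with percentage = 0.1.
percentage = 0.10
num_samples = int(len(features) * percentage)
subset_indices = rng.choice(len(features), num_samples, replace=False)
subset = features[subset_indices]

print(subset.shape)  # (100, 64)
```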
GreedyCoresetSampler
The recommended sampling strategy that selects a diverse subset using a greedy farthest-point algorithm:
```python
class GreedyCoresetSampler(BaseSampler):
    def __init__(
        self,
        percentage: float,
        device: torch.device,
        dimension_to_project_features_to=128,
    ):
        """Greedy Coreset sampling base class."""
        super().__init__(percentage)
        self.device = device
        self.dimension_to_project_features_to = dimension_to_project_features_to
```
Dimensionality Reduction for Speed
To accelerate distance computations, features are projected to a lower dimension:
```python
def _reduce_features(self, features):
    if features.shape[1] == self.dimension_to_project_features_to:
        return features
    mapper = torch.nn.Linear(
        features.shape[1], self.dimension_to_project_features_to, bias=False
    )
    _ = mapper.to(self.device)
    features = features.to(self.device)
    return mapper(features)
```
The random projection is only used for distance computation during sampling. The original high-dimensional features are returned for the memory bank.
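A quick NumPy check (not PatchCore code) of why a random projection suffices here: pairwise distances are roughly preserved, and the greedy selection only ever compares distances, so absolute scale is irrelevant:

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 1024)).astype(np.float32)

# Random linear map to 128 dims, scaled so expected norms are preserved.
projection = rng.normal(size=(1024, 128)).astype(np.float32) / np.sqrt(128)
reduced = features @ projection

# Compare distances between random pairs in both spaces.
ia = rng.integers(0, len(features), 200)
ib = rng.integers(0, len(features), 200)
mask = ia != ib
d_high = np.linalg.norm(features[ia] - features[ib], axis=1)[mask]
d_low = np.linalg.norm(reduced[ia] - reduced[ib], axis=1)[mask]

ratio = d_low / d_high
print(ratio.mean())  # close to 1.0
```

The actual `_reduce_features` uses an unscaled `torch.nn.Linear`, which changes absolute distances, but argmax/min comparisons in the selection loop are unaffected.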
Core Algorithm
The greedy coreset selection algorithm:
```python
def _compute_greedy_coreset_indices(self, features: torch.Tensor) -> np.ndarray:
    """Runs iterative greedy coreset selection.

    Args:
        features: [NxD] input feature bank to sample.
    """
    distance_matrix = self._compute_batchwise_differences(features, features)
    coreset_anchor_distances = torch.norm(distance_matrix, dim=1)

    coreset_indices = []
    num_coreset_samples = int(len(features) * self.percentage)

    for _ in range(num_coreset_samples):
        select_idx = torch.argmax(coreset_anchor_distances).item()
        coreset_indices.append(select_idx)

        coreset_select_distance = distance_matrix[:, select_idx : select_idx + 1]
        coreset_anchor_distances = torch.cat(
            [coreset_anchor_distances.unsqueeze(-1), coreset_select_distance], dim=1
        )
        coreset_anchor_distances = torch.min(coreset_anchor_distances, dim=1).values

    return np.array(coreset_indices)
```
Algorithm Breakdown
1. Initialize: compute the full pairwise distance matrix between all features.
2. Select farthest point: pick the point with the maximum distance to the already selected points (initially, the largest row norm of the distance matrix).
3. Update distances: for each remaining point, compute its distance to the newly selected point and keep the minimum over all selected points.
4. Repeat: continue until the desired number of samples is reached.
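The four steps above can be sketched in a few lines of NumPy (toy data, not the PatchCore implementation):

```python
import numpy as np

def greedy_coreset(features: np.ndarray, n_samples: int) -> list:
    # 1. Initialize: full pairwise distance matrix.
    dist = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    # Initial anchor distances: norm of each row, as in the PyTorch code.
    anchor = np.linalg.norm(dist, axis=1)
    chosen = []
    for _ in range(n_samples):
        # 2. Select the point farthest from the current coreset.
        idx = int(np.argmax(anchor))
        chosen.append(idx)
        # 3. Update: each point keeps its distance to the nearest selected point.
        anchor = np.minimum(anchor, dist[:, idx])
    # 4. Repeat until n_samples are chosen.
    return chosen

# Three well-separated clusters of 20 points each.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
points = np.concatenate([c + 0.1 * rng.normal(size=(20, 2)) for c in centers])

picks = greedy_coreset(points, 3)
print(sorted(p // 20 for p in picks))  # [0, 1, 2]: one pick per cluster
```

With only three samples, farthest-point selection lands one pick in each cluster, which is exactly the coverage property the memory bank needs.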
Distance Computation
Efficient batched Euclidean distance calculation:
```python
@staticmethod
def _compute_batchwise_differences(
    matrix_a: torch.Tensor, matrix_b: torch.Tensor
) -> torch.Tensor:
    """Computes batchwise Euclidean distances using PyTorch."""
    a_times_a = matrix_a.unsqueeze(1).bmm(matrix_a.unsqueeze(2)).reshape(-1, 1)
    b_times_b = matrix_b.unsqueeze(1).bmm(matrix_b.unsqueeze(2)).reshape(1, -1)
    a_times_b = matrix_a.mm(matrix_b.T)
    return (-2 * a_times_b + a_times_a + b_times_b).clamp(0, None).sqrt()
```
This uses the identity ||a - b||² = ||a||² + ||b||² - 2⟨a, b⟩, so the full distance matrix is built from matrix products; the `clamp(0, None)` guards against small negative values caused by floating-point rounding.
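A quick NumPy sanity check of the same expansion (equivalent math, different library) against the direct computation:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(5, 8))
b = rng.normal(size=(7, 8))

a_sq = (a * a).sum(axis=1, keepdims=True)    # ||a||^2 as a (5, 1) column
b_sq = (b * b).sum(axis=1, keepdims=True).T  # ||b||^2 as a (1, 7) row
fast = np.sqrt(np.clip(a_sq + b_sq - 2 * (a @ b.T), 0, None))

# Direct (slow) pairwise computation for comparison.
slow = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
print(np.allclose(fast, slow))  # True
```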
ApproximateGreedyCoresetSampler
For very large feature sets, the approximate version trades some optimality for memory efficiency:
```python
class ApproximateGreedyCoresetSampler(GreedyCoresetSampler):
    def __init__(
        self,
        percentage: float,
        device: torch.device,
        number_of_starting_points: int = 10,
        dimension_to_project_features_to: int = 128,
    ):
        """Approximate Greedy Coreset sampling base class."""
        self.number_of_starting_points = number_of_starting_points
        super().__init__(percentage, device, dimension_to_project_features_to)
```
Approximate Algorithm
```python
def _compute_greedy_coreset_indices(self, features: torch.Tensor) -> np.ndarray:
    """Runs approximate iterative greedy coreset selection.

    This greedy coreset implementation does not require computation of the
    full N x N distance matrix and thus requires a lot less memory, however
    at the cost of increased sampling times.

    Args:
        features: [NxD] input feature bank to sample.
    """
    number_of_starting_points = np.clip(
        self.number_of_starting_points, None, len(features)
    )
    start_points = np.random.choice(
        len(features), number_of_starting_points, replace=False
    ).tolist()

    approximate_distance_matrix = self._compute_batchwise_differences(
        features, features[start_points]
    )
    approximate_coreset_anchor_distances = torch.mean(
        approximate_distance_matrix, axis=-1
    ).reshape(-1, 1)

    coreset_indices = []
    num_coreset_samples = int(len(features) * self.percentage)

    with torch.no_grad():
        for _ in tqdm.tqdm(range(num_coreset_samples), desc="Subsampling..."):
            select_idx = torch.argmax(approximate_coreset_anchor_distances).item()
            coreset_indices.append(select_idx)

            coreset_select_distance = self._compute_batchwise_differences(
                features, features[select_idx : select_idx + 1]
            )
            approximate_coreset_anchor_distances = torch.cat(
                [approximate_coreset_anchor_distances, coreset_select_distance],
                dim=-1,
            )
            approximate_coreset_anchor_distances = torch.min(
                approximate_coreset_anchor_distances, dim=1
            ).values.reshape(-1, 1)

    return np.array(coreset_indices)
```
Key Differences
Instead of computing the full N×N distance matrix, it:

- Initializes with the mean distance to a small random set of starting points
- Computes one new N-vector of distances per iteration as each point is selected
- Keeps only the running minimum distance per point: O(N) working memory instead of O(N²)

Cost comparison (N features, k selected samples):

- GreedyCoresetSampler: one O(N²) distance computation up front, then cheap O(N) updates per selection
- ApproximateGreedyCoresetSampler: k incremental O(N) distance computations, i.e. O(N × k) total

As the docstring notes, the approximate version can take longer in wall-clock time (one distance computation per iteration rather than a single batched one), but it never materializes the O(N²) matrix.
The approximate version uses mean distances to random starting points for initialization, which may not find the globally optimal coreset but works well in practice.
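The incremental bookkeeping can be sketched in NumPy (toy code; names are illustrative, not PatchCore's):

```python
import numpy as np

def approx_greedy_coreset(features, n_samples, n_start=10, seed=0):
    rng = np.random.default_rng(seed)
    start = rng.choice(len(features), min(n_start, len(features)), replace=False)
    # Initialize with the mean distance to a few random starting points:
    # an (N, n_start) matrix, never the full N x N one.
    anchor = np.linalg.norm(
        features[:, None, :] - features[start][None, :, :], axis=-1
    ).mean(axis=1)
    chosen = []
    for _ in range(n_samples):
        idx = int(np.argmax(anchor))
        chosen.append(idx)
        # One N-vector of distances per iteration: O(N) working memory.
        d_new = np.linalg.norm(features - features[idx], axis=1)
        anchor = np.minimum(anchor, d_new)
    return chosen

rng = np.random.default_rng(1)
points = rng.normal(size=(200, 16))
picks = approx_greedy_coreset(points, 20)
print(len(set(picks)))  # 20 distinct indices
```

A selected point's anchor distance drops to zero after its own update, so the loop never re-picks it while unselected points remain.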
Usage in PatchCore
Coreset sampling is applied during the training phase:
```python
def _fill_memory_bank(self, input_data):
    """Computes and sets the support features for SPADE."""
    _ = self.forward_modules.eval()

    def _image_to_features(input_image):
        with torch.no_grad():
            input_image = input_image.to(torch.float).to(self.device)
            return self._embed(input_image)

    features = []
    with tqdm.tqdm(
        input_data, desc="Computing support features...", position=1, leave=False
    ) as data_iterator:
        for image in data_iterator:
            if isinstance(image, dict):
                image = image["image"]
            features.append(_image_to_features(image))

    features = np.concatenate(features, axis=0)
    features = self.featuresampler.run(features)  # <- Coreset sampling happens here
    self.anomaly_scorer.fit(detection_features=[features])
```
Configuration Examples
Standard (Recommended)

```shell
python bin/run_patchcore.py \
    patch_core ... \
    sampler -p 0.1 approx_greedy_coreset \
    dataset ...
```

- Percentage: 10% (0.1)
- Method: approximate greedy coreset
- Best for: most use cases with >50k patches

High Performance

```shell
python bin/run_patchcore.py \
    patch_core ... \
    sampler -p 0.01 approx_greedy_coreset \
    dataset ...
```

- Percentage: 1% (0.01)
- Method: approximate greedy coreset
- Best for: very large datasets, fastest inference

No Sampling

```python
featuresampler = patchcore.sampler.IdentitySampler()
```

- Percentage: 100%
- Method: identity (no sampling)
- Best for: small datasets (under 10k patches)

Random Baseline

```python
featuresampler = patchcore.sampler.RandomSampler(percentage=0.1)
```

- Percentage: 10%
- Method: random sampling
- Best for: baseline comparisons
| Sampling %         | Memory | Inference Speed | AUROC (typical) |
|--------------------|--------|-----------------|-----------------|
| 100% (no sampling) | 2.4 GB | 1×              | 99.2%           |
| 10% (greedy)       | 240 MB | 8×              | 99.1%           |
| 1% (greedy)        | 24 MB  | 50×             | 98.5%           |
| 10% (random)       | 240 MB | 8×              | 97.8%           |
Finding: greedy coreset at 10% typically costs only about 0.1% AUROC while cutting memory 10× and speeding up inference roughly 8×.
Visualization: Coreset Coverage
The greedy algorithm ensures good coverage of the feature space:
Feature Space (t-SNE projection):

```
○ = all training patches    ● = selected coreset samples

● ○ ○ ○ ● ○
○ ○ ○ ○ ○ ○
○ ● ○ ○ ● ○
○ ○ ○ ● ○ ○
○ ○ ○ ○ ○ ●
○ ● ○ ○ ○ ○
○ ○ ● ○ ○ ○
○ ○ ○ ●
```

Note: coreset samples (●) are spread across the feature space rather than concentrated in one region.
Implementation Tips
GPU Acceleration: for datasets with >100k patches, ensure coreset sampling runs on the GPU by passing the correct device:

```python
sampler = ApproximateGreedyCoresetSampler(
    percentage=0.1,
    device=torch.device("cuda"),  # run on GPU
    dimension_to_project_features_to=128,
)
```

Percentage Validation: the BaseSampler enforces 0 < percentage < 1:

```python
if not 0 < percentage < 1:
    raise ValueError("Percentage value not in (0, 1).")
```

Use decimals (0.1 for 10%), not percentages (10).
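To see the guard in action, here is a hypothetical minimal `BaseSampler` containing only the validation shown above (the real class has more to it):

```python
class BaseSampler:
    def __init__(self, percentage: float):
        # Reject values outside the open interval (0, 1).
        if not 0 < percentage < 1:
            raise ValueError("Percentage value not in (0, 1).")
        self.percentage = percentage

BaseSampler(0.1)         # OK: 10% as a fraction
try:
    BaseSampler(10)      # common mistake: passing a percentage, not a fraction
except ValueError as err:
    print(err)           # Percentage value not in (0, 1).
```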
Next Steps
- Anomaly Scoring: learn how the coreset memory bank is used for scoring
- PatchCore Algorithm: see how coreset sampling fits into the full pipeline