Overview
The common module provides the essential building blocks for the PatchCore algorithm, including feature extraction from backbone networks, nearest neighbor search implementations, feature aggregation, and anomaly scoring mechanisms.
Classes
NetworkFeatureAggregator
Efficient extraction of intermediate network features from specified layers of a backbone model.
Constructor Parameters
Torchvision model or custom PyTorch module to extract features from.
List of layer names to extract features from. Supports dot notation for nested modules (e.g., “layer2.3” or “layer3.0”).
Device to run feature extraction on (“cpu”, “cuda”, or torch.device object).
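The dot notation for layer names can be read as a path through nested submodules. The following is a minimal sketch of that lookup, assuming numeric path parts index into a sequence-like container and all other parts are attribute names. `resolve_layer` and the toy `backbone` object are hypothetical stand-ins for illustration, not the module's actual code.

```python
from types import SimpleNamespace


def resolve_layer(root, dotted_name):
    """Walk `root` along a dotted path such as "layer2.3".

    Hypothetical helper: numeric parts index into a sequence,
    all other parts are looked up as attributes.
    """
    node = root
    for part in dotted_name.split("."):
        if part.isdigit():
            node = node[int(part)]
        else:
            node = getattr(node, part)
    return node


# Toy stand-in for a backbone whose stages each hold a list of blocks.
backbone = SimpleNamespace(
    layer2=["l2_block0", "l2_block1", "l2_block2", "l2_block3"],
    layer3=["l3_block0", "l3_block1"],
)

selected = resolve_layer(backbone, "layer2.3")
print(selected)  # l2_block3
```

With a real PyTorch model the same walk would land on a submodule rather than a string, since `nn.Sequential` containers accept integer indices.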
Methods
forward(images)
Extracts features from the specified layers for the input images.
Input tensor of shape (batch_size, 3, H, W).
Dictionary mapping layer names to their feature tensors. Each feature tensor has shape (batch_size, channels, height, width).
feature_dimensions(input_shape)
Computes the feature channel dimensions for all extraction layers given an input shape.
Shape of input images as (channels, height, width), e.g., (3, 224, 224).
List of channel dimensions for each layer in layers_to_extract_from.
NearestNeighbourScorer
Anomaly scoring class based on nearest neighbor distances in feature space.
Constructor Parameters
Number of nearest neighbors to consider when computing anomaly scores.
Nearest neighbor search method instance. Options:
- FaissNN(on_gpu, num_workers): Exact search
- ApproximateFaissNN(on_gpu, num_workers): Approximate search using IVF-PQ
Methods
fit(detection_features)
Fits the scorer on training features by building a nearest neighbor search index.
List of feature arrays, where each array has shape (num_samples, feature_dim). Features from multiple layers are concatenated internally.
predict(query_features)
Computes anomaly scores for query samples by finding nearest neighbors in the training set.
List of feature arrays for test samples, matching the format used in fit().
Tuple of (anomaly_scores, query_distances, query_nns):
- anomaly_scores (np.ndarray): Mean distance to the k nearest neighbors, shape (num_samples,)
- query_distances (np.ndarray): Distances to each of the k neighbors, shape (num_samples, k)
- query_nns (np.ndarray): Indices of the nearest neighbors, shape (num_samples, k)
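The relationship between the three return values can be illustrated with a brute-force NumPy sketch. It stands in for the FAISS-backed search and is not the scorer's implementation: `knn_anomaly_scores` is a hypothetical name, and the use of squared Euclidean distance is an assumption made for the example.

```python
import numpy as np


def knn_anomaly_scores(train_features, query_features, k):
    """Brute-force stand-in for the nearest neighbour scorer.

    Both arguments are lists of arrays of shape (num_samples, feature_dim_i),
    one array per layer, concatenated along the feature axis.
    """
    train = np.concatenate(train_features, axis=1)
    query = np.concatenate(query_features, axis=1)

    # Pairwise squared Euclidean distances, shape (num_queries, num_train).
    # Assumption: the metric is chosen for illustration only.
    diff = query[:, None, :] - train[None, :, :]
    dists = np.sum(diff ** 2, axis=-1)

    # Indices and distances of the k closest training samples per query.
    query_nns = np.argsort(dists, axis=1)[:, :k]
    query_distances = np.take_along_axis(dists, query_nns, axis=1)

    # Anomaly score: mean distance to the k nearest neighbours.
    anomaly_scores = query_distances.mean(axis=1)
    return anomaly_scores, query_distances, query_nns


train = [np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])]
query = [np.array([[0.0, 0.0], [3.0, 4.0]])]
scores, distances, nns = knn_anomaly_scores(train, query, k=2)
print(scores)  # [ 0.5 19. ]
```

The second query lies far from every training sample, so its mean neighbour distance, and therefore its anomaly score, is much larger.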
save(save_folder, save_features_separately=False, prepend="")
Saves the scorer state to disk.
Directory to save the scorer files.
If True, saves detection features as a separate pickle file.
Prefix string for saved filenames.
load(load_folder, prepend="")
Loads a previously saved scorer from disk.
Directory containing saved scorer files.
Prefix string used when saving the files.
FaissNN
Exact nearest neighbor search using FAISS with optional GPU acceleration.
Constructor Parameters
If True, runs nearest neighbor searches on GPU for faster computation.
Number of CPU threads to use for FAISS operations.
Methods
fit(features)
Builds the search index from training features.
Training features of shape (num_samples, feature_dim).
run(n_nearest_neighbours, query_features, index_features=None)
Performs nearest neighbor search.
Number of nearest neighbors to retrieve.
Query features of shape (num_queries, feature_dim).
If provided, builds a temporary index from these features instead of using the fitted index.
Tuple of (distances, indices):
- distances (np.ndarray): Shape (num_queries, n_nearest_neighbours)
- indices (np.ndarray): Shape (num_queries, n_nearest_neighbours)
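The effect of the optional index_features argument can be sketched with a small NumPy class that mirrors the fit/run/reset_index interface described here. `BruteForceNN` is a hypothetical stand-in that searches exhaustively instead of calling FAISS, and it assumes squared Euclidean distance.

```python
import numpy as np


class BruteForceNN:
    """Hypothetical NumPy stand-in mirroring the fit/run interface."""

    def __init__(self):
        self.index_features = None

    def fit(self, features):
        self.index_features = np.asarray(features, dtype=np.float32)

    def run(self, n_nearest_neighbours, query_features, index_features=None):
        # Use the fitted index unless a temporary one is supplied.
        if index_features is None:
            if self.index_features is None:
                raise RuntimeError("call fit() first or pass index_features")
            reference = self.index_features
        else:
            reference = np.asarray(index_features, dtype=np.float32)
        query = np.asarray(query_features, dtype=np.float32)
        dists = ((query[:, None, :] - reference[None, :, :]) ** 2).sum(axis=-1)
        indices = np.argsort(dists, axis=1)[:, :n_nearest_neighbours]
        distances = np.take_along_axis(dists, indices, axis=1)
        return distances, indices

    def reset_index(self):
        self.index_features = None


nn = BruteForceNN()
nn.fit([[0.0, 0.0], [10.0, 10.0]])

# Search against the fitted index.
fitted_dist, fitted_idx = nn.run(1, [[1.0, 1.0]])

# Search against a temporary index; the fitted one is left untouched.
temp_dist, temp_idx = nn.run(1, [[1.0, 1.0]], index_features=[[9.0, 9.0], [1.0, 1.0]])
```

In this sketch the temporary index is discarded after the call, so a later `run` without index_features again searches the fitted data.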
save(filename) / load(filename)
Save or load the FAISS index.
reset_index()
Resets the search index, freeing memory.
ApproximateFaissNN
Approximate nearest neighbor search using FAISS IVF-PQ for large-scale datasets.
Extends FaissNN and uses:
- IndexIVFPQ: Inverted file index with product quantization
- 512 centroids for coarse quantization
- 64 sub-quantizers with 8 bits per code
Approximate search is faster but less accurate than exact search. Best for datasets with >100k samples.
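The storage side of this trade-off follows from the listed settings: 64 sub-quantizers at 8 bits each encode every vector in 64 bytes. The arithmetic below is a rough sketch that ignores the per-vector IDs, the coarse centroids, and the codebooks. The feature dimension of 1024 is a hypothetical value chosen for illustration. Product quantization in FAISS requires the dimension to be divisible by the number of sub-quantizers.

```python
def pq_code_bytes(num_subquantizers=64, bits_per_code=8):
    """Bytes needed to store one product-quantized vector."""
    return num_subquantizers * bits_per_code // 8


def raw_float32_bytes(feature_dim):
    """Bytes needed to store one uncompressed float32 vector."""
    return feature_dim * 4


feature_dim = 1024  # hypothetical; must be divisible by 64 sub-quantizers
assert feature_dim % 64 == 0

compressed = pq_code_bytes()                    # 64 bytes per vector
uncompressed = raw_float32_bytes(feature_dim)   # 4096 bytes per vector
ratio = uncompressed // compressed              # 64x smaller

print(compressed, uncompressed, ratio)  # 64 4096 64
```

The compression is lossy, which is where the reduced accuracy relative to exact search comes from.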
Preprocessing
Neural network module for preprocessing features from multiple layers to a common dimension.
Constructor Parameters
List of input feature dimensions from different layers.
Target dimension for all features after preprocessing.
Aggregator
Aggregates multi-layer features into a single feature vector using adaptive average pooling.
Constructor Parameters
Target dimension for the aggregated features.
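Adaptive average pooling maps a vector of any length onto a fixed number of bins by averaging contiguous windows. The NumPy sketch below shows the idea on a single sample. It assumes that each layer's vector is first pooled to a common dimension and that the concatenated result is then pooled to the target dimension. The function name and the toy inputs are hypothetical, and the real classes operate on batched PyTorch tensors.

```python
import math

import numpy as np


def adaptive_avg_pool1d(vector, output_size):
    """Average-pool a 1-D array into `output_size` bins.

    Bin i covers indices floor(i * L / out) up to ceil((i + 1) * L / out),
    the usual adaptive-pooling window rule.
    """
    length = vector.shape[0]
    pooled = np.empty(output_size, dtype=np.float64)
    for i in range(output_size):
        start = (i * length) // output_size
        end = math.ceil((i + 1) * length / output_size)
        pooled[i] = vector[start:end].mean()
    return pooled


# Two layers with different feature dimensions (6 and 4).
layer_a = np.arange(6.0)                       # [0, 1, 2, 3, 4, 5]
layer_b = np.array([10.0, 20.0, 30.0, 40.0])

# Step 1 (assumed preprocessing): bring both layers to a common dimension of 2.
common = [adaptive_avg_pool1d(v, 2) for v in (layer_a, layer_b)]

# Step 2 (assumed aggregation): pool the concatenation to the target dimension.
aggregated = adaptive_avg_pool1d(np.concatenate(common), 2)
print(aggregated)  # [ 2.5 25. ]
```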
RescaleSegmentor
Converts patch-level anomaly scores to pixel-level segmentation maps.
Constructor Parameters
Device for tensor operations.
Target size for upsampling the segmentation maps (square).
Methods
convert_to_segmentation(patch_scores)
Patch-level anomaly scores to convert to segmentation maps.
List of smoothed segmentation maps, one per image in the batch. Each map has shape (target_size, target_size).
The conversion applies two steps to each map:
- Bilinear upsampling to target size
- Gaussian smoothing with sigma=4
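The upsampling and smoothing steps can be sketched in plain NumPy as follows. This is an illustration under stated assumptions, not the class's implementation: the helper names are hypothetical, the interpolation uses pixel-centre alignment, and the blur uses edge-replicating padding with a kernel truncated at four standard deviations. A library routine may differ in these boundary details.

```python
import numpy as np


def bilinear_resize(image, size):
    """Resize a 2-D array to (size, size) with bilinear interpolation."""
    h, w = image.shape
    # Map output pixel centres back to input coordinates.
    ys = np.clip((np.arange(size) + 0.5) * h / size - 0.5, 0, h - 1)
    xs = np.clip((np.arange(size) + 0.5) * w / size - 0.5, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = image[y0][:, x0] * (1 - wx) + image[y0][:, x1] * wx
    bottom = image[y1][:, x0] * (1 - wx) + image[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy


def gaussian_smooth(image, sigma=4.0):
    """Separable Gaussian blur with edge-replicating padding."""
    radius = int(4 * sigma + 0.5)
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(image, radius, mode="edge")
    # Blur along rows, then columns; "valid" trims the padding again.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def convert_to_segmentation(patch_scores, target_size, sigma=4.0):
    """Upsample each patch-score map, then smooth it."""
    return [
        gaussian_smooth(
            bilinear_resize(np.asarray(scores, dtype=float), target_size), sigma
        )
        for scores in patch_scores
    ]


patch_scores = np.zeros((2, 4, 4))
patch_scores[0, 1, 2] = 1.0  # one anomalous patch in the first image
masks = convert_to_segmentation(patch_scores, target_size=32)
print(len(masks), masks[0].shape)  # 2 (32, 32)
```

Smoothing after upsampling spreads the response of a single high-scoring patch over neighbouring pixels, which softens the blocky patch boundaries in the final map.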
