
Overview

The run_patchcore.py script is the primary CLI tool for training PatchCore models on industrial anomaly detection datasets. It uses a command-chaining architecture: the main command is followed by three required subcommands, patch_core, sampler, and dataset, each with its own options.

Command Structure

python bin/run_patchcore.py [MAIN_OPTIONS] RESULTS_PATH \
  patch_core [PATCHCORE_OPTIONS] \
  sampler [SAMPLER_OPTIONS] SAMPLER_NAME \
  dataset [DATASET_OPTIONS] DATASET_NAME DATA_PATH

Main Command

Arguments

results_path
string
required
Path where training results, models, and evaluation metrics will be saved

Options

--gpu
int
default:"0"
GPU device ID(s) to use for training. Can specify multiple GPUs by repeating the flag.
--gpu 0 --gpu 1  # Use GPUs 0 and 1
--seed
int
default:"0"
Random seed for reproducibility across training runs
--log_group
string
default:"group"
Logging group name for organizing experiment results
--log_project
string
default:"project"
Logging project name for organizing experiment results
--save_segmentation_images
flag
Save visualization images showing anomaly segmentation results. Useful for qualitative analysis.
--save_patchcore_model
flag
Save trained PatchCore model to disk for later evaluation or deployment

Subcommand: patch_core

Configures the PatchCore model architecture and training parameters.

Backbone Options

--backbone_names
string
Short flag: -b
Pretrained backbone network(s) to extract features. Supports multiple backbones for ensemble models.
Supported backbones:
  • wideresnet50 - Wide ResNet-50
  • wideresnet101 - Wide ResNet-101
  • resnext101 - ResNeXt-101
  • densenet201 - DenseNet-201
-b wideresnet50  # Single backbone
-b wideresnet101 -b resnext101 -b densenet201  # Ensemble
--layers_to_extract_from
string
Short flag: -le
Layer names from which to extract feature embeddings. For ensemble models with multiple backbones, prefix with the backbone index (e.g., 0.layer2, 1.layer2).
# Single backbone
-le layer2 -le layer3

# Ensemble (3 backbones)
-le 0.layer2 -le 0.layer3 \
-le 1.layer2 -le 1.layer3 \
-le 2.features.denseblock2 -le 2.features.denseblock3

Embedding Dimensions

--pretrain_embed_dimension
int
default:"1024"
Dimensionality of pretrained backbone features before projection
--target_embed_dimension
int
default:"1024"
Target dimensionality after projection. Lower values reduce memory usage.

Processing Options

--preprocessing
choice
default:"mean"
Feature preprocessing method. Choices: mean, conv
--aggregation
choice
default:"mean"
Feature aggregation method. Choices: mean, mlp

Patch Parameters

--patchsize
int
default:"3"
Size of the neighborhood patch for anomaly scoring. Larger values provide more spatial context.
  • Detection tasks: typically 3
  • Segmentation tasks: typically 5
--patchscore
string
default:"max"
Method for computing patch-level anomaly scores
--patchoverlap
float
default:"0.0"
Overlap ratio between adjacent patches (0.0 = no overlap, 0.5 = 50% overlap)
--patchsize_aggregate
int
Short flag: -pa
Additional patch sizes for multi-scale aggregation
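To make the patch parameters above concrete, here is a minimal NumPy sketch (not the repository's implementation) of gathering a patchsize x patchsize neighborhood around every location of a feature map and mean-aggregating each neighborhood into one feature vector; the function name and toy shapes are illustrative only.

```python
import numpy as np

def extract_patches(fmap, patchsize=3):
    """Gather a patchsize x patchsize neighborhood around every spatial
    location of a (C, H, W) feature map, zero-padding at the border.
    Returns an array of shape (H*W, C, patchsize, patchsize)."""
    c, h, w = fmap.shape
    pad = patchsize // 2
    padded = np.pad(fmap, ((0, 0), (pad, pad), (pad, pad)))
    patches = np.empty((h * w, c, patchsize, patchsize))
    for i in range(h):
        for j in range(w):
            patches[i * w + j] = padded[:, i:i + patchsize, j:j + patchsize]
    return patches

fmap = np.random.rand(8, 4, 4)           # toy feature map: 8 channels, 4x4 grid
patches = extract_patches(fmap, patchsize=3)
features = patches.mean(axis=(2, 3))     # mean-aggregate each patch to a vector
print(patches.shape, features.shape)     # (16, 8, 3, 3) (16, 8)
```

A larger --patchsize simply widens this neighborhood, pulling in more spatial context per location.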

Anomaly Scoring

--anomaly_scorer_num_nn
int
default:"5"
Number of nearest neighbors (k-NN) used for anomaly scoring
  • Detection: typically 1
  • Segmentation: typically 3 or 5
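As a sketch of what --anomaly_scorer_num_nn controls: a query feature is scored by its distance to the k nearest entries of the memory bank of nominal training features. The snippet below is a brute-force illustration only (the script itself delegates the search to FAISS), and all names are hypothetical.

```python
import numpy as np

def knn_anomaly_score(query, memory_bank, k=5):
    """Score a feature vector by its mean distance to the k nearest
    memory-bank entries: far from all training features => anomalous."""
    dists = np.linalg.norm(memory_bank - query, axis=1)
    return np.sort(dists)[:k].mean()

rng = np.random.default_rng(0)
bank = rng.normal(0.0, 1.0, size=(500, 64))   # nominal training features
normal = rng.normal(0.0, 1.0, size=64)        # in-distribution query
outlier = np.full(64, 8.0)                    # far-off query
print(knn_anomaly_score(normal, bank, k=5) < knn_anomaly_score(outlier, bank, k=5))  # True
```

Smaller k (typically 1 for detection) makes the score depend only on the single closest training feature; larger k smooths the score for segmentation.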

FAISS Options

--faiss_on_gpu
flag
Use GPU-accelerated FAISS for nearest neighbor search. Significantly speeds up inference.
--faiss_num_workers
int
default:"8"
Number of CPU workers for FAISS operations

Subcommand: sampler

Configures coreset sampling strategy for reducing memory bank size.

Arguments

name
string
required
Sampler type to use. Options:
  • identity - Use all training features (no sampling)
  • greedy_coreset - Greedy coreset selection
  • approx_greedy_coreset - Approximate greedy coreset (faster)

Options

--percentage
float
default:"0.1"
Short flag: -p
Percentage of training features to retain in the memory bank
  • 0.1 = 10% (faster, less memory)
  • 0.01 = 1% (even faster, minimal accuracy loss)
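The idea behind greedy coreset selection can be sketched as farthest-point sampling: repeatedly add the feature farthest from everything selected so far, until the target fraction is kept. This is a simplified NumPy illustration, not the repository's sampler (and the approx_greedy_coreset variant additionally speeds this up, e.g. by working in a reduced dimension); the function name is hypothetical.

```python
import numpy as np

def greedy_coreset(features, percentage=0.1, seed=0):
    """Farthest-point-style greedy coreset: keep a subset whose members
    are spread out so they cover the full feature distribution."""
    n = len(features)
    n_keep = max(1, int(n * percentage))
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(n))]
    # distance from every point to its nearest selected point so far
    min_dist = np.linalg.norm(features - features[selected[0]], axis=1)
    while len(selected) < n_keep:
        idx = int(np.argmax(min_dist))        # farthest from current coreset
        selected.append(idx)
        min_dist = np.minimum(min_dist, np.linalg.norm(features - features[idx], axis=1))
    return np.array(selected)

feats = np.random.default_rng(1).normal(size=(200, 16))
coreset = greedy_coreset(feats, percentage=0.1)
print(len(coreset))  # 20
```

Lowering -p shrinks the memory bank (and k-NN search time) while the spread-out selection preserves coverage of the nominal feature space.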

Subcommand: dataset

Configures dataset loading and preprocessing.

Arguments

name
string
required
Dataset type. Currently supported: mvtec
data_path
path
required
Path to the dataset root directory. Must exist.

Options

--subdatasets
string
required
Short flag: -d
Dataset categories to train on. For MVTec AD, this includes: bottle, cable, capsule, carpet, grid, hazelnut, leather, metal_nut, pill, screw, tile, toothbrush, transistor, wood, zipper
-d bottle -d cable -d capsule  # Train on specific categories
--train_val_split
float
default:"1.0"
Fraction of training data to use for training (remainder used for validation)
  • 1.0 = use all training data
  • 0.8 = 80% train, 20% validation
--batch_size
int
default:"2"
Batch size for data loading
--num_workers
int
default:"8"
Number of worker processes for data loading
--resize
int
default:"256"
Image resize dimension (before center cropping)
--imagesize
int
default:"224"
Final image size after center cropping
  • 224 - Standard ImageNet size
  • 320 - Higher resolution for better segmentation
--augment
flag
Apply data augmentation during training
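Assuming the usual resize-then-center-crop pipeline (resize to --resize, then take a centered --imagesize crop), the crop-box arithmetic works out as in this small sketch; the helper name is illustrative.

```python
def center_crop_box(resize, imagesize):
    """Corners (left, top, right, bottom) of a centered imagesize x imagesize
    crop taken from a resize x resize image (assumes imagesize <= resize)."""
    offset = (resize - imagesize) // 2
    return (offset, offset, offset + imagesize, offset + imagesize)

print(center_crop_box(256, 224))  # (16, 16, 240, 240)
print(center_crop_box(366, 320))  # (23, 23, 343, 343)
```

This is why the high-resolution example below pairs --resize 366 with --imagesize 320: the same border fraction is trimmed as in the standard 256/224 setting.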

Examples

Basic Training (Single Dataset)

python bin/run_patchcore.py \
  --gpu 0 \
  --seed 0 \
  --save_patchcore_model \
  results/my_experiment \
  patch_core \
    -b wideresnet50 \
    -le layer2 -le layer3 \
    --faiss_on_gpu \
    --pretrain_embed_dimension 1024 \
    --target_embed_dimension 1024 \
    --anomaly_scorer_num_nn 1 \
    --patchsize 3 \
  sampler -p 0.1 approx_greedy_coreset \
  dataset \
    --resize 256 \
    --imagesize 224 \
    -d bottle \
    mvtec /path/to/mvtec

Baseline Detection Model (IM224)

Wide ResNet-50, 10% coreset, optimized for detection:
python bin/run_patchcore.py \
  --gpu 0 \
  --seed 0 \
  --save_patchcore_model \
  --log_group IM224_WR50_L2-3_P01_D1024-1024_PS-3_AN-1_S0 \
  --log_project MVTecAD_Results \
  results \
  patch_core \
    -b wideresnet50 \
    -le layer2 -le layer3 \
    --faiss_on_gpu \
    --pretrain_embed_dimension 1024 \
    --target_embed_dimension 1024 \
    --anomaly_scorer_num_nn 1 \
    --patchsize 3 \
  sampler -p 0.1 approx_greedy_coreset \
  dataset \
    --resize 256 \
    --imagesize 224 \
    -d bottle -d cable -d capsule \
    mvtec /path/to/mvtec
Expected Performance: Instance AUROC: 0.992, Pixelwise AUROC: 0.981

Ensemble Model (IM224)

Multiple backbones for improved accuracy:
python bin/run_patchcore.py \
  --gpu 0 \
  --seed 3 \
  --save_patchcore_model \
  --log_group IM224_Ensemble_L2-3_P001_D1024-384_PS-3_AN-1_S3 \
  results \
  patch_core \
    -b wideresnet101 -b resnext101 -b densenet201 \
    -le 0.layer2 -le 0.layer3 \
    -le 1.layer2 -le 1.layer3 \
    -le 2.features.denseblock2 -le 2.features.denseblock3 \
    --faiss_on_gpu \
    --pretrain_embed_dimension 1024 \
    --target_embed_dimension 384 \
    --anomaly_scorer_num_nn 1 \
    --patchsize 3 \
  sampler -p 0.01 approx_greedy_coreset \
  dataset \
    --resize 256 \
    --imagesize 224 \
    -d bottle -d cable -d capsule \
    mvtec /path/to/mvtec
Expected Performance: Instance AUROC: 0.993, Pixelwise AUROC: 0.981

High-Resolution Segmentation (IM320)

Optimized for pixel-level anomaly segmentation:
python bin/run_patchcore.py \
  --gpu 0 \
  --seed 22 \
  --save_patchcore_model \
  --save_segmentation_images \
  results \
  patch_core \
    -b wideresnet50 \
    -le layer2 -le layer3 \
    --faiss_on_gpu \
    --pretrain_embed_dimension 1024 \
    --target_embed_dimension 1024 \
    --anomaly_scorer_num_nn 3 \
    --patchsize 5 \
  sampler -p 0.01 approx_greedy_coreset \
  dataset \
    --resize 366 \
    --imagesize 320 \
    -d bottle \
    mvtec /path/to/mvtec
Expected Performance: Instance AUROC: 0.99, Pixelwise AUROC: 0.984

Output Files

The script creates the following directory structure:
results/
└── [log_project]/
    └── [log_group]/
        ├── models/
        │   └── mvtec_[category]/
        │       ├── *.faiss           # Nearest neighbor index
        │       └── *.pkl             # Model parameters
        ├── segmentation_images/      # (if --save_segmentation_images)
        │   └── mvtec_[category]/
        └── results.csv               # Evaluation metrics

Metrics Computed

The script automatically computes:
  • Instance AUROC: Image-level anomaly detection accuracy
  • Full Pixel AUROC: Pixel-level segmentation accuracy (all images)
  • Anomaly Pixel AUROC: Pixel-level segmentation accuracy (anomalous images only)
All metrics are saved to results.csv with mean scores across all datasets.
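For reference, AUROC can be read as the probability that a randomly chosen anomalous sample scores higher than a randomly chosen normal one (the Mann-Whitney U statistic). The sketch below illustrates that definition for a handful of scores; it is not the metric code the script uses.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (anomalous, normal) pairs in which the
    anomalous sample scores higher; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [0.9, 0.8, 0.3, 0.2]   # anomaly scores
labels = [1, 1, 0, 0]           # 1 = anomalous, 0 = normal
print(auroc(scores, labels))    # 1.0 (perfect separation)
```

The same computation applies at image level (Instance AUROC) and at pixel level (the two pixelwise AUROC variants).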

Tips

For faster training: Use --faiss_on_gpu and lower coreset percentage (-p 0.01)
For better segmentation: Increase --imagesize to 320, --patchsize to 5, and --anomaly_scorer_num_nn to 3-5
For ensemble models: Use different backbones and prefix layer names with backbone index (e.g., 0.layer2, 1.layer2)
