Overview
The backbones module provides access to a wide variety of pre-trained neural network architectures for feature extraction in PatchCore. These backbones are loaded with ImageNet pre-trained weights.
Functions
load(name)
Parameters
name: Name of the backbone architecture to load. See Available Backbones for the complete list.
Returns
Pre-trained PyTorch model loaded with ImageNet weights.
Example
Available Backbones
The following pre-trained backbones are available:

ResNet Family
- ResNet-50 architecture from torchvision
- ResNet-101 architecture from torchvision
- ResNet-200 architecture from timm
- ResNeXt-101 (32x8d) architecture from torchvision
- ResNeSt-50 (4s2x40d) architecture from timm
Wide ResNet Family
- Wide ResNet-50-2 architecture from torchvision. Recommended for PatchCore.
- Wide ResNet-101-2 architecture from torchvision
ResNetV2 (BiT) Family
- ResNetV2-50x3 trained with Big Transfer (BiT) on ImageNet-1k
- ResNetV2-50x3 trained with BiT on ImageNet-21k
- ResNetV2-101x3 trained with BiT on ImageNet-1k
- ResNetV2-101x3 trained with BiT on ImageNet-21k
- ResNetV2-101 standard architecture
- ResNetV2-152x4 trained with BiT on ImageNet-1k
- ResNetV2-152x4 trained with BiT on ImageNet-21k
- ResNetV2-152x2 teacher model with 384x384 input resolution
VGG Family
- VGG-11 architecture from torchvision
- VGG-19 architecture from torchvision
- VGG-19 with batch normalization from torchvision
AlexNet
- AlexNet architecture from torchvision
DenseNet Family
- DenseNet-121 architecture from timm
- DenseNet-201 architecture from timm
EfficientNet Family
- EfficientNet-B1 (TensorFlow weights) from timm
- EfficientNet-B3 (TensorFlow weights) from timm
- EfficientNet-B3a architecture from timm
- EfficientNet-B5 (TensorFlow weights) from timm
- EfficientNet-B7 (TensorFlow weights) from timm
- EfficientNetV2-M (TensorFlow weights) from timm
- EfficientNetV2-L (TensorFlow weights) from timm
MNASNet Family
- MNASNet 1.0 architecture from timm
- MNASNet-A1 architecture from timm
- MNASNet-B1 architecture from timm
Vision Transformer (ViT) Family
- Vision Transformer Small with 16x16 patches (224x224 input)
- Vision Transformer Base with 16x16 patches (224x224 input)
- Vision Transformer Large with 16x16 patches (224x224 input)
- Vision Transformer Large with ResNet-50 hybrid backbone
DeiT Family
- Data-efficient Image Transformer (DeiT) Base with 16x16 patches
- DeiT Base with distillation token (16x16 patches)
Swin Transformer Family
- Swin Transformer Base with 4x4 patches and 7x7 window size
- Swin Transformer Large with 4x4 patches and 7x7 window size
Inception
- Inception-V4 architecture from timm
BN-Inception
- Batch Normalized Inception trained on ImageNet (requires pretrainedmodels package)
Usage with PatchCore
Recommended Backbone: The original PatchCore paper uses wideresnet50 with features extracted from layer2 and layer3, which provides excellent performance for most industrial anomaly detection tasks.

Backbone Registry
The complete mapping of backbone names to their implementations is stored in the _BACKBONES dictionary:
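The exact contents of _BACKBONES live in the library source. As an illustration only, a registry of this shape can be sketched as a name-to-constructor mapping; the entries below are pure-Python stand-ins (real entries construct torchvision/timm models with pre-trained weights, and `resnet50` here is an assumed key shown only for illustration):

```python
def _stub(arch):
    # Placeholder for a real constructor such as
    # models.wide_resnet50_2(pretrained=True); returns a dict instead of a
    # torch model so this sketch runs without PyTorch installed.
    return {"arch": arch, "pretrained": True}

# Miniature stand-in for the library's _BACKBONES registry.
_BACKBONES = {
    "wideresnet50": lambda: _stub("wide_resnet50_2"),
    "resnet50": lambda: _stub("resnet50"),  # assumed key, for illustration
}

def load(name):
    """Look up `name` in the registry and build the corresponding model."""
    if name not in _BACKBONES:
        raise KeyError(f"Unknown backbone: {name!r}")
    return _BACKBONES[name]()

model = load("wideresnet50")
print(model["arch"])  # wide_resnet50_2
```

The dictionary-plus-loader design keeps the public API to a single `load(name)` call while letting each entry encapsulate whichever library (torchvision, timm, pretrainedmodels) provides that architecture.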
Dependencies
- torchvision: For ResNet, VGG, AlexNet, and Wide ResNet models
- timm: For Vision Transformers, EfficientNets, and many other architectures
- pretrainedmodels (optional): For BN-Inception
All backbones are loaded with pre-trained weights. The pretrained=True parameter ensures that ImageNet weights are used, which is crucial for transfer learning in anomaly detection.