Architecture Overview
The Inception-ResNet-v2 U-Net combines the powerful Inception-ResNet-v2 architecture as an encoder with a U-Net decoder, enabling multi-scale feature extraction with residual connections.
Key Features
- Encoder: Inception-ResNet-v2 backbone
- Inception blocks: Multi-scale convolutions (1×1, 3×3, 5×5)
- Residual connections: Improved gradient flow
- Block types: block35 (×10), block17 (×20), block8 (×10)
- Decoder: U-Net-style upsampling with skip connections
- Output: 2-class softmax segmentation
Inception-ResNet Block
The core building block combines Inception’s multi-scale processing with ResNet’s residual connections.
Source: models/inception.py (88-166)
Block Types
- Block35 (Inception-ResNet-A)
- Block17 (Inception-ResNet-B)
- Block8 (Inception-ResNet-C)
Block35 is used in early layers (35×35 resolution) and has three branches:
- Branch 0: 1×1 conv (32 filters)
- Branch 1: 1×1 → 3×3 conv (32 → 32 filters)
- Branch 2: 1×1 → 3×3 → 3×3 conv (32 → 48 → 64 filters)
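To make the branch layout concrete, the following is a minimal sketch that counts convolution weights for one Block35. It rests on assumptions the text does not confirm: 320 input channels (taken from the encoder table), bias-free convolutions, and a final 1×1 projection that maps the concatenated branches back to the input width before the residual addition, as the Keras reference implementation does. The helper names are illustrative only.

```python
# Weight counts for one Block35, following the three branches listed above.
# Assumed, not stated in the source: 320 input channels, no bias terms, and a
# 1x1 projection back to the input width before the residual addition.

def conv_weights(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution without bias."""
    return kernel * kernel * c_in * c_out

def branch_weights(c_in, layers):
    """Sum weights over a chain of (kernel, filters) layers."""
    total = 0
    for kernel, filters in layers:
        total += conv_weights(kernel, c_in, filters)
        c_in = filters
    return total, c_in  # (weights, output channels of the branch)

IN_CHANNELS = 320
BRANCHES = {
    "branch_0": [(1, 32)],
    "branch_1": [(1, 32), (3, 32)],
    "branch_2": [(1, 32), (3, 48), (3, 64)],
}

report = {name: branch_weights(IN_CHANNELS, layers) for name, layers in BRANCHES.items()}
concat_channels = sum(out for _, out in report.values())    # 32 + 32 + 64
projection = conv_weights(1, concat_channels, IN_CHANNELS)  # back to 320 channels
total = sum(weights for weights, _ in report.values()) + projection

for name, (weights, out) in report.items():
    print(f"{name}: {weights} weights, {out} output channels")
print(f"concat: {concat_channels} channels, projection: {projection} weights")
print(f"total: {total} weights")
```

Under these assumptions the 1×1 reductions at the head of each branch keep the 3×3 convolutions cheap, since they operate on 32 or 48 channels rather than 320.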
Complete Model Architecture
Source: models/inception.py (169-271)
Network Structure
Encoder stages:
| Stage | Block Type | Blocks | Output Channels | Resolution |
|---|---|---|---|---|
| Stem | Conv | 3 | 64 | H/2 × W/2 |
| - | Pool | - | 64 | H/4 × W/4 |
| Stem | Conv | 2 | 192 | H/4 × W/4 |
| - | Pool | - | 192 | H/8 × W/8 |
| Mixed 5b | Inception-A | 1 | 320 | H/8 × W/8 |
| conv3 | Block35 | 10 | 320 | H/8 × W/8 |
| Mixed 6a | Reduction-A | 1 | 1088 | H/16 × W/16 |
| conv4 | Block17 | 20 | 1088 | H/16 × W/16 |
| Mixed 7a | Reduction-B | 1 | 2080 | H/32 × W/32 |
| conv5 | Block8 | 10 | 1536 | H/32 × W/32 |
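The resolution column can be turned into concrete feature-map sizes for a given input. The sketch below transcribes the channel counts and strides from the table and assumes 'same' padding throughout, so each stride-2 step halves the size (rounding up). That padding choice, and the helper names, are assumptions rather than facts about models/inception.py.

```python
import math

# (stage, output channels, cumulative stride), transcribed from the encoder table.
ENCODER_STAGES = [
    ("stem conv (x3)", 64, 2),
    ("pool", 64, 4),
    ("stem conv (x2)", 192, 4),
    ("pool", 192, 8),
    ("mixed 5b", 320, 8),
    ("block35 x10", 320, 8),
    ("mixed 6a", 1088, 16),
    ("block17 x20", 1088, 16),
    ("mixed 7a", 2080, 32),
    ("block8 x10", 1536, 32),
]

def encoder_shapes(height, width):
    """Return (stage, h, w, channels) per stage, assuming 'same' padding."""
    return [
        (name, math.ceil(height / stride), math.ceil(width / stride), channels)
        for name, channels, stride in ENCODER_STAGES
    ]

def divides_evenly(height, width, total_stride=32):
    """True when every halving step is exact, so decoder upsampling restores the input size."""
    return height % total_stride == 0 and width % total_stride == 0

for stage, h, w, c in encoder_shapes(256, 256):
    print(f"{stage:>16}: {h} x {w} x {c}")
```

Inputs whose sides are multiples of 32 keep encoder and decoder feature maps aligned at every skip connection; other sizes would typically need padding or cropping before concatenation.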
Advantages of Inception-ResNet-v2
Multi-Scale Feature Extraction
- Parallel branches: Captures features at different scales simultaneously
- Factorized convolutions: 1×7 and 7×1 convs reduce parameters
- Efficient computation: Smaller kernels with similar receptive fields
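The saving from factorized convolutions is simple arithmetic. As a rough illustration, the sketch below compares a full k×k convolution with a 1×k followed by a k×1 convolution covering the same receptive field. It assumes equal channel widths for every layer and ignores biases; the real network uses different intermediate widths, so the numbers are indicative only.

```python
def conv_weights(k_h, k_w, c_in, c_out):
    """Weights in a k_h x k_w convolution without bias."""
    return k_h * k_w * c_in * c_out

def factorized_saving(k, channels):
    """Compare a k x k convolution with a 1 x k followed by a k x 1 convolution."""
    full = conv_weights(k, k, channels, channels)
    factorized = (conv_weights(1, k, channels, channels)
                  + conv_weights(k, 1, channels, channels))
    return full, factorized, full / factorized

full, factorized, ratio = factorized_saving(7, 128)
print(f"7x7: {full} weights, 1x7 + 7x1: {factorized} weights, ratio {ratio}")
```

For k = 7 the factorized pair needs 14 weights per input-output channel pair instead of 49, a 3.5× reduction regardless of the channel width chosen here.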
Residual Connections
- Gradient flow: Scaled residual connections help mitigate vanishing gradients
- Training stability: Easier optimization of very deep networks
- Adaptive scaling: Different scale factors per block type
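A scaled residual connection can be sketched in a few lines of NumPy. The scale factors below are the ones used by the Keras reference implementation of Inception-ResNet-v2; whether models/inception.py uses the same values is an assumption. The function name and the random tensors are illustrative.

```python
import numpy as np

def scaled_residual(x, branch_out, scale):
    """Compute relu(x + scale * branch_out); both inputs must share one shape."""
    if x.shape != branch_out.shape:
        raise ValueError("residual branch must match the input shape")
    return np.maximum(x + scale * branch_out, 0.0)

# Values from the Keras reference implementation; assumed, not confirmed,
# to match this project's code.
SCALES = {"block35": 0.17, "block17": 0.10, "block8": 0.20}

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 8, 8, 320)).astype(np.float32)
branch = rng.normal(size=x.shape).astype(np.float32)

y = scaled_residual(x, branch, SCALES["block35"])
print(y.shape)  # same shape as the input
```

Scaling the branch output down before the addition keeps each block close to an identity mapping early in training, which is the stability argument made in the list above.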
U-Net Integration
- Skip connections: Preserves spatial details from encoder
- Progressive reconstruction: Gradual upsampling to original resolution
- Multi-level features: Combines semantic and spatial information
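One decoder step of this kind can be illustrated with plain NumPy: upsample the deeper feature map by 2× and concatenate the matching encoder feature map along the channel axis. The sketch assumes nearest-neighbour upsampling and omits the convolutions a real decoder applies after the concatenation; the source does not say which upsampling method is used.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (batch, h, w, c) tensor."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def decoder_step(deep, skip):
    """Upsample the deeper features and concatenate the encoder skip along channels."""
    up = upsample2x(deep)
    if up.shape[1:3] != skip.shape[1:3]:
        raise ValueError(f"spatial mismatch: {up.shape[1:3]} vs {skip.shape[1:3]}")
    return np.concatenate([up, skip], axis=-1)

# Shapes follow the encoder table for a 256 x 256 input.
deep = np.zeros((1, 8, 8, 1536), dtype=np.float32)    # H/32 features
skip = np.zeros((1, 16, 16, 1088), dtype=np.float32)  # H/16 features

merged = decoder_step(deep, skip)
print(merged.shape)
```

The concatenated tensor carries the coarse semantic features and the finer spatial features side by side, which is what lets the following convolutions sharpen boundaries.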
Model Weights
Pretrained Weights
The model can optionally load ImageNet-pretrained weights.
DigiPathAI Weights
Task-specific weights are available:
- digestpath_inception.h5: Trained on DigestPath dataset
- paip_inception.h5: Trained on PAIP dataset
- camelyon_inception.h5: Trained on Camelyon dataset
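Choosing among these files can be handled with a small lookup. The sketch below is hypothetical: the file names come from the list above, but the `weights` directory and the `weights_path` helper are illustrative and not part of DigiPathAI's API.

```python
from pathlib import Path

# File names from the list above, keyed by a short dataset name.
WEIGHT_FILES = {
    "digestpath": "digestpath_inception.h5",
    "paip": "paip_inception.h5",
    "camelyon": "camelyon_inception.h5",
}

def weights_path(dataset, weights_dir="weights"):
    """Resolve the weight file for a dataset; weights_dir is a hypothetical location."""
    key = dataset.lower()
    if key not in WEIGHT_FILES:
        raise ValueError(f"unknown dataset {dataset!r}; expected one of {sorted(WEIGHT_FILES)}")
    return Path(weights_dir) / WEIGHT_FILES[key]

print(weights_path("PAIP"))
```

With a built Keras model, the resolved path could then be passed to `model.load_weights(...)`, which accepts HDF5 (`.h5`) weight files.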
Input/Output Specifications
Input
- Shape: (batch, height, width, 3)
- Flexible dimensions: (None, None, 3) for variable sizes
- Preprocessing: Standard ImageNet normalization
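"Standard ImageNet normalization" can mean two different things, and the source does not say which one the model expects. The sketch below shows both as assumptions: per-channel mean/std standardisation using the commonly quoted ImageNet statistics, and the [-1, 1] scaling that Keras applies for Inception-style backbones. Check the training code before picking one.

```python
import numpy as np

# Commonly quoted ImageNet RGB statistics for images scaled to [0, 1].
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize_mean_std(image_uint8):
    """Scale to [0, 1], then standardise each RGB channel."""
    x = image_uint8.astype(np.float32) / 255.0
    return (x - IMAGENET_MEAN) / IMAGENET_STD

def normalize_minus_one_to_one(image_uint8):
    """Scale pixel values to [-1, 1], the Keras convention for Inception-style backbones."""
    return image_uint8.astype(np.float32) / 127.5 - 1.0

tile = np.full((4, 4, 3), 255, dtype=np.uint8)  # an all-white RGB tile
print(normalize_mean_std(tile)[0, 0])
print(normalize_minus_one_to_one(tile)[0, 0])
```

Using a different normalization at inference than at training usually degrades predictions noticeably, so this choice should match whatever produced the weights.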
Output
- Shape: (batch, height, width, 2)
- Classes: [background, tissue]
- Activation: Softmax probabilities
The Inception-ResNet-v2 U-Net excels at capturing multi-scale features through its parallel Inception blocks, making it particularly effective for complex tissue structures with varying scales.
Usage Example
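The following is a minimal sketch of running a predictor with the input and output shapes described above and turning its softmax output into a label mask. `predict_fn` stands in for a built model's `predict` method, and `dummy_predict` is a purely illustrative replacement so the snippet runs without any weights; neither name comes from DigiPathAI.

```python
import numpy as np

def predict_mask(predict_fn, image):
    """Run a (batch, H, W, 3) -> (batch, H, W, 2) predictor on one image; return a label mask."""
    batch = image[np.newaxis, ...]                 # add the batch dimension
    probs = np.asarray(predict_fn(batch))[0]       # (H, W, 2) softmax probabilities
    if probs.shape != image.shape[:2] + (2,):
        raise ValueError(f"unexpected output shape {probs.shape}")
    return probs.argmax(axis=-1).astype(np.uint8)  # 0 = background, 1 = tissue

def dummy_predict(batch):
    """Stand-in for model.predict: treats bright pixels as tissue. Illustrative only."""
    score = batch.mean(axis=-1, keepdims=True)
    p_tissue = 1.0 / (1.0 + np.exp(-score))
    return np.concatenate([1.0 - p_tissue, p_tissue], axis=-1)

# A bright square on a dark background, already preprocessed to a signed range.
image = np.full((64, 64, 3), -2.0, dtype=np.float32)
image[16:48, 16:48] = 2.0

mask = predict_mask(dummy_predict, image)
print(mask.shape, int(mask.sum()))
```

With a real model, `dummy_predict` would be replaced by `model.predict`, and the image would first go through the same preprocessing used during training.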
Performance Characteristics
- Parameters: ~55M
- Depth: 164 layers (stem + 40 inception-resnet blocks + decoder)
- Memory: Higher memory footprint due to multiple branches
- Speed: Moderate inference speed
- Accuracy: Excellent for multi-scale features
Related Models
- DenseNet U-Net: Dense connectivity for feature reuse
- DeepLabv3+: Atrous spatial pyramid pooling