Overview
Grad-CAM (Gradient-weighted Class Activation Mapping) is a visualization technique that shows which parts of an image your CNN focuses on when making predictions. It highlights discriminative regions by computing gradients of the target class score with respect to convolutional feature maps. Grad-CAM works with any CNN architecture without requiring architectural changes or retraining.
How It Works
Grad-CAM generates heatmaps by:
- Forward Pass: Process the input image through the model
- Target Selection: Identify the class to explain (usually the predicted class)
- Backward Pass: Compute gradients of the target class score with respect to feature maps
- Weighting: Calculate importance weights by global average pooling of gradients
- Combination: Weight the feature maps and combine them to create the final heatmap
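The five steps above can be sketched in PyTorch. This is a minimal illustration, not the platform's actual implementation; the `grad_cam` function name and hook-based structure are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def grad_cam(model, image, target_layer, target_class=None):
    """Minimal Grad-CAM sketch: returns a normalized (H, W) heatmap."""
    activations, gradients = {}, {}

    # Hooks capture the feature maps and their gradients at the target layer.
    h1 = target_layer.register_forward_hook(
        lambda m, i, o: activations.update(feat=o.detach()))
    h2 = target_layer.register_full_backward_hook(
        lambda m, gi, go: gradients.update(grad=go[0].detach()))

    try:
        # 1-2. Forward pass and target selection (default: predicted class).
        scores = model(image)                      # shape (1, num_classes)
        if target_class is None:
            target_class = scores.argmax(dim=1).item()

        # 3. Backward pass on the target class score only.
        model.zero_grad()
        scores[0, target_class].backward()

        # 4. Importance weights: global average pool of the gradients.
        weights = gradients["grad"].mean(dim=(2, 3), keepdim=True)

        # 5. Weighted combination of feature maps, ReLU, then min-max normalize.
        cam = F.relu((weights * activations["feat"]).sum(dim=1)).squeeze(0)
        cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
        return cam
    finally:
        h1.remove()
        h2.remove()
```

With a toy model, `grad_cam(model, torch.randn(1, 3, 32, 32), target_layer=model[0])` returns a heatmap the size of that layer's feature maps.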
Implementation
The UC Intel Final platform implements Grad-CAM with automatic handling of in-place operations and gradient tracking.
Core Computation Function
The implementation automatically disables in-place operations during Grad-CAM computation to avoid autograd conflicts.
Layer Selection
Get all convolutional layers from your model:
Gradient Processing
The platform computes weighted activations and generates normalized heatmaps:
Using Grad-CAM in the Interface
The interpretability interface provides an interactive Grad-CAM visualization.
Step-by-Step Usage
- Select a Sample: Choose from 20 random test images
- Choose Target Layer: Select which convolutional layer to visualize
  - Later layers are recommended for semantic understanding
  - Earlier layers show low-level feature detection
- Adjust Opacity: Control heatmap overlay transparency (0.0 - 1.0)
- Generate Visualization: Click to compute Grad-CAM
The visualization displays:
- Original Image: Input image (224x224)
- Heatmap: Color-coded importance map (warm colors = high importance)
- Overlay: Combined visualization showing focus regions
- Top-5 Predictions: Model confidence scores for each class
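The overlay blending controlled by the opacity slider can be sketched as a simple alpha blend. This is an illustration, not the platform's code; the `overlay_heatmap` name and the simplified red-blue color mapping are assumptions (real UIs typically apply a full colormap such as jet):

```python
import numpy as np

def overlay_heatmap(image, cam, opacity=0.5):
    """Blend a normalized heatmap (H, W) onto an RGB image (H, W, 3) in [0, 1].

    opacity=0.0 shows only the image; opacity=1.0 shows only the heatmap.
    """
    # Simplified warm-cool mapping: red = high importance, blue = low importance.
    heatmap = np.stack([cam, np.zeros_like(cam), 1.0 - cam], axis=-1)
    return (1 - opacity) * image + opacity * heatmap
```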
Interpreting Results
Heatmap Colors
Warm Colors (Red/Yellow)
High importance regions that strongly influence the prediction
Cool Colors (Blue/Purple)
Low importance regions that have minimal impact on classification
Common Patterns
Good Model Behavior:
- Focuses on discriminative object regions
- Highlights semantic features relevant to the class
- Consistent attention across similar images
Problematic Behavior:
- Attention on background or irrelevant regions
- Scattered or noisy activations
- Focus on dataset artifacts or biases
Top Predictions Analysis
The platform provides top-k predictions alongside Grad-CAM.
Advanced Usage
Visualizing Specific Classes
You can generate Grad-CAM for any target class, not just the predicted one:
- “What would make the model predict class X?”
- “Why didn’t the model choose the correct class?”
- “What features distinguish class A from class B?”
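The key mechanic is that you backpropagate the score of the class you want to explain rather than the argmax. A self-contained sketch (the toy model and class indices are assumptions for illustration):

```python
import torch
import torch.nn as nn

# Toy classifier standing in for a trained CNN (assumption, for illustration).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 5))
image = torch.randn(1, 3, 8, 8, requires_grad=True)

scores = model(image)
predicted = scores.argmax(dim=1).item()
target_class = 2  # explain an arbitrary class instead of the prediction

# Backpropagate only the chosen class score; the resulting gradients
# drive the Grad-CAM weighting for that class, answering
# "what would make the model predict class 2?"
model.zero_grad()
scores[0, target_class].backward()
```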
Layer Comparison
Comparing Different Layers
Visualize multiple layers to understand hierarchical feature detection:
Early Layers (e.g., conv1, layer1):
- Detect edges, corners, colors
- High-resolution but low-level features
- Many small activation regions
Middle Layers (e.g., layer2, layer3):
- Detect textures, patterns, shapes
- Intermediate semantic understanding
- Moderately sized activation regions
Late Layers (e.g., layer4, final conv):
- Detect objects, concepts, semantic features
- Low-resolution but high-level understanding
- Large, coherent activation regions
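The resolution trade-off is easy to verify: feature-map spatial size shrinks with depth, so Grad-CAM heatmaps from later layers are coarser. A sketch with a toy three-stage CNN (the stages are assumptions standing in for early/middle/late layers of a real network):

```python
import torch
import torch.nn as nn

# Toy three-stage CNN; each stage halves the spatial resolution.
stages = nn.Sequential(
    nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)),   # early
    nn.Sequential(nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)),  # middle
    nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)), # late
)

shapes = {}
x = torch.randn(1, 3, 64, 64)
for i, stage in enumerate(stages):
    x = stage(x)
    shapes[f"stage{i}"] = tuple(x.shape[2:])  # spatial size of the feature maps

# Early layers keep high resolution; late layers trade it for semantics.
print(shapes)  # {'stage0': (32, 32), 'stage1': (16, 16), 'stage2': (8, 8)}
```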
Technical Considerations
Memory Management
The implementation automatically handles:
- Gradient tracking: Clones tensors to prevent in-place modification errors
- Hook cleanup: Removes forward and backward hooks after computation
- In-place operations: Temporarily disables inplace=True for ReLU and similar layers
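Temporarily disabling in-place activations can be done with a small context manager. A sketch under stated assumptions (the `inplace_disabled` helper is illustrative, not the platform's actual API):

```python
from contextlib import contextmanager

import torch.nn as nn

@contextmanager
def inplace_disabled(model):
    """Temporarily set inplace=False on modules that support it (e.g., ReLU)."""
    flipped = []
    for module in model.modules():
        if getattr(module, "inplace", False):
            module.inplace = False
            flipped.append(module)
    try:
        yield model
    finally:
        # Restore the original configuration after Grad-CAM finishes.
        for module in flipped:
            module.inplace = True
```

Running the Grad-CAM forward/backward passes inside `with inplace_disabled(model):` avoids autograd errors from in-place tensor modification, then restores the model unchanged.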
Error Handling
Common failure modes and their causes:
- No gradients: Target layer has no learnable parameters
- Memory errors: Image too large or model too deep
- Autograd conflicts: In-place operations (handled automatically)
Related Techniques
Activation Maps
Visualize raw filter responses without gradient weighting
t-SNE Embeddings
Visualize learned feature representations in 2D space
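A minimal t-SNE sketch using scikit-learn, assuming you have already extracted one feature vector per image from the CNN (the random feature matrix below is a placeholder for those embeddings):

```python
import numpy as np
from sklearn.manifold import TSNE

# Placeholder features: one 64-d CNN embedding per image (assumption).
features = np.random.RandomState(0).randn(100, 64)

# Project to 2-D for visualization; perplexity must be below n_samples.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)
print(embedding.shape)  # (100, 2)
```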
References
- Paper: Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
- Source Code: app/content/interpret/engine/gradcam.py
- UI Implementation: app/content/interpret/sections/gradcam.py
For more interpretability methods, explore LIME explanations and misclassification analysis in the full interpretability suite.