
Overview

Grad-CAM (Gradient-weighted Class Activation Mapping) is a powerful visualization technique that helps you understand which parts of an image your CNN focuses on when making predictions. It highlights discriminative regions by computing gradients of the target class with respect to feature maps.
Grad-CAM works with any CNN architecture without requiring architectural changes or retraining.

How It Works

Grad-CAM generates heatmaps by:
  1. Forward Pass: Process the input image through the model
  2. Target Selection: Identify the class to explain (usually the predicted class)
  3. Backward Pass: Compute gradients of the target class score with respect to feature maps
  4. Weighting: Calculate importance weights by global average pooling of gradients
  5. Combination: Weight the feature maps and combine them to create the final heatmap
Earlier layers detect low-level features (edges, textures), while later layers capture high-level concepts (objects, patterns).
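Steps 4 and 5 can be expressed in a few lines of tensor code. The sketch below uses toy shapes and random values, assuming the activations and gradients have already been captured as in steps 1-3:

```python
import torch

# Toy captured values: batch of 1, 3 channels, 4x4 spatial map
acts = torch.rand(1, 3, 4, 4)   # target-layer activations (from step 1)
grads = torch.rand(1, 3, 4, 4)  # gradients of the class score (from step 3)

# Step 4: channel importance = global average pool of the gradients
weights = grads.mean(dim=(2, 3), keepdim=True)   # shape (1, 3, 1, 1)

# Step 5: channel-weighted sum of feature maps; ReLU keeps positive evidence
cam = torch.relu((weights * acts).sum(dim=1))    # shape (1, 4, 4)
```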

Implementation

The UC Intel Final platform implements Grad-CAM with automatic handling of in-place operations and gradient tracking:

Core Computation Function

import numpy as np
import torch
import torch.nn as nn


def compute_gradcam(
    model: nn.Module,
    device: torch.device,
    image_tensor: torch.Tensor,
    target_layer: nn.Module,
    target_class: int | None = None,
) -> tuple[np.ndarray, int, float]:
    """
    Compute Grad-CAM heatmap for an image.

    Args:
        model: The model
        device: Compute device
        image_tensor: (C, H, W) image tensor
        target_layer: Conv layer to visualize
        target_class: Class to explain (None = predicted class)

    Returns:
        Tuple of (heatmap, predicted_class, confidence)
    """
    gradients = []
    activations = []

    def forward_hook(_module, _inp, output):
        cloned = output.clone()
        activations.append(cloned)
        return cloned

    def backward_hook(_module, _grad_input, grad_output):
        gradients.append(grad_output[0].clone())

    forward_handle = target_layer.register_forward_hook(forward_hook)
    backward_handle = target_layer.register_full_backward_hook(backward_hook)
The implementation automatically disables in-place operations during Grad-CAM computation to avoid autograd conflicts.
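With the hooks in place, a forward pass followed by a backward pass on the target class score populates the activations and gradients lists. The following self-contained sketch uses a toy two-layer model to show the flow; it is illustrative, not the platform's full function:

```python
import torch
import torch.nn as nn

# Toy model: conv -> relu -> global pool -> linear classifier
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)
target_layer = model[0]

activations, gradients = [], []
fh = target_layer.register_forward_hook(
    lambda m, i, o: activations.append(o.clone())
)
bh = target_layer.register_full_backward_hook(
    lambda m, gi, go: gradients.append(go[0].clone())
)

x = torch.rand(1, 3, 32, 32)
output = model(x)                      # forward hook fires here
output[0, output.argmax()].backward()  # backward hook fires here

# Hook cleanup: always remove hooks after computation
fh.remove()
bh.remove()
```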

Layer Selection

Get all convolutional layers from your model:
def get_conv_layers(model: nn.Module) -> list[tuple[str, nn.Module]]:
    """Get all Conv2d layers from the model."""
    conv_layers = []

    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            conv_layers.append((name, module))

    return conv_layers
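For instance, on a toy nested model (with `get_conv_layers` condensed inline so the snippet runs standalone), the function returns dotted module paths in definition order:

```python
import torch.nn as nn

def get_conv_layers(model: nn.Module) -> list[tuple[str, nn.Module]]:
    # Condensed version of the function above
    return [(n, m) for n, m in model.named_modules() if isinstance(m, nn.Conv2d)]

model = nn.Sequential(
    nn.Conv2d(3, 16, 3), nn.ReLU(),
    nn.Sequential(nn.Conv2d(16, 32, 3), nn.ReLU()),
)
layers = get_conv_layers(model)
print([name for name, _ in layers])  # ['0', '2.0']
```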

Gradient Processing

The platform computes weighted activations and generates normalized heatmaps:
# Compute importance weights from gradients
weights = grads.mean(dim=(2, 3), keepdim=True)
cam = (weights * acts).sum(dim=1, keepdim=True)
cam = torch.relu(cam)  # Only positive influences

# Normalize to [0, 1]
cam = cam - cam.min()
if cam.max() > 0:
    cam = cam / cam.max()

# Upsample to input size
cam = torch.nn.functional.interpolate(
    cam, size=(224, 224), mode="bilinear", align_corners=False
)
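A quick sanity check of the normalize-and-upsample steps on a toy 7x7 map (the 224x224 target size matches the input resolution used above):

```python
import torch

cam = torch.rand(1, 1, 7, 7)   # (N, C, H, W) low-resolution heatmap
cam = cam - cam.min()
if cam.max() > 0:
    cam = cam / cam.max()      # values now in [0, 1]

up = torch.nn.functional.interpolate(
    cam, size=(224, 224), mode="bilinear", align_corners=False
)
print(up.shape)  # torch.Size([1, 1, 224, 224])
```

Because bilinear interpolation produces convex combinations of neighboring values, the upsampled map stays within the normalized range.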

Using Grad-CAM in the Interface

The interpretability interface provides an interactive Grad-CAM visualization:
  1. Select a Sample: Choose from 20 random test images
  2. Choose Target Layer: Select which convolutional layer to visualize
    • Later layers recommended for semantic understanding
    • Earlier layers show low-level feature detection
  3. Adjust Opacity: Control heatmap overlay transparency (0.0 - 1.0)
  4. Generate Visualization: Click to compute Grad-CAM
The interface displays:
  • Original Image: Input image (224x224)
  • Heatmap: Color-coded importance map (warm colors = high importance)
  • Overlay: Combined visualization showing focus regions
  • Top-5 Predictions: Model confidence scores for each class

Interpreting Results

Heatmap Colors

Warm Colors (Red/Yellow)

High importance regions that strongly influence the prediction

Cool Colors (Blue/Purple)

Low importance regions that have minimal impact on classification

Common Patterns

Good Model Behavior:
  • Focuses on discriminative object regions
  • Highlights semantic features relevant to the class
  • Consistent attention across similar images
Potential Issues:
  • Attention on background or irrelevant regions
  • Scattered or noisy activations
  • Focus on dataset artifacts or biases
Compare Grad-CAM visualizations for correct vs. incorrect predictions to understand model weaknesses.

Top Predictions Analysis

The platform provides top-k predictions alongside Grad-CAM:
def get_top_predictions(
    model: nn.Module,
    device: torch.device,
    image_tensor: torch.Tensor,
    class_names: list[str],
    top_k: int = 5,
) -> list[dict]:
    """Get top-k predictions for an image."""
    model.eval()

    with torch.no_grad():
        input_tensor = image_tensor.unsqueeze(0).to(device)
        output = model(input_tensor)
        probs = torch.softmax(output, dim=1)

        top_probs, top_indices = probs.topk(min(top_k, len(class_names)))

    predictions = []
    for prob, idx in zip(top_probs[0], top_indices[0], strict=True):
        predictions.append({
            "class_idx": idx.item(),
            "class_name": class_names[idx.item()],
            "confidence": prob.item(),
        })

    return predictions
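A minimal usage sketch with a randomly initialized toy model and made-up class names (the function is condensed inline so the snippet runs standalone):

```python
import torch
import torch.nn as nn

def get_top_predictions(model, device, image_tensor, class_names, top_k=5):
    # Condensed version of the function above
    model.eval()
    with torch.no_grad():
        output = model(image_tensor.unsqueeze(0).to(device))
        probs = torch.softmax(output, dim=1)
        top_probs, top_indices = probs.topk(min(top_k, len(class_names)))
    return [
        {"class_idx": i.item(),
         "class_name": class_names[i.item()],
         "confidence": p.item()}
        for p, i in zip(top_probs[0], top_indices[0])
    ]

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 4))
preds = get_top_predictions(
    model, torch.device("cpu"), torch.rand(3, 8, 8),
    class_names=["cat", "dog", "car", "tree"], top_k=3,
)
print(len(preds))  # 3, sorted by descending confidence
```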

Advanced Usage

Visualizing Specific Classes

You can generate Grad-CAM for any target class, not just the predicted one:
# Explain what would activate class 5
heatmap, pred_class, confidence = compute_gradcam(
    model, device, image_tensor, target_layer, target_class=5
)
This helps answer questions like:
  • “What would make the model predict class X?”
  • “Why didn’t the model choose the correct class?”
  • “What features distinguish class A from class B?”

Layer Comparison

Visualize multiple layers to understand hierarchical feature detection:
Early Layers (e.g., conv1, layer1):
  • Detect edges, corners, colors
  • High-resolution but low-level features
  • Many small activation regions
Middle Layers (e.g., layer2, layer3):
  • Detect textures, patterns, shapes
  • Intermediate semantic understanding
  • Moderately sized activation regions
Late Layers (e.g., layer4, final conv):
  • Detect objects, concepts, semantic features
  • Low-resolution but high-level understanding
  • Large, coherent activation regions

Technical Considerations

Memory Management

The implementation automatically handles:
  • Gradient tracking: Clones tensors to prevent in-place modification errors
  • Hook cleanup: Removes forward and backward hooks after computation
  • In-place operations: Temporarily disables inplace=True for ReLU and similar layers
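Such toggling might look like the sketch below; `disable_inplace` is an illustrative helper, not the platform's API:

```python
import torch.nn as nn

def disable_inplace(model: nn.Module) -> list[nn.Module]:
    """Switch off inplace=True on activation modules; return those changed."""
    changed = []
    for module in model.modules():
        if getattr(module, "inplace", False):
            module.inplace = False
            changed.append(module)
    return changed

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(inplace=True))
changed = disable_inplace(model)   # safe to run Grad-CAM now
# ... compute Grad-CAM here ...
for module in changed:             # restore the original setting
    module.inplace = True
```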

Error Handling

if not gradients:
    raise RuntimeError("No gradients captured. Try a different layer.")
Common issues:
  • No gradients: The target layer did not participate in the forward pass, so the backward hook never fired
  • Memory errors: Image too large or model too deep
  • Autograd conflicts: In-place operations (handled automatically)

Activation Maps

Visualize raw filter responses without gradient weighting

t-SNE Embeddings

Visualize learned feature representations in 2D space

References

For more interpretability methods, explore LIME explanations and misclassification analysis in the full interpretability suite.
