Overview

Activation maps reveal what patterns and features convolutional filters detect in input images. By visualizing filter responses at different layers, you can understand the hierarchical feature learning process — from simple edges in early layers to complex semantic concepts in deeper layers.
Activation maps show the raw filter responses without gradient weighting, providing direct insight into what each filter detects.

How It Works

Activation visualization involves:
  1. Forward Pass: Process an image through the model
  2. Hook Registration: Capture intermediate activations at target layer
  3. Extraction: Retrieve activation maps (num_filters × H × W)
  4. Visualization: Display individual filter responses as grayscale images
Bright regions in activation maps indicate strong filter responses — the filter detected its target pattern.

Feature Hierarchy

CNNs learn features hierarchically:

Early Layers

Layers 1-2 detect low-level features:
  • Edges and contours
  • Colors and gradients
  • Simple textures
  • Corner patterns

Middle Layers

Layers 3-4 detect mid-level features:
  • Textures and patterns
  • Simple shapes
  • Repeated structures
  • Object parts

Deep Layers

Layers 5+ detect high-level features:
  • Object concepts
  • Semantic features
  • Complex patterns
  • Class-specific signatures

Implementation

The UC Intel Final platform provides clean activation extraction:

Activation Extraction

import numpy as np
import torch
import torch.nn as nn


def get_activation_maps(
    model: nn.Module,
    device: torch.device,
    image_tensor: torch.Tensor,
    target_layer: nn.Module,
) -> np.ndarray:
    """
    Get activation maps for a specific layer.

    Args:
        model: The model
        device: Compute device
        image_tensor: (C, H, W) image tensor
        target_layer: Layer to visualize

    Returns:
        Activation maps as (num_filters, H, W) numpy array
    """
    activations = []

    def hook_fn(_module, _inp, output):
        activations.append(output.detach().cpu())

    handle = target_layer.register_forward_hook(hook_fn)

    try:
        model.eval()
        with torch.no_grad():
            input_tensor = image_tensor.unsqueeze(0).to(device)
            _ = model(input_tensor)

        if not activations:
            return np.array([])

        acts = activations[0].squeeze(0).numpy()
        return acts

    finally:
        handle.remove()
The function automatically cleans up hooks to prevent memory leaks and unwanted side effects.
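As a quick end-to-end check, the same hook pattern can be exercised on a toy two-layer CNN (the model below is a hypothetical stand-in, not part of the platform):

```python
import torch
import torch.nn as nn

# Hypothetical toy CNN, just to demonstrate the extraction pattern.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
)

activations = []
target_layer = model[2]  # second conv layer
handle = target_layer.register_forward_hook(
    lambda _m, _i, out: activations.append(out.detach().cpu())
)
try:
    model.eval()
    with torch.no_grad():
        image = torch.rand(3, 32, 32)      # (C, H, W) input
        _ = model(image.unsqueeze(0))      # add batch dim, forward pass
finally:
    handle.remove()                        # always clean up the hook

acts = activations[0].squeeze(0).numpy()   # (num_filters, H, W)
print(acts.shape)                          # (16, 32, 32)
```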

Key Design Choices

Detach and CPU Transfer:
activations.append(output.detach().cpu())
  • detach(): Removes from computation graph, prevents gradient tracking
  • cpu(): Transfers to CPU for visualization and analysis
No Gradient Tracking:
with torch.no_grad():
    input_tensor = image_tensor.unsqueeze(0).to(device)
    _ = model(input_tensor)
  • Faster computation
  • Lower memory usage
  • No gradient accumulation
Guaranteed Cleanup:
finally:
    handle.remove()
  • Hook removed even if errors occur
  • Prevents hooks from accumulating
  • Essential for repeated visualizations

Filter Weight Visualization

Besides activation maps, you can visualize the actual learned filter weights:
def get_filter_weights(layer: nn.Module) -> np.ndarray | None:
    """
    Get convolutional filter weights for visualization.

    Args:
        layer: Conv2d layer

    Returns:
        Filter weights as (out_channels, in_channels, kH, kW) numpy array
    """
    if not isinstance(layer, nn.Conv2d):
        return None

    weights = layer.weight.detach().cpu().numpy()
    return weights

Weight Normalization for Display

def normalize_filter_for_display(filter_weights: np.ndarray) -> np.ndarray:
    """Normalize filter weights to [0, 1] for display."""
    min_val = filter_weights.min()
    max_val = filter_weights.max()
    if max_val - min_val > 0:
        return (filter_weights - min_val) / (max_val - min_val)
    return np.zeros_like(filter_weights)
Filter weights typically have:
  • Different scales: Some filters have large weights, others small
  • Positive and negative values: Can’t directly visualize as images
  • Varying ranges: Makes comparison difficult
Normalization benefits:
  • Brings all weights to [0, 1] range
  • Makes visualization consistent
  • Highlights relative patterns within each filter
  • Enables meaningful visual comparison
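Putting the two helpers together: the key design choice is normalizing each filter independently, rather than over the whole weight tensor, so each filter's internal pattern stays visible. The `conv` layer below is a hypothetical, randomly initialized example:

```python
import numpy as np
import torch.nn as nn

# Hypothetical first conv layer; any trained nn.Conv2d works the same way.
conv = nn.Conv2d(3, 16, kernel_size=5)
weights = conv.weight.detach().cpu().numpy()   # (16, 3, 5, 5)

# Normalize each filter to [0, 1] independently (per-filter min-max).
displayable = np.stack([
    (f - f.min()) / (f.max() - f.min()) if f.max() > f.min() else np.zeros_like(f)
    for f in weights
])
print(displayable.shape)   # (16, 3, 5, 5), each filter in [0, 1]
```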

Understanding Activations

Activation Patterns

Strong Activation (Bright Regions):
  • Filter detected its target feature
  • Pattern present in the image at that location
  • Important for downstream processing
Weak Activation (Dark Regions):
  • Target feature absent or weak
  • Filter not responsive to local pattern
  • Less information passed forward
Sparse Activations:
  • Filter highly selective (good)
  • Responds only to specific patterns
  • Common in well-trained networks
Dense Activations:
  • Filter responds broadly (may need more training)
  • Less discriminative
  • Common in early layers or undertrained networks
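One simple way to quantify sparse vs dense behavior is the fraction of near-zero responses per filter; `activation_sparsity` below is an illustrative helper, not part of the platform API:

```python
import numpy as np

def activation_sparsity(acts: np.ndarray, threshold: float = 1e-3) -> np.ndarray:
    """Fraction of near-zero responses per filter: high = sparse/selective."""
    # acts: (num_filters, H, W), as returned by get_activation_maps
    flat = acts.reshape(acts.shape[0], -1)
    return (np.abs(flat) < threshold).mean(axis=1)

# Synthetic example: filter 0 fires everywhere, filter 1 almost nowhere.
acts = np.zeros((2, 4, 4))
acts[0] = 1.0        # dense filter
acts[1, 0, 0] = 1.0  # sparse filter: one strong response
print(activation_sparsity(acts))  # filter 1 is far sparser than filter 0
```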

Layer-Specific Insights

Early Layers

Characteristics:
  • Input: Raw RGB pixels (3 channels)
  • Typical filters: 32-64 filters, 3×3 or 5×5
  • Patterns detected: Edges, color gradients, simple textures
What to look for:
  • Edge detectors at various orientations (0°, 45°, 90°, 135°)
  • Color-sensitive filters (red edges, blue gradients, etc.)
  • Texture patterns (dots, lines, grids)
Example interpretations:
  • Horizontal edge detector → bright on horizontal boundaries
  • Red channel filter → bright on reddish regions
  • Gabor-like filter → bright on oriented textures
Middle Layers

Characteristics:
  • Input: Feature maps from previous layer (64-256 channels)
  • Typical filters: 128-512 filters
  • Patterns detected: Combinations of low-level features, textures, shapes
What to look for:
  • Repeated texture patterns
  • Corner and junction detectors
  • Simple shape components
  • Directional patterns
Example interpretations:
  • Grid pattern detector → bright on regular structures
  • Blob detector → bright on rounded regions
  • Texture discriminator → bright on specific surface types
Deep Layers

Characteristics:
  • Input: High-level features (256-2048 channels)
  • Typical filters: 512-2048 filters
  • Patterns detected: Semantic concepts, object parts, class-specific features
What to look for:
  • Semantic object detectors
  • Class-discriminative patterns
  • Global structure indicators
  • Invariant feature representations
Example interpretations:
  • Malware signature detector → bright on suspicious patterns
  • Benign indicator → bright on clean code structures
  • Family-specific detector → bright on variant signatures

Visualization Best Practices

Selecting Layers

For general understanding:
  • Visualize first layer (low-level features)
  • Visualize one middle layer (mid-level patterns)
  • Visualize final conv layer (high-level concepts)
For debugging:
  • Compare activations for correct vs incorrect predictions
  • Check if deep layers activate on relevant regions
  • Verify early layers respond to expected patterns
For model comparison:
  • Same layer across different architectures
  • Same layer across training epochs
  • Same layer for different regularization settings
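To pick target layers programmatically, `named_modules()` walks nested submodules and can enumerate every `nn.Conv2d`; the model below is a hypothetical example:

```python
import torch.nn as nn

# Hypothetical model; with a real network, pass your trained model instead.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3), nn.ReLU(),
    nn.Sequential(nn.Conv2d(8, 16, 3), nn.ReLU()),
    nn.Conv2d(16, 32, 3),
)

# named_modules() recurses into nested containers, so this finds every Conv2d.
conv_layers = [(name, m) for name, m in model.named_modules()
               if isinstance(m, nn.Conv2d)]
for name, m in conv_layers:
    print(name, m.out_channels)

# Typical picks for visualization: first, middle, and last conv layer.
first, last = conv_layers[0][1], conv_layers[-1][1]
```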

Interpreting Filter Response

Dead Filters

Problem: Filter always produces near-zero activations.
Causes:
  • Bad initialization
  • Learning rate too high
  • ReLU dying neuron problem
Solutions:
  • Check initialization scheme
  • Reduce learning rate
  • Use LeakyReLU or other activation

Saturated Filters

Problem: Filter always produces maximum activations.
Causes:
  • Weights too large
  • Input not normalized
  • Gradient explosion
Solutions:
  • Apply weight decay
  • Normalize inputs
  • Gradient clipping
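Both failure modes can be screened for numerically. `flag_filters` below is an illustrative sketch; the saturation check assumes a known activation ceiling (`sat_level`), which depends on your activation function:

```python
import numpy as np

def flag_filters(acts: np.ndarray, dead_tol: float = 1e-6,
                 sat_level: float = 1.0) -> tuple[list[int], list[int]]:
    """Flag likely dead (near-zero everywhere) and saturated filters.

    acts: (num_filters, H, W); sat_level is an assumed activation ceiling.
    """
    flat = np.abs(acts.reshape(acts.shape[0], -1))
    dead = [i for i, f in enumerate(flat) if f.max() < dead_tol]
    saturated = [i for i, f in enumerate(flat) if f.min() >= sat_level]
    return dead, saturated

acts = np.random.rand(4, 8, 8) * 0.5
acts[1] = 0.0   # simulate a dead filter
acts[3] = 1.0   # simulate a saturated filter
dead, sat = flag_filters(acts)
print(dead, sat)  # [1] [3]
```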

Advanced Analysis

Comparing Activations

Correct vs Misclassified Samples:
# Pseudocode for comparative analysis
correct_activations = get_activation_maps(model, device, correct_image, layer)
incorrect_activations = get_activation_maps(model, device, incorrect_image, layer)

# Analyze differences
activation_diff = correct_activations - incorrect_activations
Insights from differences:
  • Filters that fire differently → discriminative for the class
  • Filters with similar response → class-agnostic features
  • Large differences in deep layers → high-level feature mismatch
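A runnable version of this comparison might rank filters by mean absolute activation difference; `rank_filters_by_difference` is an illustrative helper, assuming both maps come from the same layer and input size:

```python
import numpy as np

def rank_filters_by_difference(a: np.ndarray, b: np.ndarray, top_k: int = 5):
    """Rank filters by mean absolute activation difference between two samples.

    a, b: (num_filters, H, W) maps for the same layer (e.g. correct vs
    misclassified input). Large scores mark filters that fire differently.
    """
    scores = np.abs(a - b).reshape(a.shape[0], -1).mean(axis=1)
    order = np.argsort(scores)[::-1][:top_k]
    return [int(i) for i in order], scores

# Synthetic maps: filter 2 differs strongly between the two samples.
a = np.zeros((4, 8, 8))
b = np.zeros((4, 8, 8))
b[2] = 1.0
top, scores = rank_filters_by_difference(a, b, top_k=1)
print(top)  # [2]
```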

Receptive Field Understanding

Activation map size decreases with depth:
Layer 1 (e.g., 112×112):
  • Small receptive field (~3×3 pixels)
  • Each activation corresponds to local region
Layer 3 (e.g., 28×28):
  • Medium receptive field (~20×20 pixels)
  • Each activation sees larger context
Layer 5 (e.g., 7×7):
  • Large receptive field (~100×100 pixels)
  • Each activation sees most of the image
The receptive field determines how much context each filter can “see” — deeper layers integrate information from larger regions.
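The growth sketched above follows the standard recurrence: the receptive field increases by `(kernel - 1) * jump` at each layer, where `jump` is the cumulative stride. The layer specs below are illustrative `(kernel_size, stride)` pairs, not a specific architecture:

```python
# Receptive field growth through a stack of conv/pool layers.
layers = [
    ("conv1", 3, 1),
    ("pool1", 2, 2),
    ("conv2", 3, 1),
    ("pool2", 2, 2),
    ("conv3", 3, 1),
]

rf, jump = 1, 1  # receptive field and input-pixel step of one activation
for name, k, s in layers:
    rf += (k - 1) * jump   # kernel widens the field by (k-1) input-space steps
    jump *= s              # stride multiplies the step between activations
    print(f"{name}: receptive field {rf}x{rf}, stride {jump}")
```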

Relationship to Grad-CAM

Activation Maps vs Grad-CAM:
Aspect    | Activation Maps                | Grad-CAM
What      | Raw filter responses           | Gradient-weighted responses
Shows     | What each filter detects       | What influences the prediction
Focus     | Individual filters             | Class-specific importance
Use case  | Understanding learned features | Explaining predictions
Output    | Many maps (one per filter)     | Single heatmap
Use activation maps to understand what the model learned, and Grad-CAM to understand why it made a specific prediction.

Practical Usage Tips

Start Simple

  1. First layer: Verify edge detection and basic feature extraction
  2. Mid layer: Check for reasonable texture and pattern learning
  3. Last conv layer: Ensure semantic feature discrimination

Look for Problems

Warning signs:
  • All filters look identical (convergence failure)
  • No activations on test images (dead network)
  • Noisy, random-looking filters (undertrained)
  • Activations on irrelevant regions (spurious correlations)
Good signs:
  • Diverse filter patterns
  • Interpretable low-level features
  • Sparse but meaningful activations
  • Deep filters respond to class-relevant patterns
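The "all filters look identical" warning sign can be checked numerically via pairwise cosine similarity of the flattened filter weights; `max_filter_similarity` is an illustrative helper, and the `conv` layer is a randomly initialized example:

```python
import numpy as np
import torch.nn as nn

def max_filter_similarity(layer: nn.Conv2d) -> float:
    """Highest pairwise cosine similarity among a layer's filters.

    Values near 1.0 for many pairs suggest redundant (near-identical) filters.
    """
    w = layer.weight.detach().cpu().numpy()
    flat = w.reshape(w.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -1.0)  # ignore self-similarity
    return float(sim.max())

conv = nn.Conv2d(3, 16, kernel_size=3)  # hypothetical example layer
print(round(max_filter_similarity(conv), 3))
```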

Combine with Other Methods

With Grad-CAM

Check if important regions (Grad-CAM) have strong activations

With t-SNE

Understand which activations contribute to embedding space structure

Technical Considerations

Memory Usage

Activation maps can be large:
  • Layer with 512 filters at 28×28 resolution: ~400K values, ~1.6 MB in float32 per image
  • Caching maps for 100 images at that layer: ~160 MB
Mitigation:
  • Visualize subset of filters (e.g., top 50)
  • Downsample large activation maps
  • Process one image at a time
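Selecting a subset by response strength might look like this; `top_k_filters` is an illustrative helper, with "top" meaning largest mean activation:

```python
import numpy as np

def top_k_filters(acts: np.ndarray, k: int = 50) -> np.ndarray:
    """Select the k filters with the strongest mean response for display."""
    means = acts.reshape(acts.shape[0], -1).mean(axis=1)
    idx = np.argsort(means)[::-1][:k]
    return acts[idx]  # (k, H, W) subset, largest mean first

# 512 filters at 28x28 in float32; keep only the 50 strongest for display.
acts = np.random.rand(512, 28, 28).astype(np.float32)
subset = top_k_filters(acts, k=50)
print(subset.shape, subset.nbytes)  # (50, 28, 28) 156800
```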

Computational Cost

Extraction is fast:
  • Forward pass only (no gradients)
  • Minimal overhead from hooks
  • Can process many samples quickly

Hook Management

Important: Always remove hooks after use:
try:
    # ... activation extraction ...
finally:
    handle.remove()  # Always cleanup
Why?
  • Accumulating hooks slows down model
  • Causes memory leaks
  • Can interfere with training

Use Cases in Malware Analysis

Understanding Family Signatures

  • Visualize activations for multiple samples from same family
  • Identify consistently activated filters
  • Discover discriminative patterns for that family
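A sketch of the consistency analysis, assuming per-sample activation maps from the same layer; `consistent_filters` and its mean-minus-std score are illustrative choices, not a platform API:

```python
import numpy as np

def consistent_filters(sample_acts: list, top_k: int = 5) -> list:
    """Find filters that fire strongly and consistently across samples.

    sample_acts: per-sample (num_filters, H, W) maps from the same layer.
    Score = mean response minus its variability across the family's samples.
    """
    # (num_samples, num_filters): mean response per filter per sample
    means = np.stack([a.reshape(a.shape[0], -1).mean(axis=1)
                      for a in sample_acts])
    score = means.mean(axis=0) - means.std(axis=0)
    return [int(i) for i in np.argsort(score)[::-1][:top_k]]

# Synthetic family: filter 1 fires strongly in every sample, filter 0 varies.
samples = []
for strength in (0.1, 0.9, 0.2):
    a = np.zeros((3, 4, 4))
    a[0] = strength  # inconsistent filter
    a[1] = 0.8       # consistent, strong filter
    samples.append(a)
print(consistent_filters(samples, top_k=1))  # [1]
```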

Detecting Adversarial Examples

  • Compare activations of normal vs adversarial samples
  • Adversarial perturbations often cause unusual activation patterns
  • Useful for building robust detectors

Model Debugging

  • Check if early layers learn reasonable features
  • Verify deep layers develop class-specific detectors
  • Identify layers that need better regularization

Example Workflow

Step 1: Select Representative Samples
  • Choose correctly classified sample from each class
  • Choose misclassified samples
  • Choose borderline cases (low confidence)
Step 2: Visualize Early Layer
  • Check first conv layer
  • Verify edge detection and low-level features
  • Look for diverse, interpretable patterns
Step 3: Visualize Middle Layer
  • Examine texture and pattern learning
  • Identify discriminative mid-level features
  • Check activation sparsity
Step 4: Visualize Deep Layer
  • Analyze semantic feature detection
  • Compare activations across classes
  • Identify class-specific filters
Step 5: Cross-Reference with Grad-CAM
  • Ensure high Grad-CAM importance regions have strong activations
  • Verify semantic alignment between activation and attention
Step 6: Document Findings
  • Note which filters are most discriminative
  • Identify potential model weaknesses
  • Guide architecture or training improvements

Summary

Activation maps are essential for understanding hierarchical feature learning in CNNs. Combine with gradient-based methods for comprehensive model interpretability.