Overview

The Model Builder page (/model) provides an interactive interface for designing, configuring, and managing neural network architectures. Models are saved to a Model Library for reuse across experiments.
The Model Builder requires a configured dataset to determine the number of output classes. Complete Dataset Configuration first.

Page Structure

The Model page has two main sections:
  1. Model Library - Grid of saved models with edit/delete actions
  2. Model Editor - Design interface for creating or editing models

Model Library

The library displays all saved models as cards in a 3-column grid.

Model Card Contents

Each card shows:
  • Model Name: User-defined identifier
  • Model Type: Custom CNN, Transformer, or Transfer Learning
  • Status Badge: Architecture-specific details
    • CNN: Number of convolutional blocks
    • Transformer: Depth and embedding dimension
    • Transfer Learning: Base model name (e.g., “ResNet50”)

Card Actions

Click anywhere on a card to load that model into the editor:
  • Selected card is highlighted
  • Configuration loads into editor form

New Model Button

+ New Model (top-right, primary button)
  • Opens editor in creation mode
  • Resets all fields to defaults

Model Editor

The editor appears below the library when creating or editing a model.

Model Name Input

Enter a descriptive name for your model:
  • Example: “ResNet50_v1”, “CNN_4blocks”, “ViT_Base”
  • Required before saving
  • Used in library cards and experiment selection

Architecture Type Selection

Choose from three architecture types using the segmented control:

🔧 Convolutional Neural Network
Build a custom CNN with configurable convolutional blocks, pooling, and dense layers. Best for learning representations from scratch.
Use when:
  • You have sufficient training data
  • You want full control over architecture
  • You’re experimenting with novel designs

Custom CNN Configuration

Architecture Builder

The CNN builder uses a visual block-based interface.

Input Layer

Automatically configured based on dataset preprocessing:
  • Input Size: 224x224 (from dataset config)
  • Channels: 3 (RGB) or 1 (Grayscale)

Convolutional Blocks

Add Block button creates a new convolutional block. Each block contains:
  • Filters: Number of output channels (16-512)
  • Kernel Size: 3x3 (default) or 5x5
  • Activation: ReLU, LeakyReLU, or ELU
Block Actions:
  • Move Up: Reorder block earlier in sequence
  • Move Down: Reorder block later in sequence
  • Delete: Remove block (requires ≥1 block)
A valid CNN requires at least 1 convolutional block and at least 1 dense layer.
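A block and its reorder actions can be pictured as plain data. The sketch below is illustrative only: the field names and ranges mirror the options listed above, not the dashboard's actual schema.

```python
# Hypothetical representation of a convolutional block and the "Move Up"
# action; field names are illustrative, not the dashboard's internal schema.
def make_block(filters, kernel_size=3, activation="relu"):
    assert 16 <= filters <= 512, "Filters must be in the 16-512 range"
    assert kernel_size in (3, 5), "Kernel size is 3x3 or 5x5"
    assert activation in ("relu", "leaky_relu", "elu")
    return {"filters": filters, "kernel_size": kernel_size, "activation": activation}

def move_up(blocks, i):
    """Swap block i with its predecessor (the 'Move Up' action)."""
    if i > 0:
        blocks[i - 1], blocks[i] = blocks[i], blocks[i - 1]
    return blocks

blocks = [make_block(32), make_block(64), make_block(128)]
blocks = move_up(blocks, 2)  # the 128-filter block now sits second
```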

Global Pooling

Between convolutional and dense layers:
  • Global Average Pooling: Pools each feature map to a single value
  • Reduces parameters and prevents overfitting
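The parameter saving is easy to see with a small calculation. Assuming an illustrative 7x7x256 final feature map feeding a 128-unit dense layer:

```python
# Weight counts for the first dense layer after a 7x7x256 feature map:
# Flatten feeds every spatial value forward, GAP feeds one value per channel.
h, w, c, units = 7, 7, 256, 128

flatten_weights = h * w * c * units  # 7*7*256 inputs -> 128 units
gap_weights = c * units              # 256 inputs -> 128 units

print(flatten_weights)  # 1605632
print(gap_weights)      # 32768
```

GAP cuts this layer's weight count by a factor of h*w (49 here), which is why it is the default between the convolutional and dense stages.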

Dense Layers

Add Dense Layer button creates fully-connected layers. Each dense layer has:
  • Units: Number of neurons (64-2048)
  • Activation: ReLU, LeakyReLU, ELU, or None
  • Dropout: Dropout probability (0.0-0.7)
Actions: Move Up, Move Down, Delete

Output Layer

Automatically configured:
  • Units: Number of classes (from dataset config)
  • Activation: Softmax for multi-class classification
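Softmax turns the output layer's raw scores into class probabilities. A minimal, numerically stable version in plain Python:

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
# Probabilities sum to 1; the largest logit gets the largest probability.
```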

Architecture Validation

The dashboard validates your architecture in real time.
Valid Configuration:
  • At least 1 convolutional block
  • At least 1 dense layer
  • Output layer matches dataset classes
Invalid Configuration:
  • Missing convolutional blocks
  • Missing dense layers
  • “Save to Library” button disabled
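The validation rules above amount to three checks. A minimal sketch, assuming a dict-based config layout (the real dashboard's schema may differ):

```python
# Sketch of the real-time validation rules; the dict layout is assumed.
def is_valid_cnn(config, num_classes):
    has_conv = len(config.get("conv_blocks", [])) >= 1
    has_dense = len(config.get("dense_layers", [])) >= 1
    output_ok = config.get("output_units") == num_classes
    return has_conv and has_dense and output_ok

cfg = {
    "conv_blocks": [{"filters": 32}],
    "dense_layers": [{"units": 128}],
    "output_units": 25,
}
assert is_valid_cnn(cfg, 25)                      # Save to Library enabled
assert not is_valid_cnn({"conv_blocks": []}, 25)  # missing blocks -> disabled
```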

Example CNN Architectures

Lightweight (2 blocks)

Blocks:
  1. Conv2D(32, 3x3) → BatchNorm → MaxPool → Dropout(0.25)
  2. Conv2D(64, 3x3) → BatchNorm → MaxPool → Dropout(0.25)
Dense:
  • GlobalAvgPool → Dense(128, ReLU) → Dropout(0.5) → Output
Parameters: ~50k

Medium (3 blocks)

Blocks:
  1. Conv2D(64, 3x3) → BatchNorm → MaxPool → Dropout(0.25)
  2. Conv2D(128, 3x3) → BatchNorm → MaxPool → Dropout(0.25)
  3. Conv2D(256, 3x3) → BatchNorm → MaxPool → Dropout(0.3)
Dense:
  • GlobalAvgPool → Dense(512, ReLU) → Dropout(0.5) → Dense(256, ReLU) → Dropout(0.5) → Output
Parameters: ~500k

Deep (4 blocks)

Blocks:
  1. Conv2D(64, 3x3) → BatchNorm → Conv2D(64, 3x3) → BatchNorm → MaxPool → Dropout(0.2)
  2. Conv2D(128, 3x3) → BatchNorm → Conv2D(128, 3x3) → BatchNorm → MaxPool → Dropout(0.3)
  3. Conv2D(256, 3x3) → BatchNorm → Conv2D(256, 3x3) → BatchNorm → MaxPool → Dropout(0.4)
  4. Conv2D(512, 3x3) → BatchNorm → Conv2D(512, 3x3) → BatchNorm → MaxPool → Dropout(0.5)
Dense:
  • GlobalAvgPool → Dense(1024, ReLU) → Dropout(0.5) → Dense(512, ReLU) → Dropout(0.5) → Output
Parameters: ~5M
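The per-layer counts behind these totals follow the standard Keras-style formulas (weights plus biases; BatchNorm adds four values per channel). The exact totals also depend on the number of classes in your dataset, so treat the figures above as ballpark estimates.

```python
# Standard per-layer parameter-count formulas (weights + biases).
def conv2d_params(in_ch, filters, k=3):
    return k * k * in_ch * filters + filters

def dense_params(in_units, units):
    return in_units * units + units

def batchnorm_params(channels):
    return 4 * channels  # gamma, beta, moving mean, moving variance

print(conv2d_params(3, 32))   # 896  -- first block of the lightweight example
print(dense_params(64, 128))  # 8320 -- GAP over 64 channels into Dense(128)
```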

Transformer Configuration

Vision Transformer (ViT) architecture for malware classification.

Patch Embedding

  • Patch Size: 16x16 (default) or 8x8
    • 16x16: 224/16 = 14×14 = 196 patches
    • 8x8: 224/8 = 28×28 = 784 patches (more compute)
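The patch counts above come from simple integer arithmetic:

```python
# Patch-count arithmetic for a square input image.
def num_patches(image_size=224, patch_size=16):
    per_side = image_size // patch_size
    return per_side * per_side

print(num_patches(224, 16))  # 196
print(num_patches(224, 8))   # 784
```

Quadrupling the patch count (8x8 patches) quadruples the sequence length, and self-attention cost grows quadratically with sequence length, hence the "more compute" note.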

Transformer Encoder

Embedding Dimension (dimension of patch embeddings and hidden states):
  • 768 (ViT-Base, default)
  • 512 (ViT-Small)
  • 1024 (ViT-Large)

Transformer Presets

ViT-Small

  • Patch: 16x16
  • Embed: 512
  • Depth: 6
  • Heads: 8
  • Parameters: ~22M

ViT-Base

  • Patch: 16x16
  • Embed: 768
  • Depth: 12
  • Heads: 12
  • Parameters: ~86M

ViT-Large

  • Patch: 16x16
  • Embed: 1024
  • Depth: 24
  • Heads: 16
  • Parameters: ~307M
Transformers require large datasets (>10k images) and significant GPU memory. Start with ViT-Small for initial experiments.
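The preset totals can be sanity-checked with a common back-of-the-envelope rule: each encoder layer has roughly 12 * embed_dim^2 parameters (the four attention projections plus a 4x-expansion MLP), ignoring embeddings, biases, and LayerNorms.

```python
# Rough encoder parameter estimate: ~12 * d^2 per layer (4*d^2 attention
# projections + 8*d^2 for a 4x MLP), ignoring embeddings and LayerNorms.
def vit_encoder_params(embed_dim, depth):
    return depth * 12 * embed_dim ** 2

print(vit_encoder_params(768, 12) / 1e6)   # ~84.9M, close to ViT-Base's ~86M
print(vit_encoder_params(1024, 24) / 1e6)  # ~302M, close to ViT-Large's ~307M
```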

Transfer Learning Configuration

Use pretrained ImageNet models as starting points.

Base Model Selection

  • ResNet50 (25.6M params)
  • ResNet101 (44.5M params)
  • ResNet152 (60.2M params)
Deep residual networks with skip connections. Excellent for image classification.

Pretrained Weights

  • ImageNet: Weights trained on ImageNet-1k (default)
  • Random: Initialize randomly (no transfer learning)

Fine-Tuning Strategy

Classifier Head Configuration

Configure the classifier head in three steps:

1. Global Pooling
  • Enable: Adds GlobalAveragePooling2D after base model
  • Recommended for reducing parameters

2. Dense Layer
  • Add Dense Layer: Checkbox to add an intermediate dense layer
  • Dense Units: 256-2048 neurons (default: 512)
  • Provides additional capacity for malware-specific features

3. Dropout
  • Dropout Rate: 0.0-0.7 (default: 0.5)
  • Regularization before output layer

Transfer Learning Example

ResNet50 Feature Extraction Setup:
Base Model: ResNet50
Weights: ImageNet
Strategy: Feature Extraction
Global Pooling: Enabled
Dense Layer: 512 units, ReLU
Dropout: 0.5
Output: 25 classes (Softmax)

Trainable Parameters: ~1.1M (classifier only: 2048x512 dense layer plus 25-class output)
Frozen Parameters: ~23M (ResNet50 base)

Model Summary

After configuring your architecture, a Model Summary appears showing:

Architecture Overview

  • Number of convolutional blocks
  • Number of dense layers
  • Total parameters (trainable + non-trainable)
  • Model depth

Parameter Count

Displays:
  • Total Parameters: Sum of all weights
  • Trainable Parameters: Parameters updated during training
  • Non-Trainable Parameters: Frozen weights (transfer learning)
Parameter count helps estimate training time and memory requirements. More parameters = longer training and more GPU memory.
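A quick way to turn a parameter count into a memory estimate: float32 weights take 4 bytes each, and Adam-style optimizers roughly triple that during training (weights plus two moment buffers). Activations add more on top, so treat this as a lower bound.

```python
# Memory estimate from a parameter count (float32 = 4 bytes per weight).
def weight_memory_mb(params, bytes_per_param=4):
    return params * bytes_per_param / 1024 ** 2

resnet50 = 25_600_000
print(round(weight_memory_mb(resnet50)))      # ~98 MB of weights
print(round(weight_memory_mb(resnet50) * 3))  # ~293 MB with Adam optimizer state
```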

Saving Models

Save to Library Button

Save to Library or Update Model (bottom-right)
Requirements:
  • Model name is not empty
  • Architecture is valid (CNN only)
  • All required fields completed
On Save:
  • Model added to library with unique ID
  • Configuration saved to session state
  • Model card appears in library grid
  • Success message: “Model '<name>' saved to library!”
  • Editor closes automatically
Create multiple model variations (e.g., “CNN_Light”, “CNN_Deep”, “ResNet50_v1”) to compare architectures in experiments.
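The save flow above reduces to: validate the name, mint a unique ID, and store the configuration. A minimal sketch with an in-memory library (field names are illustrative, not the dashboard's session-state schema):

```python
# Hypothetical sketch of the save flow; the library is a plain dict here.
import uuid

library = {}

def save_to_library(name, config):
    if not name:
        raise ValueError("Model name is required before saving")
    model_id = str(uuid.uuid4())  # unique ID for the new library entry
    library[model_id] = {"name": name, "config": config}
    print(f"Model '{name}' saved to library!")
    return model_id

mid = save_to_library("CNN_Light", {"type": "cnn", "blocks": 2})
```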

Tips & Best Practices

Start Simple: Begin with lightweight architectures (2-3 CNN blocks or ResNet50 feature extraction) before scaling up.
Use Transfer Learning for Small Datasets: If you have <5k images, use pretrained models with feature extraction.
Enable Batch Normalization: For CNNs, BatchNorm after each convolution stabilizes training.
Transformers require large datasets. With <10k images, use CNNs or Transfer Learning instead.
Global Average Pooling: Always use GlobalAvgPool instead of Flatten to reduce parameters and improve generalization.

Next Steps

After saving your model:

Training Configuration

Configure optimizers, learning rates, and training callbacks
