Overview
The Model Builder page (/model) provides an interactive interface for designing, configuring, and managing neural network architectures. Models are saved to a Model Library for reuse across experiments.
The Model Builder requires a configured dataset to determine the number of output classes. Complete Dataset Configuration first.
Page Structure
The Model page has two main sections:
- Model Library - Grid of saved models with edit/delete actions
- Model Editor - Design interface for creating or editing models
Model Library
The library displays all saved models as cards in a 3-column grid.
Model Card Contents
Each card shows:
- Model Name: User-defined identifier
- Model Type: Custom CNN, Transformer, or Transfer Learning
- Status Badge: Architecture-specific details
  - CNN: Number of convolutional blocks
  - Transformer: Depth and embedding dimension
  - Transfer Learning: Base model name (e.g., “ResNet50”)
Card Actions
- Select: Click anywhere on the card to load the model into the editor
  - Selected card is highlighted
  - Configuration loads into editor form
- Edit
- Delete
New Model Button
+ New Model (top-right, primary button)
- Opens editor in creation mode
- Resets all fields to defaults
Model Editor
The editor appears below the library when creating or editing a model.
Model Name Input
Enter a descriptive name for your model:
- Example: “ResNet50_v1”, “CNN_4blocks”, “ViT_Base”
- Required before saving
- Used in library cards and experiment selection
Architecture Type Selection
Choose from three architecture types using a segmented control:
- Custom CNN
- Transformer
- Transfer Learning
🔧 Convolutional Neural Network
Build a custom CNN with configurable convolutional blocks, pooling, and dense layers. Best for learning representations from scratch.
Use when:
- You have sufficient training data
- You want full control over architecture
- You’re experimenting with novel designs
Custom CNN Configuration
Architecture Builder
The CNN builder uses a visual block-based interface.
Input Layer
Automatically configured based on dataset preprocessing:
- Input Size: 224x224 (from dataset config)
- Channels: 3 (RGB) or 1 (Grayscale)
Convolutional Blocks
Add Block button creates a new convolutional block. Each block contains:
- Convolution
- Batch Normalization
- Pooling
- Dropout
Block parameters:
- Filters: Number of output channels (16-512)
- Kernel Size: 3x3 (default) or 5x5
- Activation: ReLU, LeakyReLU, or ELU
Block actions:
- Move Up: Reorder block earlier in the sequence
- Move Down: Reorder block later in the sequence
- Delete: Remove block (at least 1 block must remain)
A valid CNN requires at least 1 convolutional block and at least 1 dense layer.
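As a back-of-the-envelope check (illustrative only, not dashboard code): if each block ends in a 2x2 max-pool, the feature-map side is halved per block, which bounds how many blocks a 224x224 input can support.

```python
def spatial_size(input_size: int, num_blocks: int, pool: int = 2) -> int:
    """Spatial width/height after `num_blocks` blocks that each end in a
    2x2 max-pool (integer division, as in a typical pooling layer)."""
    size = input_size
    for _ in range(num_blocks):
        size //= pool
    return size

# 224x224 input through 3 pooled blocks -> 28x28 feature maps
print(spatial_size(224, 3))  # 28
```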
Global Pooling
Between the convolutional and dense layers:
- Global Average Pooling: Pools each feature map to a single value
  - Reduces parameters and prevents overfitting
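The pooling operation itself is simple; a minimal pure-Python sketch (nested lists stand in for tensors):

```python
def global_average_pool(feature_maps):
    """Collapse each 2D feature map to its mean value,
    yielding one number per channel."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]

# Two 2x2 feature maps -> two pooled values, one per channel
print(global_average_pool([[[1, 3], [5, 7]], [[0, 0], [0, 8]]]))  # [4.0, 2.0]
```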
Dense Layers
Add Dense Layer button creates fully-connected layers. Each dense layer has:
- Units: Number of neurons (64-2048)
- Activation: ReLU, LeakyReLU, ELU, or None
- Dropout: Dropout probability (0.0-0.7)
Output Layer
Automatically configured:
- Units: Number of classes (from dataset config)
- Activation: Softmax for multi-class classification
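Softmax converts the output layer's raw scores into class probabilities that sum to 1. A standard, numerically stable implementation looks like this (illustrative sketch):

```python
import math

def softmax(logits):
    """Numerically stable softmax: shift by the max logit before
    exponentiating, then normalize so outputs sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)       # class probabilities, highest for the largest logit
print(sum(probs))  # ~1.0
```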
Architecture Validation
The dashboard validates your architecture in real-time.
✅ Valid Configuration:
- At least 1 convolutional block
- At least 1 dense layer
- Output layer matches dataset classes
❌ Invalid Configuration:
- Missing convolutional blocks
- Missing dense layers
- “Save to Library” button disabled
Example CNN Architectures
Lightweight CNN (Fast Training)
Blocks:
- Conv2D(32, 3x3) → BatchNorm → MaxPool → Dropout(0.25)
- Conv2D(64, 3x3) → BatchNorm → MaxPool → Dropout(0.25)
- GlobalAvgPool → Dense(128, ReLU) → Dropout(0.5) → Output
Standard CNN (Balanced)
Blocks:
- Conv2D(64, 3x3) → BatchNorm → MaxPool → Dropout(0.25)
- Conv2D(128, 3x3) → BatchNorm → MaxPool → Dropout(0.25)
- Conv2D(256, 3x3) → BatchNorm → MaxPool → Dropout(0.3)
- GlobalAvgPool → Dense(512, ReLU) → Dropout(0.5) → Dense(256, ReLU) → Dropout(0.5) → Output
Deep CNN (High Capacity)
Blocks:
- Conv2D(64, 3x3) → BatchNorm → Conv2D(64, 3x3) → BatchNorm → MaxPool → Dropout(0.2)
- Conv2D(128, 3x3) → BatchNorm → Conv2D(128, 3x3) → BatchNorm → MaxPool → Dropout(0.3)
- Conv2D(256, 3x3) → BatchNorm → Conv2D(256, 3x3) → BatchNorm → MaxPool → Dropout(0.4)
- Conv2D(512, 3x3) → BatchNorm → Conv2D(512, 3x3) → BatchNorm → MaxPool → Dropout(0.5)
- GlobalAvgPool → Dense(1024, ReLU) → Dropout(0.5) → Dense(512, ReLU) → Dropout(0.5) → Output
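Rough parameter counts for these example architectures can be computed by hand. A sketch (BatchNorm adds only 2 trainable values per channel and is ignored here):

```python
def conv2d_params(in_ch: int, out_ch: int, k: int = 3) -> int:
    """Conv2D weights: k*k*in_ch per filter, plus one bias per filter."""
    return (k * k * in_ch + 1) * out_ch

def dense_params(in_features: int, out_features: int) -> int:
    """Dense weights: full weight matrix plus one bias per output unit."""
    return (in_features + 1) * out_features

# First two blocks of the lightweight CNN (RGB input, 3 channels)
print(conv2d_params(3, 32))   # 896
print(conv2d_params(32, 64))  # 18496
# Dense(128) fed by 64 globally pooled channels
print(dense_params(64, 128))  # 8320
```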
Transformer Configuration
Vision Transformer (ViT) architecture for malware classification.
Patch Embedding
- Patch Size: 16x16 (default) or 8x8
  - 16x16: 224/16 = 14 patches per side → 14×14 = 196 patches
  - 8x8: 224/8 = 28 patches per side → 28×28 = 784 patches (more compute)
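The patch arithmetic above generalizes to any square image and patch size:

```python
def num_patches(image_size: int = 224, patch_size: int = 16) -> int:
    """Number of non-overlapping square patches a square image splits into."""
    per_side = image_size // patch_size
    return per_side * per_side

print(num_patches(224, 16))  # 196
print(num_patches(224, 8))   # 784
```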
Transformer Encoder
- Embedding Dimension
  - 768 (ViT-Base, default)
  - 512 (ViT-Small)
  - 1024 (ViT-Large)
- Depth (Layers)
- Attention Heads
- MLP Ratio
- Dropout
Transformer Presets
ViT-Small
- Patch: 16x16
- Embed: 512
- Depth: 6
- Heads: 8
- Parameters: ~22M
ViT-Base
- Patch: 16x16
- Embed: 768
- Depth: 12
- Heads: 12
- Parameters: ~86M
ViT-Large
- Patch: 16x16
- Embed: 1024
- Depth: 24
- Heads: 16
- Parameters: ~307M
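The preset parameter counts can be sanity-checked with a rough encoder-only estimate: per layer, self-attention uses about 4·d² weights (Q, K, V, and output projections) and the MLP about 2·ratio·d². This ignores patch/position embeddings and LayerNorms, so it slightly undercounts, but it lands close to the totals above.

```python
def encoder_params(embed_dim: int, depth: int, mlp_ratio: int = 4) -> int:
    """Approximate transformer-encoder weight count:
    depth * (4*d^2 attention + 2*ratio*d^2 MLP)."""
    per_layer = 4 * embed_dim ** 2 + 2 * mlp_ratio * embed_dim ** 2
    return depth * per_layer

print(round(encoder_params(768, 12) / 1e6))   # ~85M  (ViT-Base is ~86M total)
print(round(encoder_params(1024, 24) / 1e6))  # ~302M (ViT-Large is ~307M total)
```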
Transfer Learning Configuration
Use pretrained ImageNet models as starting points.
Base Model Selection
- ResNet Family
  - ResNet50 (25.6M params)
  - ResNet101 (44.5M params)
  - ResNet152 (60.2M params)
- EfficientNet Family
- VGG Family
- MobileNet
Pretrained Weights
- ImageNet: Weights trained on ImageNet-1k (default)
- Random: Initialize randomly (no transfer learning)
Fine-Tuning Strategy
- Feature Extraction (Recommended): Freeze all base model layers, train only the new classifier head.
  - Fast training
  - Works with small datasets
  - Base model acts as a fixed feature extractor
- Fine-Tuning
- Full Fine-Tuning
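To see what feature extraction buys, compare trainable parameter counts. The arithmetic below assumes ResNet50's pooled feature vector (2048-dimensional), a 512-unit dense layer, and a hypothetical 10-class dataset:

```python
def dense_params(in_features: int, out_features: int) -> int:
    """Dense layer weights plus one bias per output unit."""
    return (in_features + 1) * out_features

BASE_PARAMS = 25_600_000  # ResNet50 backbone, frozen under feature extraction

# New classifier head: Dense(2048 -> 512) then Dense(512 -> 10 classes)
head = dense_params(2048, 512) + dense_params(512, 10)

print(head)                # 1054218 trainable parameters
print(head / BASE_PARAMS)  # only ~4% of the backbone's size
```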
Classifier Head Configuration
Global Pooling
- Enable: Adds GlobalAveragePooling2D after base model
- Recommended for reducing parameters
Dense Layer
- Add Dense Layer: Checkbox to add intermediate dense layer
- Dense Units: 256-2048 neurons (default: 512)
- Provides additional capacity for malware-specific features
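The transfer-learning choices above can be collected into a single configuration sketch (key names are illustrative, not the dashboard's actual schema):

```python
# Illustrative feature-extraction configuration; field names are assumptions.
transfer_config = {
    "base_model": "ResNet50",
    "weights": "imagenet",             # pretrained on ImageNet-1k (default)
    "strategy": "feature_extraction",  # freeze the backbone
    "global_pooling": True,            # GlobalAveragePooling2D after the base
    "dense_units": 512,                # optional intermediate dense layer
}
print(transfer_config["base_model"])
```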
Transfer Learning Example
ResNet50 Feature Extraction Setup: ResNet50 base model, ImageNet weights, Feature Extraction strategy, global average pooling enabled, 512-unit dense layer.
Model Summary
After configuring your architecture, a Model Summary appears showing:
Architecture Overview
- Custom CNN
- Transformer
- Transfer Learning
Details shown include:
- Number of convolutional blocks
- Number of dense layers
- Total parameters (trainable + non-trainable)
- Model depth
Parameter Count
Displays:
- Total Parameters: Sum of all weights
- Trainable Parameters: Parameters updated during training
- Non-Trainable Parameters: Frozen weights (transfer learning)
Parameter count helps estimate training time and memory requirements. More parameters = longer training and more GPU memory.
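A rough memory estimate follows directly from the parameter count (illustrative: fp32 weights only; training needs several times more for gradients and optimizer state):

```python
def weights_megabytes(num_params: int, bytes_per_param: int = 4) -> float:
    """Approximate weight storage in MB, assuming 32-bit floats."""
    return num_params * bytes_per_param / 1e6

print(weights_megabytes(25_600_000))  # 102.4 -> ResNet50's weights alone
```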
Saving Models
Save to Library Button
Save to Library or Update Model (bottom-right)
Requirements:
- Model name is not empty
- Architecture is valid (CNN only)
- All required fields completed
On save:
- Model added to library with unique ID
- Configuration saved to session state
- Model card appears in library grid
- Success message: “Model '' saved to library!”
- Editor closes automatically
Tips & Best Practices
Next Steps
After saving your model:
Training Configuration
Configure optimizers, learning rates, and training callbacks