Advanced Malware Classification Platform

Train, evaluate, and interpret deep learning models for malware image classification with an interactive Streamlit dashboard and PyTorch backend.

Get started in minutes

Launch the interactive dashboard and start training malware classification models

1. Install dependencies

Clone the repository and install the Python dependencies with uv or pip (the pip command is shown):
git clone https://github.com/OverCV/UC-Intel-Final
cd UC-Intel-Final
pip install -e .

2. Launch the dashboard

Start the Streamlit dashboard to access the full ML workflow:
cd app
streamlit run main.py
The dashboard opens at http://localhost:8501, with pages for dataset configuration, model building, training, and evaluation.

3. Configure your dataset

Navigate to the Dataset page to scan your malware image directory, configure train/validation/test splits, and preview augmentation options.
The platform expects malware images organized by family, one subdirectory per family, in a structure like malware/family_name/*.png.
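
The layout and split step described above can be sketched in plain Python. This is an illustrative sketch only, not the platform's own scanner: the function names `scan_families` and `split_per_family`, the default fractions, and the seed are all assumptions made for the example.

```python
import random
from pathlib import Path


def scan_families(root):
    """Map each family sub-directory name to its sorted list of PNG files."""
    root = Path(root)
    families = {}
    for family_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        images = sorted(family_dir.glob("*.png"))
        if images:  # ignore folders that contain no PNG images
            families[family_dir.name] = images
    return families


def split_per_family(families, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle each family independently, then cut it into train/val/test.

    Splitting per family keeps every family represented in each split
    in roughly the same proportion (a simple stratified split).
    The fractions and seed are hypothetical defaults, not platform values.
    """
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for family, images in families.items():
        shuffled = list(images)
        rng.shuffle(shuffled)
        n_val = int(len(shuffled) * val_frac)
        n_test = int(len(shuffled) * test_frac)
        splits["val"] += [(p, family) for p in shuffled[:n_val]]
        splits["test"] += [(p, family) for p in shuffled[n_val:n_val + n_test]]
        splits["train"] += [(p, family) for p in shuffled[n_val + n_test:]]
    return splits
```

Calling `split_per_family(scan_families("malware"))` on a directory laid out as above would return three lists of `(path, family)` pairs. Very small families may end up with no validation or test samples because the per-family counts are rounded down.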

4. Train your first model

Select a model architecture (custom CNN, transfer learning, or transformer), configure training hyperparameters, and start training with real-time monitoring.

Explore the platform

Discover key features and capabilities

Interactive Dashboard

Multi-page Streamlit application for an end-to-end ML workflow with live training monitoring

Model Architectures

Custom CNNs, transfer learning with ResNet/EfficientNet, and vision transformers

Training Pipeline

Complete training orchestration with data augmentation, optimization, and checkpointing

Interpretability Tools

Grad-CAM, t-SNE embeddings, and activation maps for model understanding

Key features

Production-ready ML platform for malware research

Real-time Monitoring

Live training metrics with auto-refreshing charts, progress bars, and GPU memory tracking

Dataset Management

Automated scanning, split configuration, augmentation presets, and sample visualization

Model Evaluation

Comprehensive metrics, confusion matrices, ROC curves, and per-class performance analysis

GPU Acceleration

CUDA support with automatic device detection and memory-efficient training pipelines
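
As a rough illustration of what automatic device detection usually looks like with PyTorch, the sketch below picks CUDA when it is available and falls back to the CPU otherwise. The helper name `pick_device` is hypothetical, and the platform's actual detection logic may differ.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Training would run on: {device}")
```

In a PyTorch training loop the returned string would typically be passed to `model.to(device)` and to each batch tensor before the forward pass.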

Resources

Learn more about the project and research

Research Paper

Academic project report with methodology, experiments, and findings

API Reference

Complete Python API documentation for models, training, and components

GitHub Repository

View source code, notebooks, and report materials

Ready to start training models?

Launch the dashboard or dive into the API to build your malware classification pipeline.