Prerequisites
Before you begin, make sure you have:

Python 3.10-3.12
Any reasonably recent version in the 3.10-3.12 range should work
CUDA GPU (recommended)
Optional but strongly recommended for faster training. CPU training is supported but much slower.
The code automatically detects CUDA and uses GPU acceleration when available. No manual configuration needed.
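The selection follows the usual PyTorch idiom (a sketch, not the repository's exact code):

```python
import torch

# Use the GPU when CUDA is available, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
```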
Local installation
Create a virtual environment
It’s recommended to use a virtual environment to avoid dependency conflicts:
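For example, on Linux or macOS (the environment name .venv is just a convention):

```shell
python3 -m venv .venv        # create the environment
source .venv/bin/activate    # activate it
```

On Windows, activate with .venv\Scripts\activate instead.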
Install dependencies
Install all required packages from requirements.txt (pip install -r requirements.txt). This installs:
- torch - PyTorch deep learning framework
- torchvision - Image datasets and transformations
- torchaudio - Audio processing utilities
- matplotlib - Plotting and visualization
- tqdm - Progress bars for training loops
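After installing, you can verify the setup and check whether PyTorch sees your GPU:

```python
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```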
Installing PyTorch with CUDA
If you have a CUDA-enabled GPU but the above shows CUDA available: False, you may need to install PyTorch with CUDA support explicitly.
- CUDA 11.8
- CUDA 12.1
- CPU only
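The usual commands are below; pick the build that matches your driver (check with nvidia-smi), and note that the exact CUDA versions offered by PyTorch change over time:

```shell
# CUDA 11.8
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# CUDA 12.1
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# CPU only
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
```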
Dataset setup
The datasets are automatically downloaded on the first run. No manual setup required!
MNIST dataset
Dataset size: ~50 MB

When you run python src/training/train_diffusion.py for the first time, the code will automatically:
- Download MNIST from torchvision.datasets
- Save it to data/MNIST/
- Process and cache the data
CIFAR-10 dataset
Dataset size: ~170 MB

Similarly, CIFAR-10 is downloaded automatically when running the CIFAR training script (python src/training/train_diffusion_cifar.py).
HPC cluster setup
If you’re using an HPC cluster with SLURM, you can use the provided batch scripts.

Environment modules
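A typical setup looks like this; the module names and versions here are assumptions for illustration:

```shell
module load cuda/12.1
module load python/3.11
```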
Most clusters provide CUDA and Python through environment modules. Module names vary by cluster; check your cluster’s documentation or run module avail to see what’s available.

SLURM scripts
The repository includes ready-to-use SLURM scripts in the slurm/ directory:
Example SLURM configuration
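The sketch below shows the usual shape of such a script; the partition, module names, and time limit are assumptions to adapt to your cluster:

```shell
#!/bin/bash
#SBATCH --job-name=diffusion-mnist
#SBATCH --partition=gpu            # assumption: partition names vary by cluster
#SBATCH --gres=gpu:1
#SBATCH --time=01:00:00
#SBATCH --output=logs/%x-%j.out

module load cuda/12.1              # assumption: check module avail
module load python/3.11

source .venv/bin/activate
python src/training/train_diffusion.py
```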
A typical script requests a GPU, loads the required modules, and launches a training run; adapt the partition names and time limits to your cluster.

Project structure
After installation, your directory should look like this:
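A sketch of the layout, reconstructed from the files referenced in this guide:

```
.
├── requirements.txt
├── data/                          # datasets download here on first run
├── slurm/
│   ├── run_diffusion_mnist.slurm
│   ├── run_diffusion_cifar.slurm
│   └── run_ddim_comparison.slurm
└── src/
    ├── models/
    │   ├── diffusion.py
    │   └── diffusion_cifar.py
    ├── training/
    │   ├── train_diffusion.py
    │   └── train_diffusion_cifar.py
    └── utilities/
        ├── ddim_comparison_mnist.py
        ├── ddim_comparison_cifar.py
        └── interpolation_and_timesteps.py
```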
Core models (src/models/)
Contains the U-Net architectures and diffusion processes:
- diffusion.py - MNIST model with cosine beta schedule
- diffusion_cifar.py - CIFAR-10 model with linear schedule and EMA
Training scripts (src/training/)
Main entry points for training:
- train_diffusion.py - Train MNIST DDPM (50 epochs, ~5-10 min)
- train_diffusion_cifar.py - Train CIFAR-10 DDPM (2000 epochs, GPU required)
Utilities (src/utilities/)
Analysis and comparison scripts:
- ddim_comparison_mnist.py - Benchmark DDPM vs DDIM on MNIST
- ddim_comparison_cifar.py - Benchmark DDPM vs DDIM on CIFAR-10
- interpolation_and_timesteps.py - Latent interpolation and timestep analysis
SLURM scripts (slurm/)
Ready-to-use batch scripts for HPC clusters:
- run_diffusion_mnist.slurm - MNIST training job
- run_diffusion_cifar.slurm - CIFAR-10 training job
- run_ddim_comparison.slurm - DDPM vs DDIM comparison jobs
Performance optimization
The code includes several optimizations for faster training:

Mixed precision training
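The pattern looks roughly like this (a sketch with illustrative names, not the repository's exact training loop; it degrades gracefully to full precision on CPU):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(10, 10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.randn(8, 10, device=device)
# The forward pass runs in reduced precision when CUDA is active.
with torch.autocast(device_type=device, enabled=(device == "cuda")):
    loss = model(x).square().mean()

scaler.scale(loss).backward()  # loss scaling avoids fp16 gradient underflow
scaler.step(optimizer)
scaler.update()
```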
Automatic mixed precision (AMP) is enabled when training runs on CUDA.

CUDNN benchmarking
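This is a one-line toggle; it helps when input shapes are fixed across batches, as they are here:

```python
import torch

# Let cuDNN try several convolution algorithms and cache the fastest one.
torch.backends.cudnn.benchmark = True
```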
The training scripts enable CUDNN benchmarking, which lets cuDNN pick the fastest convolution algorithms for the model’s fixed input sizes.

Data loading
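A sketch of the DataLoader settings involved (the dataset and batch size are illustrative):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(256, 3, 32, 32), torch.randint(0, 10, (256,)))
loader = DataLoader(
    dataset,
    batch_size=128,
    shuffle=True,
    num_workers=2,                         # load batches in parallel processes
    pin_memory=torch.cuda.is_available(),  # faster host-to-GPU transfers
    persistent_workers=True,               # keep workers alive across epochs
)
images, labels = next(iter(loader))
```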
Data loading uses multiple workers and pinned memory for throughput.

Troubleshooting
Out of memory errors
If you encounter CUDA out-of-memory errors:
- Reduce batch_size in the training script (default is 128)
- Reduce hidden_dims for a smaller model
- Use gradient accumulation to simulate larger batches
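A minimal sketch of gradient accumulation, with illustrative names (the training scripts don't necessarily implement this out of the box):

```python
import torch

# Simulate an effective batch of 128 using 4 micro-batches of 32.
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.Adam(model.parameters())
loss_fn = torch.nn.MSELoss()
batches = [(torch.randn(32, 10), torch.randn(32, 1)) for _ in range(8)]
accum_steps = 4
initial_weight = model.weight.detach().clone()

optimizer.zero_grad()
for step, (x, y) in enumerate(batches):
    loss = loss_fn(model(x), y) / accum_steps  # scale so gradients average out
    loss.backward()                            # gradients accumulate in .grad
    if (step + 1) % accum_steps == 0:
        optimizer.step()                       # one update per 4 micro-batches
        optimizer.zero_grad()
```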
Slow training on CPU
CPU training is significantly slower than GPU training. For MNIST:
- GPU: ~5-10 minutes
- CPU: ~30-60 minutes

If CPU training is too slow, consider:
- Using a smaller model with fewer hidden_dims
- Reducing the number of epochs
- Using Google Colab or Kaggle for free GPU access
Import errors
If you see ModuleNotFoundError, make sure you:
- Activated your virtual environment
- Installed all dependencies from requirements.txt
- Are running scripts from the project root directory
Next steps
Quick start
Train your first MNIST diffusion model in under 10 minutes
Introduction
Learn about the architecture and design philosophy