Overview
The `DiffusionProcess` class implements the complete diffusion process for image generation: noise scheduling, forward diffusion (adding noise), reverse diffusion (sampling), and training. It uses a cosine beta schedule and supports both DDPM and DDIM sampling strategies.
Constructor
Parameters
- Height and width of the square input images.
- Number of image channels (e.g., 1 for grayscale, 3 for RGB).
- List of hidden dimensions for each level of the U-Net encoder/decoder. The length determines the number of downsampling/upsampling blocks.
- Initial noise variance in the noise schedule. Lower values mean less noise at the start of the diffusion process.
- Final noise variance in the noise schedule; this sets the maximum noise level at the final diffusion step.
- Total number of diffusion timesteps. More steps give smoother transitions but slower sampling.
- Device to run computations on (CPU or CUDA GPU).
Attributes
After initialization, the following attributes are available:

- Cosine beta schedule tensor of shape `[noise_steps]`, defining the noise variance at each timestep.
- Alpha values, computed as `1.0 - beta_schedule`.
- Cumulative product of the alpha values (`alpha_cumprod`), used in the forward diffusion equation.
- Square root of `alpha_cumprod`, precomputed for efficiency.
- Square root of `1 - alpha_cumprod`, precomputed for efficiency.
- The U-Net model used for noise prediction.
- Adam optimizer with learning rate 1e-4.
- Gradient scaler for mixed precision training (CUDA only).
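For illustration, the schedule attributes above might be computed as in this NumPy sketch, assuming the cosine schedule of Nichol & Dhariwal (2021); the function and variable names are ours, not necessarily the class's:

```python
import numpy as np

def cosine_beta_schedule(noise_steps: int, s: float = 0.008) -> np.ndarray:
    """Betas derived from a cosine-shaped alpha_cumprod curve."""
    steps = np.arange(noise_steps + 1, dtype=np.float64)
    f = np.cos((steps / noise_steps + s) / (1 + s) * np.pi / 2) ** 2
    alpha_cumprod = f / f[0]                     # normalize so abar_0 = 1
    betas = 1.0 - alpha_cumprod[1:] / alpha_cumprod[:-1]
    return np.clip(betas, 0.0, 0.999)            # avoid singular final steps

betas = cosine_beta_schedule(1000)
alphas = 1.0 - betas
alpha_cumprod = np.cumprod(alphas)
sqrt_alpha_cumprod = np.sqrt(alpha_cumprod)              # precomputed
sqrt_one_minus_alpha_cumprod = np.sqrt(1.0 - alpha_cumprod)
```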
Methods
add_noise
Add noise to clean images according to the forward diffusion process.

Parameters
- Clean images tensor of shape `[batch_size, channels, height, width]`.
- Timesteps tensor of shape `[batch_size]` containing integer timestep indices.

Returns
- Noisy images at timestep `t`, shape `[batch_size, channels, height, width]`.
- The Gaussian noise that was added, shape `[batch_size, channels, height, width]`.

Implementation
Uses the forward diffusion equation `x_t = sqrt(alpha_cumprod_t) * x_0 + sqrt(1 - alpha_cumprod_t) * eps`, where `eps ~ N(0, I)`.
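A NumPy sketch of this forward step (a toy linear schedule stands in for the class's cosine schedule; names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(x0, t, sqrt_ac, sqrt_1m_ac):
    """Forward diffusion: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    # broadcast per-sample coefficients over channel/spatial dims
    c1 = sqrt_ac[t][:, None, None, None]
    c2 = sqrt_1m_ac[t][:, None, None, None]
    return c1 * x0 + c2 * eps, eps

betas = np.linspace(1e-4, 0.02, 1000)            # toy linear schedule
ac = np.cumprod(1.0 - betas)
x0 = rng.standard_normal((4, 3, 8, 8))
t = rng.integers(0, 1000, size=4)                # one timestep per image
xt, eps = add_noise(x0, t, np.sqrt(ac), np.sqrt(1.0 - ac))
```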
sample
Generate new samples using DDPM reverse diffusion.

Parameters
- Number of images to generate.

Returns
Generated images tensor of shape `[num_samples, channels, image_size, image_size]`, with values clamped to [-1, 1].

Implementation
Starts from random Gaussian noise and iteratively denoises over `noise_steps` timesteps in reverse order, using the DDPM sampling algorithm: the predicted noise is used to compute the mean and variance at each step.
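Stripped to its essentials, the DDPM reverse loop might look like this NumPy sketch (the zero predictor stands in for the trained U-Net; names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def ddpm_sample(predict_noise, num_samples, channels, image_size, betas):
    """DDPM ancestral sampling (Ho et al., 2020): start from pure noise
    and denoise step by step using the predicted noise."""
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)
    x = rng.standard_normal((num_samples, channels, image_size, image_size))
    for t in range(len(betas) - 1, -1, -1):
        eps_hat = predict_noise(x, t)
        # posterior mean; sigma_t^2 = beta_t variance choice
        mean = (x - betas[t] / np.sqrt(1.0 - abar[t]) * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise    # no noise at the final step
    return np.clip(x, -1.0, 1.0)

# zero predictor stands in for the trained U-Net
samples = ddpm_sample(lambda x, t: np.zeros_like(x), 2, 1, 8,
                      np.linspace(1e-4, 0.02, 50))
```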
sample_ddim
Generate samples using DDIM (Denoising Diffusion Implicit Models) for faster sampling.

Parameters
- Number of images to generate.
- Number of denoising steps. Fewer steps mean faster sampling. Must be in the range (0, noise_steps].
- Stochasticity parameter: `eta=0` produces deterministic DDIM sampling, while `eta=1` recovers DDPM-like stochastic behavior.

Returns
Generated images tensor of shape `[num_samples, channels, image_size, image_size]`, with values clamped to [-1, 1].

Raises
ValueError: If `ddim_steps` is not in the valid range.
Implementation
Based on “Denoising Diffusion Implicit Models” (Song et al., 2020). Skipping timesteps allows faster sampling while maintaining quality. The deterministic variant (`eta=0`) produces consistent outputs for the same noise input.
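A NumPy sketch of the DDIM update over a strided subsequence of timesteps, using the sigma formula from Song et al. (2020); the zero predictor and all names are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def ddim_sample(predict_noise, x, abar, ddim_steps, eta=0.0):
    """DDIM sampling over a strided subsequence of the timesteps."""
    ts = np.linspace(0, len(abar) - 1, ddim_steps, dtype=int)[::-1]
    for i, t in enumerate(ts):
        t_prev = ts[i + 1] if i + 1 < len(ts) else -1
        a_t = abar[t]
        a_prev = abar[t_prev] if t_prev >= 0 else 1.0
        eps_hat = predict_noise(x, t)
        # predicted clean image, then the DDIM update toward timestep t_prev
        x0_hat = (x - np.sqrt(1 - a_t) * eps_hat) / np.sqrt(a_t)
        sigma = eta * np.sqrt((1 - a_prev) / (1 - a_t) * (1 - a_t / a_prev))
        dir_xt = np.sqrt(max(1 - a_prev - sigma**2, 0.0)) * eps_hat
        noise = sigma * rng.standard_normal(x.shape) if t_prev >= 0 else 0.0
        x = np.sqrt(a_prev) * x0_hat + dir_xt + noise
    return np.clip(x, -1.0, 1.0)

betas = np.linspace(1e-4, 0.02, 50)              # toy schedule
abar = np.cumprod(1.0 - betas)
x_T = rng.standard_normal((2, 1, 8, 8))
out = ddim_sample(lambda z, t: np.zeros_like(z), x_T, abar, ddim_steps=10)
```

With `eta=0` the update adds no noise, so repeated calls with the same starting noise return identical samples.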
train_step
Perform one training step for the diffusion model.

Parameters
- Clean images tensor of shape `[batch_size, channels, height, width]`.

Returns
MSE loss value between the predicted noise and the actual noise.

Implementation
- Samples a random timestep for each image in the batch
- Adds noise to the images using `add_noise()`
- Predicts the noise with the U-Net model
- Computes the MSE loss between predicted and actual noise
- Performs backpropagation, with optional mixed precision (AMP)
- Updates the model parameters via the optimizer
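The steps above can be sketched as follows (a minimal PyTorch sketch without AMP; the stand-in conv model and all names are ours, not the repo's):

```python
import torch

def train_step(model, optimizer, images, abar, noise_steps):
    """One DDPM training step: sample t, add noise, predict it, MSE loss."""
    b = images.shape[0]
    t = torch.randint(0, noise_steps, (b,))      # one timestep per image
    eps = torch.randn_like(images)
    c1 = abar[t].sqrt().view(b, 1, 1, 1)
    c2 = (1.0 - abar[t]).sqrt().view(b, 1, 1, 1)
    x_t = c1 * images + c2 * eps                 # forward diffusion
    loss = torch.nn.functional.mse_loss(model(x_t, t), eps)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# tiny stand-in model (ignores t), for illustration only
conv = torch.nn.Conv2d(1, 1, 3, padding=1)
opt = torch.optim.Adam(conv.parameters(), lr=1e-4)
betas = torch.linspace(1e-4, 0.02, 100)
abar = torch.cumprod(1.0 - betas, dim=0)
loss = train_step(lambda x, t: conv(x), opt, torch.randn(4, 1, 8, 8), abar, 100)
```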
Usage example
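Illustrative only; the method names follow the sections above, but the constructor argument names are assumptions, not verified against the source:

```python
# Constructor keyword names below are assumed from the parameter
# descriptions; check the source for the exact signature.
diffusion = DiffusionProcess(
    img_size=64,
    channels=3,
    hidden_dims=[64, 128, 256],
    beta_start=1e-4,
    beta_end=0.02,
    noise_steps=1000,
    device="cuda",
)

for images, _ in dataloader:          # images normalized to [-1, 1]
    loss = diffusion.train_step(images.to("cuda"))

samples = diffusion.sample(num_samples=16)                     # DDPM
fast = diffusion.sample_ddim(num_samples=16, ddim_steps=50, eta=0.0)  # DDIM
```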
Related classes
- `DiffusionModel` - The U-Net architecture used for noise prediction
- `DiffusionProcessCIFAR` - CIFAR-10 variant with a linear beta schedule and EMA