
Welcome to MaxDiffusion

MaxDiffusion is a collection of reference implementations of latent diffusion models, written in pure Python/JAX, that run on XLA devices including Cloud TPUs and GPUs. MaxDiffusion aims to be a launching point for ambitious diffusion projects in both research and production.

Quickstart

Generate your first image in minutes

Training

Train and fine-tune diffusion models

Inference

Generate images and videos at scale

API Reference

Explore the complete API

Supported models

MaxDiffusion provides production-ready implementations for:
  • Stable Diffusion 1.x, 2.x, and XL - Training and inference
  • Flux Dev and Schnell - Training and inference with LoRA support
  • Wan 2.1/2.2 - Text-to-video and image-to-video generation
  • LTX-Video - Text-to-video and image-to-video generation
  • ControlNet - Spatial conditioning for SD 1.4 and SDXL
  • Dreambooth - Personalized fine-tuning for SD 1.x and 2.x

Key features

Multi-LoRA loading

Load and blend multiple LoRA adapters for inference
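Blending LoRA adapters amounts to summing scaled low-rank updates onto the base weights. A minimal NumPy sketch of the idea (the function name, shapes, and scales are illustrative, not MaxDiffusion's API):

```python
import numpy as np

def merge_loras(base_weight, adapters):
    """Merge multiple LoRA adapters into a base weight matrix.

    Each adapter contributes scale * (B @ A), a low-rank update:
    A has shape (rank, in_dim), B has shape (out_dim, rank).
    """
    merged = base_weight.copy()
    for A, B, scale in adapters:
        merged += scale * (B @ A)
    return merged

rng = np.random.default_rng(0)
out_dim, in_dim, rank = 8, 4, 2
W = rng.standard_normal((out_dim, in_dim))
# Two hypothetical adapters blended at different strengths.
style = (rng.standard_normal((rank, in_dim)),
         rng.standard_normal((out_dim, rank)), 0.7)
detail = (rng.standard_normal((rank, in_dim)),
          rng.standard_normal((out_dim, rank)), 0.3)
W_merged = merge_loras(W, [style, detail])
```

Because each update is rank-`rank`, storing adapters is far cheaper than storing full weight copies, which is what makes loading several at once practical.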

Flash attention

Optimized attention kernels for TPU and GPU

Distributed training

FSDP, data, and tensor parallelism for TPU Pods
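These parallelism modes are built on JAX's sharding machinery: a device mesh plus per-array partition specs, with XLA partitioning the computation automatically. A generic sketch of the underlying mechanism (not MaxDiffusion's actual training loop; on a TPU Pod slice the mesh would span many chips, on a single host it degenerates to 1x1):

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Arrange the available devices into a 2D (data, model) mesh.
devices = np.array(jax.devices()).reshape(-1, 1)
mesh = Mesh(devices, axis_names=("data", "model"))

# Shard the batch along "data" (data parallelism) and the weight
# matrix along "model" (tensor parallelism).
batch = jax.device_put(jnp.ones((8, 16)),
                       NamedSharding(mesh, P("data", None)))
weight = jax.device_put(jnp.ones((16, 32)),
                        NamedSharding(mesh, P(None, "model")))

# XLA partitions the matmul across the mesh and inserts the
# necessary collectives automatically.
out = batch @ weight
```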

Mixed precision

bfloat16 training with configurable precision
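The usual mixed-precision pattern keeps master parameters in float32, casts to bfloat16 for the forward/backward compute, and accumulates gradients back in float32. A toy sketch of that pattern in plain JAX (the loss and shapes are illustrative):

```python
import jax
import jax.numpy as jnp
from jax.tree_util import tree_map

def loss_fn(params_f32, x):
    # Cast float32 master params to bfloat16 for the compute.
    params = tree_map(lambda p: p.astype(jnp.bfloat16), params_f32)
    y = x.astype(jnp.bfloat16) @ params["w"]
    # Reduce in float32 to keep the loss numerically stable.
    return jnp.mean(y.astype(jnp.float32) ** 2)

params = {"w": jnp.ones((4, 4), dtype=jnp.float32)}
x = jnp.ones((2, 4))
# Gradients come back in float32, matching the master params.
grads = jax.grad(loss_fn)(params, x)
```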

Why MaxDiffusion?

Built for XLA devices

MaxDiffusion is designed from the ground up for Google Cloud TPUs and GPUs, with extensive optimizations for:
  • TPU v5p and v6e (Trillium) - Optimized flash attention block sizes and LIBTPU flags
  • Multi-host training - Scale to hundreds of TPU chips with XPK
  • Efficient memory usage - Gradient checkpointing and offloading strategies
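Gradient checkpointing in JAX is provided by `jax.checkpoint` (a.k.a. `jax.remat`), which drops intermediate activations in the forward pass and recomputes them during the backward pass, trading compute for memory. A minimal sketch of the primitive itself (MaxDiffusion selects checkpointing policies through its config, not this decorator directly):

```python
import jax
import jax.numpy as jnp

@jax.checkpoint
def block(x):
    # Intermediate activations of this block are not stored;
    # they are recomputed when gradients are taken.
    for _ in range(4):
        x = jnp.sin(x)
    return x

loss = lambda x: jnp.sum(block(x))
g = jax.grad(loss)(jnp.ones((8,)))
```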

Production ready

  • Pure JAX implementation - Full XLA compilation for maximum performance
  • HuggingFace compatibility - Load and save models in Diffusers format
  • Orbax checkpointing - Efficient distributed checkpointing
  • Comprehensive configuration - YAML-based config system
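Runs are driven by a YAML file whose values can be overridden on the command line. The fragment below is purely illustrative of the shape of such a config; the key names are hypothetical and not MaxDiffusion's actual schema:

```yaml
# Illustrative only -- key names are hypothetical, not the real schema.
run_name: sdxl-finetune-demo
output_dir: gs://my-bucket/checkpoints
per_device_batch_size: 4
learning_rate: 1.0e-5
max_train_steps: 10000
```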

Research friendly

  • Modular architecture - Easy to fork and modify
  • Latest models - Flux, Wan 2.1/2.2, LTX-Video support
  • Advanced features - LoRA, quantization, custom schedulers

MaxDiffusion started as a fork of HuggingFace Diffusers and maintains compatibility with HuggingFace models and pipelines.

Hardware support

Recommended for production
  • TPU v4, v5p, v6e (Trillium)
  • Single host and multi-host configurations
  • Flash attention optimized for TPU architecture
  • Async collectives for distributed training

What’s new?

  • 2026/01/29: Wan LoRA for inference is now supported
  • 2026/01/15: Wan2.1 and Wan2.2 img2vid generation is now supported
  • 2025/11/11: Wan2.2 txt2vid generation is now supported
  • 2025/10/14: NVIDIA DGX Spark Flux support
  • 2025/10/10: Wan2.1 txt2vid training and generation is now supported
  • 2025/08/14: LTX-Video img2vid generation is now supported
  • 2025/07/29: LTX-Video txt2vid generation is now supported
  • 2025/04/17: Flux fine-tuning
  • 2025/02/12: Flux LoRA for inference
  • 2025/02/08: Flux Schnell & Dev inference

Next steps

Installation

Set up MaxDiffusion on TPU or GPU

Quickstart

Generate your first image

Training guide

Learn how to train models

Deployment

Deploy at scale with XPK
