Quickstart
This guide walks you through generating your first image with MaxDiffusion, using Stable Diffusion XL on a Cloud TPU.
Prerequisites
- Cloud TPU VM (v4-8, v5p-8, or v6e-8 recommended)
- MaxDiffusion installed (see Installation)
Generate your first image
Activate your environment
Before running anything, make sure MaxDiffusion is installed and your virtual environment is activated.
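One way to confirm both conditions from Python is sketched below. It is a minimal check, not part of MaxDiffusion: the import name `maxdiffusion` is an assumption about how the package is installed, and the virtual environment test only detects `venv`/`virtualenv` environments (a conda environment will report False).

```python
import importlib.util
import sys


def in_virtual_env() -> bool:
    """Return True when the interpreter runs inside a venv/virtualenv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def package_installed(name: str) -> bool:
    """Return True when a top-level package can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "maxdiffusion" is an assumed import name; adjust it if your install differs.
    print("virtual environment active:", in_virtual_env())
    print("maxdiffusion importable:", package_installed("maxdiffusion"))
```

If either line prints False, revisit the Installation guide before continuing.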
Run SDXL inference
Generate an image with Stable Diffusion XL. The generated image will be saved to /tmp/my_first_image/.
Try different models
- Stable Diffusion 2.1
- Flux Schnell
- Flux Dev
Fast inference with SD 2.1:
Customize generation parameters
Control the output by modifying these key parameters:
Prompt and guidance
- prompt: Main text description of the image
- negative_prompt: What to avoid in the image
- guidance_scale: How closely to follow the prompt (7.0-15.0, higher = more adherence)
Quality settings
- num_inference_steps: More steps = higher quality but slower (20-50)
- resolution: Output image size (512, or 1024 for SDXL)
Performance settings
- per_device_batch_size: Generate multiple images in parallel
- attention: Use flash for faster inference on TPU
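The parameters above can be collected in one place and turned into command-line overrides. The sketch below assumes that settings are passed as `key=value` arguments appended after the config file path; check your MaxDiffusion version for the exact entry point and config file, which are deliberately left out here. The parameter names come from the lists above, and the values are purely illustrative.

```python
import shlex


def format_overrides(params: dict) -> list[str]:
    """Turn a parameter dict into key=value strings, one per setting."""
    return [f"{key}={value}" for key, value in params.items()]


# Names are taken from the parameter lists; values are illustrative only.
params = {
    "prompt": "a watercolor painting of a lighthouse at dawn",
    "negative_prompt": "blurry, low quality",
    "guidance_scale": 9.0,
    "num_inference_steps": 30,
    "resolution": 1024,
    "per_device_batch_size": 2,
    "attention": "flash",
}

overrides = format_overrides(params)
# shlex.join quotes values that contain spaces, so the result can be pasted into a shell.
print(shlex.join(overrides))
```

Keeping the settings in a dictionary makes it easy to vary one value at a time (for example, sweeping guidance_scale) while leaving the rest fixed.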
Performance benchmarks
Expected generation times on different hardware:

| Model | Hardware | Steps | Batch Size | Time |
|---|---|---|---|---|
| Flux Schnell | v6e-4 | 4 | 4 | 0.8s |
| Flux Dev | v6e-4 | 28 | 4 | 5.5s |
| Flux Schnell | v4-8 | 4 | 4 | 2.2s |
| Flux Dev | v4-8 | 28 | 4 | 23s |
| SDXL | v5p-8 | 20 | 2 | ~15s |
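To compare rows with different batch sizes, it can help to convert each entry to time per image. The sketch below does that arithmetic on the table values, under the assumption that the Time column covers the whole batch and that Batch Size is the total number of images per run; the table itself does not state this, so treat the results as rough.

```python
# (model, hardware, steps, batch_size, seconds) copied from the table above.
# The SDXL time is approximate (~15s) in the source.
benchmarks = [
    ("Flux Schnell", "v6e-4", 4, 4, 0.8),
    ("Flux Dev", "v6e-4", 28, 4, 5.5),
    ("Flux Schnell", "v4-8", 4, 4, 2.2),
    ("Flux Dev", "v4-8", 28, 4, 23.0),
    ("SDXL", "v5p-8", 20, 2, 15.0),
]


def seconds_per_image(batch_size: int, seconds: float) -> float:
    """Assumes `seconds` is the time for the whole batch of `batch_size` images."""
    return seconds / batch_size


for model, hardware, steps, batch_size, seconds in benchmarks:
    per_image = seconds_per_image(batch_size, seconds)
    print(f"{model:<13}{hardware:<7}{per_image:6.2f} s/image")
```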
Use with LoRA adapters
Load LoRA adapters for style transfer. Learn more about LoRA adapters in the LoRA guide.
Common issues
Out of memory
Reduce the batch size or resolution. Alternatively, enable gradient checkpointing and offloading in the config file.
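As a rough guide to how much each change helps, the sketch below compares the total number of output pixels for a candidate setting against a baseline. Pixel count is only a proxy for memory use, not a measurement, and the function name and baseline values are illustrative assumptions; square images are assumed.

```python
def relative_pixel_load(batch_size: int, resolution: int,
                        base_batch_size: int, base_resolution: int) -> float:
    """Ratio of total output pixels versus a baseline setting (square images assumed)."""
    pixels = batch_size * resolution ** 2
    base_pixels = base_batch_size * base_resolution ** 2
    return pixels / base_pixels


# Baseline: a batch of 2 at 1024x1024 (illustrative values, not a measured config).
print(relative_pixel_load(1, 1024, 2, 1024))  # halve the batch size
print(relative_pixel_load(2, 512, 2, 1024))   # halve the resolution
```

Because the pixel count grows with the square of the resolution, halving the resolution reduces it more than halving the batch size does.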
Model download fails
Set up Hugging Face authentication. Some models (like Flux) require accepting license terms on Hugging Face before they can be downloaded.
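Before retrying a download, it may help to check whether a token is visible to the process at all. The sketch below looks in two places that the `huggingface_hub` library is understood to consult by default: the `HF_TOKEN` environment variable and a `token` file under `HF_HOME` (assumed here to default to `~/.cache/huggingface`). The helper name is hypothetical, and the file location should be verified against your installed `huggingface_hub` version.

```python
import os
from pathlib import Path


def find_hf_token_source(env=None, home=None):
    """Return a description of where a Hugging Face token was found, or None."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    if env.get("HF_TOKEN"):
        return "HF_TOKEN environment variable"
    # Assumed default cache location; HF_HOME overrides it when set.
    hf_home = Path(env.get("HF_HOME", home / ".cache" / "huggingface"))
    token_file = hf_home / "token"
    if token_file.is_file():
        return str(token_file)
    return None


if __name__ == "__main__":
    source = find_hf_token_source()
    print("token found in:", source if source else "nowhere (log in first)")
```

A token alone is not sufficient for gated models: the license terms still have to be accepted on the model page with the same account.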
Slow first run
The first run includes:
- Model weight download (~5-15GB)
- JAX compilation (1-3 minutes)
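The compilation cost can often be paid only once by enabling JAX's persistent compilation cache. The sketch below sets the `JAX_COMPILATION_CACHE_DIR` environment variable, a general JAX setting, before JAX is imported. The helper name and the cache path are illustrative assumptions, and MaxDiffusion may expose its own config option for this, so check its config file first.

```python
import os
import tempfile


def enable_jax_compilation_cache(cache_dir: str) -> str:
    """Create the cache directory and point JAX at it via its environment variable.

    Call this before `import jax`, since the setting is read when JAX starts up.
    An existing value of the variable is left untouched.
    """
    os.makedirs(cache_dir, exist_ok=True)
    os.environ.setdefault("JAX_COMPILATION_CACHE_DIR", cache_dir)
    return os.environ["JAX_COMPILATION_CACHE_DIR"]


# Illustrative location; prefer a directory that survives reboots on a real TPU VM.
cache_dir = os.path.join(tempfile.gettempdir(), "jax_cache")
print("JAX compilation cache:", enable_jax_compilation_cache(cache_dir))
```

With the cache in place, later runs using the same model, shapes, and JAX version can reuse compiled programs instead of recompiling them.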
Next steps
- Training guide: Fine-tune models on your own data
- LoRA adapters: Use LoRA for style transfer
- Video generation: Generate videos with Wan models
- Scale to multi-host: Deploy at scale with XPK