
Overview

This quickstart guide will show you how to:
  1. Create a normalizing flow
  2. Train it on data using maximum likelihood
  3. Generate samples from the learned distribution
We’ll build a Neural Spline Flow (NSF) to model a conditional distribution p(x | c).

Basic Example

1. Import dependencies

First, import the necessary libraries:
import torch
import zuko
That’s it! Zuko is designed to be minimal and intuitive.

2. Create a normalizing flow

Create a Neural Spline Flow with 3 sample features and 5 context features:
# NSF with 3 sample features and 5 context features
flow = zuko.flows.NSF(
    features=3,           # Dimensionality of x
    context=5,            # Dimensionality of context c
    transforms=3,         # Number of transformation layers
    hidden_features=[128] * 3  # Hidden layer sizes
)
The NSF class implements a Neural Spline Flow with monotonic rational-quadratic spline transforms, an expressive architecture that performs well on many density-estimation tasks.
The flow is a torch.nn.Module, so you can inspect its parameters:
print(f"Number of parameters: {sum(p.numel() for p in flow.parameters())}")

3. Prepare your data

For this example, let’s create a simple synthetic dataset:
# Generate synthetic training data
# x: samples (batch_size, 3)
# c: context (batch_size, 5)

def generate_data(batch_size=256):
    c = torch.randn(batch_size, 5)
    # Create x that depends on c (for demonstration)
    x = torch.randn(batch_size, 3) + c[:, :3] * 0.5
    return x, c
In real applications, x and c would be your actual data samples and conditioning variables.
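With real data, a common pattern is to wrap your tensors in a DataLoader and iterate over mini-batches. A minimal sketch using standard PyTorch utilities (the tensor shapes mirror the synthetic data above):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Stand-ins for your real samples and conditioning variables.
x = torch.randn(1024, 3)  # samples
c = torch.randn(1024, 5)  # context

dataset = TensorDataset(x, c)
loader = DataLoader(dataset, batch_size=256, shuffle=True)

for x_batch, c_batch in loader:
    # Each batch has shapes (256, 3) and (256, 5),
    # ready to be passed to a training loop.
    pass
```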

4. Train the flow

Train the flow by maximizing the log-likelihood of the data:
# Set up optimizer
optimizer = torch.optim.Adam(flow.parameters(), lr=1e-3)

# Training loop
epochs = 100
for epoch in range(epochs):
    # Generate a batch of training data
    x, c = generate_data(batch_size=256)
    
    # Compute negative log-likelihood
    loss = -flow(c).log_prob(x)  # -log p(x | c)
    loss = loss.mean()
    
    # Optimization step
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
    # Print progress
    if (epoch + 1) % 10 == 0:
        print(f"Epoch {epoch + 1}/{epochs}, Loss: {loss.item():.4f}")
The key line is flow(c).log_prob(x): passing context c to the flow returns a distribution p(x | c), which we can then evaluate at points x.
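The object returned by flow(c) follows the familiar torch.distributions interface. As a point of comparison (plain PyTorch, not zuko code), a diagonal Gaussian exposes the same log_prob and sample methods:

```python
import torch
from torch.distributions import Independent, Normal

# A diagonal Gaussian over 3 features, standing in for the
# conditional distribution returned by flow(c).
dist = Independent(Normal(torch.zeros(3), torch.ones(3)), 1)

x = torch.randn(256, 3)
log_prob = dist.log_prob(x)   # one log-density per row: shape (256,)

samples = dist.sample((64,))  # 64 draws: shape (64, 3)
```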

5. Generate samples

Once trained, you can sample from the learned distribution:
# Create a specific context (a single vector of 5 features)
c_star = torch.randn(5)

# Sample 64 points from p(x | c*)
with torch.no_grad():
    samples = flow(c_star).sample((64,))

print(f"Generated samples shape: {samples.shape}")  # [64, 3]
You can also evaluate the log-probability of new data:
# Evaluate log-probability of new points
x_new = torch.randn(10, 3)
c_new = torch.randn(10, 5)

with torch.no_grad():
    log_prob = flow(c_new).log_prob(x_new)

print(f"Log probabilities: {log_prob}")  # [10]

Complete Working Example

Here’s a complete, runnable script that ties everything together:
import torch
import zuko

# Create the flow
flow = zuko.flows.NSF(
    features=3,
    context=5,
    transforms=3,
    hidden_features=[128] * 3
)

# Move to GPU if available
device = 'cuda' if torch.cuda.is_available() else 'cpu'
flow = flow.to(device)

# Training function
def generate_data(batch_size=256):
    c = torch.randn(batch_size, 5, device=device)
    x = torch.randn(batch_size, 3, device=device) + c[:, :3] * 0.5
    return x, c

# Optimizer
optimizer = torch.optim.Adam(flow.parameters(), lr=1e-3)

# Training loop
print("Training flow...")
for epoch in range(100):
    x, c = generate_data()
    
    loss = -flow(c).log_prob(x).mean()
    
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
    if (epoch + 1) % 10 == 0:
        print(f"Epoch {epoch + 1}, Loss: {loss.item():.4f}")

# Sampling
print("\nGenerating samples...")
c_star = torch.randn(5, device=device)
with torch.no_grad():
    samples = flow(c_star).sample((64,))
print(f"Generated {samples.shape[0]} samples of dimension {samples.shape[1]}")

Other Flow Architectures

Zuko provides many pre-built flow architectures that share the same interface. A popular alternative to NSF is the Masked Autoregressive Flow (MAF), which evaluates log-probabilities in a single pass, making it well suited to density estimation (sampling, in contrast, requires one pass per feature):
# MAF: fast log_prob, sequential sampling
flow = zuko.flows.MAF(
    features=3,
    context=5,
    transforms=5,
    hidden_features=[128] * 3
)
All flows follow the same API: create the flow, train by maximizing log_prob, and sample from the learned distribution.
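Because the interface is shared, the training step can be written once and reused with any architecture. The sketch below factors the maximum-likelihood step into a function; the ToyFlow class is a stand-in (a learnable Gaussian that ignores its context), used only so the example runs without a trained zuko flow:

```python
import torch
import torch.nn as nn

def train_step(flow, optimizer, x, c):
    """One maximum-likelihood step for any module where
    flow(c) returns a distribution with log_prob."""
    loss = -flow(c).log_prob(x).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Stand-in flow: a learnable diagonal Gaussian that ignores c.
class ToyFlow(nn.Module):
    def __init__(self, features=3):
        super().__init__()
        self.loc = nn.Parameter(torch.zeros(features))
        self.log_scale = nn.Parameter(torch.zeros(features))

    def forward(self, c):
        return torch.distributions.Independent(
            torch.distributions.Normal(self.loc, self.log_scale.exp()), 1
        )

flow = ToyFlow()
optimizer = torch.optim.Adam(flow.parameters(), lr=5e-2)
x, c = torch.randn(256, 3) + 2.0, torch.randn(256, 5)
losses = [train_step(flow, optimizer, x, c) for _ in range(200)]
```

The same train_step works unchanged with NSF, MAF, or any other flow exposing this interface.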

Building Custom Flows

For more control, you can build custom flows using the Flow class:
from zuko.flows import Flow, UnconditionalDistribution, UnconditionalTransform
from zuko.flows.autoregressive import MaskedAutoregressiveTransform
from zuko.distributions import DiagNormal
from zuko.transforms import RotationTransform

# Custom flow with specific architecture
flow = Flow(
    transform=[
        # First autoregressive transform
        MaskedAutoregressiveTransform(
            features=3,
            context=5,
            hidden_features=(64, 64)
        ),
        # Rotation layer (unconditional)
        UnconditionalTransform(
            RotationTransform,
            torch.randn(3, 3)
        ),
        # Second autoregressive transform
        MaskedAutoregressiveTransform(
            features=3,
            context=5,
            hidden_features=(64, 64)
        ),
    ],
    # Gaussian base distribution
    base=UnconditionalDistribution(
        DiagNormal,
        loc=torch.zeros(3),
        scale=torch.ones(3),
        buffer=True,
    ),
)
This gives you complete control over:
  • The sequence of transformations
  • The base distribution
  • Which layers are conditional vs unconditional
  • Architecture details of each transformation
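The base-plus-transforms recipe can also be illustrated with torch.distributions alone: a flow is a base distribution pushed through invertible maps, with densities computed via the change-of-variables formula. A toy, unconditional sketch, using a fixed affine map in place of learned spline or autoregressive transforms:

```python
import torch
from torch.distributions import Independent, Normal, TransformedDistribution
from torch.distributions.transforms import AffineTransform

# Base: standard diagonal Gaussian over 3 features.
base = Independent(Normal(torch.zeros(3), torch.ones(3)), 1)

# Push it through the invertible map x = 2 * z + 1.
flow_like = TransformedDistribution(base, [AffineTransform(loc=1.0, scale=2.0)])

x = flow_like.sample((64,))       # sample: draw z, then apply the transform
log_prob = flow_like.log_prob(x)  # density: invert and add the log-det-Jacobian
```

Zuko's Flow class generalizes this recipe with learnable, optionally context-dependent transforms.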

Key Takeaways

  • Simple API: create flows with a single line and train with standard PyTorch patterns
  • Conditional Modeling: built-in support for p(x | c); just pass context to the flow
  • Multiple Architectures: 12+ pre-built flows from recent research papers
  • Full Control: build custom flows with the Flow class for maximum flexibility

What’s Next?

Now that you’ve built your first flow, explore more advanced topics:

  • Flow Architectures: learn about different flow architectures and when to use them
  • API Reference: detailed documentation of all flows and their parameters
  • Examples: real-world examples and tutorials
  • Custom Flows: build your own flow architectures
