
Overview

RealNVP is a normalizing flow built from affine coupling transformations. Unlike autoregressive flows, which are sequential in one direction, RealNVP evaluates both the forward and inverse transformations in parallel, making it fast for both density evaluation (training) and sampling.
RealNVP is an alias for NICE with affine (scale-and-shift) coupling transformations; the two classes are otherwise equivalent.

Reference

Density estimation using Real NVP (Dinh et al., 2016)
https://arxiv.org/abs/1605.08803

Class Definition

zuko.flows.RealNVP(
    features: int,
    context: int = 0,
    transforms: int = 3,
    randmask: bool = False,
    **kwargs
)

Parameters

features
int
required
The number of features in the data. Must be at least 2 for coupling to work.
context
int
default:"0"
The number of context features for conditional density estimation.
transforms
int
default:"3"
The number of coupling transformations to stack.
randmask
bool
default:"False"
Whether random coupling masks are used. If False, alternating checkered masks are used.
**kwargs
dict
Additional keyword arguments passed to GeneralCouplingTransform:
  • hidden_features: List of hidden layer sizes (default: [64, 64])
  • activation: Activation function (default: ReLU)
  • univariate: Univariate transformation constructor (default: MonotonicAffineTransform)

Usage Example

import torch
import zuko

# Create an unconditional RealNVP
flow = zuko.flows.RealNVP(
    features=10,
    transforms=5,
    hidden_features=[128, 128]
)

# Sample from the flow (fast!)
dist = flow()
samples = dist.sample((1000,))
print(samples.shape)  # torch.Size([1000, 10])

# Compute log probabilities
log_prob = dist.log_prob(samples)
print(log_prob.shape)  # torch.Size([1000])

Conditional Flow

# Create a conditional RealNVP
flow = zuko.flows.RealNVP(
    features=5,
    context=3,
    transforms=4
)

# Sample conditioned on context
context = torch.randn(3)
dist = flow(context)
samples = dist.sample((100,))

Training Example

import torch.optim as optim

flow = zuko.flows.RealNVP(
    features=8,
    transforms=5,
    hidden_features=[256, 256]
)

optimizer = optim.Adam(flow.parameters(), lr=1e-3)

for epoch in range(100):
    for x in dataloader:
        optimizer.zero_grad()
        
        # Maximum likelihood training
        loss = -flow().log_prob(x).mean()
        
        loss.backward()
        optimizer.step()
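
The loop above assumes an existing `dataloader` yielding batches of shape (batch, 8). A minimal stand-in built from synthetic data (torch only; note that `TensorDataset` yields 1-tuples, so batches should be unpacked with `for (x,) in dataloader`):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in: 1024 samples with 8 features, matching features=8 above
data = torch.randn(1024, 8)
dataloader = DataLoader(TensorDataset(data), batch_size=64, shuffle=True)

# TensorDataset yields 1-tuples, so unpack each batch
(x,) = next(iter(dataloader))
print(x.shape)  # torch.Size([64, 8])
```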

Random Masks

# Use random masks instead of alternating
flow = zuko.flows.RealNVP(
    features=20,
    transforms=5,
    randmask=True  # Better mixing for structured data
)

Methods

forward(c=None)

Returns a normalizing flow distribution. Arguments:
  • c (Tensor, optional): Context tensor of shape (*, context)
Returns:
  • NormalizingFlow: A distribution with the following methods:
    • sample(shape): Sample from the distribution (fast, parallel)
    • log_prob(x): Compute log probability (fast, parallel)
    • rsample(shape): Reparameterized sampling
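
`rsample` draws reparameterized samples, so gradients flow from a downstream cost back to the distribution's parameters. The mechanism is the same as for any reparameterizable torch distribution; the sketch below illustrates it with a plain `Normal` (not a zuko flow) to keep the example self-contained:

```python
import torch
from torch.distributions import Normal

# Learnable parameters of a toy base distribution
loc = torch.zeros(3, requires_grad=True)
log_scale = torch.zeros(3, requires_grad=True)
dist = Normal(loc, log_scale.exp())

z = dist.rsample((128,))            # shape (128, 3), stays in the autograd graph
cost = (z ** 2).sum(-1).mean()      # example cost: pull samples toward the origin
cost.backward()

print(loc.grad is not None)         # True: gradients reached the parameters
```

With a zuko flow, the same pattern applies to `flow().rsample(...)`, which is what makes flows usable inside variational objectives.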

When to Use RealNVP

Good for:
  • Real-time generation and sampling
  • Large-scale datasets (fast training)
  • Applications requiring bidirectional speed
  • Image generation (with spatial coupling)
Consider alternatives if:
  • You need maximum expressivity (use NSF or NAF)
  • You have very complex, multimodal distributions
  • Features are highly correlated in complex ways

Tips

  1. More transformations: Since each layer only transforms half the features, use more transformations (5-10) compared to autoregressive flows.
  2. Random masks: Set randmask=True when your features have inherent structure or ordering.
  3. Deeper networks: Coupling layers have less capacity, so use larger hidden layers [256, 256] or [512, 512].
  4. Preprocessing: Combine with activation normalization or batch normalization for better performance.
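
The activation normalization mentioned in tip 4 amounts to a per-feature affine layer whose parameters are initialized from data so that the first batch comes out with zero mean and unit variance. A minimal, framework-free sketch of that data-dependent initialization (illustrative only, not zuko's implementation):

```python
import math

def actnorm_init(batch):
    """Per-feature (shift, scale) so that scale * (x + shift) standardizes the batch."""
    n = len(batch)
    dims = len(batch[0])
    shift, scale = [], []
    for d in range(dims):
        col = [row[d] for row in batch]
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / n
        shift.append(-mean)
        scale.append(1.0 / math.sqrt(var + 1e-6))  # eps avoids division by zero
    return shift, scale

batch = [[2.0, -1.0], [4.0, 3.0], [6.0, 1.0]]
shift, scale = actnorm_init(batch)
normalized = [[s * (x + t) for x, t, s in zip(row, shift, scale)] for row in batch]
```

After initialization, `shift` and `scale` are treated as ordinary learnable parameters.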

Architecture Details

RealNVP uses coupling transformations:
  • Base distribution: Diagonal Gaussian N(0, I)
  • Coupling mechanism: Split features into two groups using a binary mask
  • Transformation: First group unchanged, second group transformed based on first
  • Neural network: Standard MLP (no masking needed)
Each coupling layer computes:
# A binary mask splits the features into x_a and x_b
y_a = x_a                                # masked half passes through unchanged
y_b = x_b * exp(s(x_a, c)) + t(x_a, c)   # remaining half is scaled and shifted
where s and t are neural networks and c is optional context. Because the Jacobian is triangular, its log-determinant is simply the sum of the log-scales, sum(s(x_a, c)).
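
The affine coupling above can be written out directly. A plain-Python sketch with fixed, hypothetical scale and shift values (in RealNVP they would come from the networks s and t), showing that the inverse is exact and the log-determinant is just the sum of the log-scales:

```python
import math

def coupling_forward(x_a, x_b, s, t):
    y_a = list(x_a)                                                   # unchanged half
    y_b = [xb * math.exp(si) + ti for xb, si, ti in zip(x_b, s, t)]   # scale and shift
    log_det = sum(s)                                                  # log|det J| = sum of log-scales
    return y_a, y_b, log_det

def coupling_inverse(y_a, y_b, s, t):
    x_a = list(y_a)
    x_b = [(yb - ti) * math.exp(-si) for yb, si, ti in zip(y_b, s, t)]
    return x_a, x_b

# Hypothetical fixed values for illustration
x_a, x_b = [1.0, 2.0], [3.0, 4.0]
s, t = [0.5, -0.2], [1.0, 0.0]
y_a, y_b, log_det = coupling_forward(x_a, x_b, s, t)
rx_a, rx_b = coupling_inverse(y_a, y_b, s, t)  # recovers x_a, x_b
```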

Coupling vs. Autoregressive

Property          RealNVP (Coupling)    MAF (Autoregressive)
Forward pass      Parallel              Parallel
Inverse pass      Parallel              Sequential
Training speed    Fast                  Fast
Sampling speed    Fast                  Slow
Expressivity      Medium                Medium
Layers needed     More (5-10)           Fewer (3-5)
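
The "Sequential" inverse is the crucial difference: in an autoregressive transform, recovering x_i requires the previously recovered x_{i-1}. A toy additive autoregressive transform (hypothetical, much simpler than MAF) makes the dependency explicit:

```python
import math

def ar_forward(x):
    # y_i = x_i + tanh(x_{i-1}); all outputs computable in parallel given x
    return [x[i] + (math.tanh(x[i - 1]) if i > 0 else 0.0) for i in range(len(x))]

def ar_inverse(y):
    # x_i = y_i - tanh(x_{i-1}); each step needs the previous *recovered* x,
    # so the loop cannot be parallelized
    x = []
    for i in range(len(y)):
        prev = math.tanh(x[i - 1]) if i > 0 else 0.0
        x.append(y[i] - prev)
    return x

x = [0.5, -1.0, 2.0, 0.0]
y = ar_forward(x)
assert all(abs(a - b) < 1e-12 for a, b in zip(ar_inverse(y), x))
```

A coupling inverse, by contrast, is a single vectorized expression over each half of the features, which is why RealNVP samples as fast as it trains.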

Advanced Usage

Custom Masks

import torch

# Define custom coupling mask
mask = torch.tensor([0, 1, 0, 1, 0, 1], dtype=torch.bool)

from zuko.flows.coupling import GeneralCouplingTransform

transform = GeneralCouplingTransform(
    features=6,
    mask=mask
)

Multi-Scale Architecture

# For high-dimensional data, RealNVP is typically used in a multi-scale
# architecture: coupling blocks interleaved with squeeze operations, with
# half of the dimensions factored out at each scale

from zuko.flows import RealNVP
import torch.nn as nn

class MultiScaleRealNVP(nn.Module):
    def __init__(self, features):
        super().__init__()
        self.flow1 = RealNVP(features, transforms=3)
        # Factor out half of the dimensions here (model them directly with
        # the base distribution), then continue on the remaining half
        self.flow2 = RealNVP(features // 2, transforms=3)

Image Modeling

For image data, RealNVP traditionally alternates spatial coupling patterns (checkerboard and channel-wise masks). With flattened pixels, the default alternating mask plays a similar role:

H, W, C = 28, 28, 1  # MNIST dimensions
features = H * W * C

flow = zuko.flows.RealNVP(
    features=features,
    transforms=8,
    hidden_features=[1024, 1024],
    randmask=False  # Use structured masks for spatial data
)
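
Before fitting a flow to images, RealNVP-style models typically dequantize the discrete pixel values and map them into an unbounded space with a logit transform, so the flow never has to place mass outside a bounded interval. A hedged, torch-only sketch of that preprocessing (the alpha value is a common but arbitrary choice):

```python
import torch

def preprocess(x, alpha=0.05):
    """Dequantize 8-bit pixels in [0, 255] and map them to logit space."""
    x = (x + torch.rand_like(x)) / 256.0    # uniform dequantization -> (0, 1)
    x = alpha + (1 - 2 * alpha) * x         # squeeze into (alpha, 1 - alpha)
    return torch.log(x) - torch.log1p(-x)   # logit transform -> unbounded values

pixels = torch.randint(0, 256, (4, 28 * 28)).float()
z = preprocess(pixels)
print(z.shape)  # torch.Size([4, 784])
```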
See Also

  • NICE - Predecessor with additive coupling transformations
  • MAF - Autoregressive alternative
  • NSF - More expressive but slower sampling
