Overview
Unconstrained Neural Autoregressive Flow (UNAF) uses unconstrained monotonic neural networks (UMNN) based on integration rather than positive weights. This allows for more flexible monotonic transformations without architectural constraints.

Reference

Unconstrained Monotonic Neural Networks (Wehenkel et al., 2019): https://arxiv.org/abs/1908.05164
Class Definition
Parameters
- `features` (int): The number of features in the data.
- `context` (int): The number of context features for conditional density estimation.
- `transforms` (int): The number of autoregressive transformations to stack.
- `randperm` (bool): Whether features are randomly permuted between transformations. If `False`, features alternate between ascending and descending order.
- `signal` (int): The number of signal features for the integrand neural network.
- `network` (dict): Keyword arguments passed to the UMNN (unconstrained monotonic neural network) constructor:
  - `hidden_features`: hidden layer sizes for the integrand network
  - `activation`: activation function (default: `ELU`)
- `**kwargs`: Additional keyword arguments passed to `MaskedAutoregressiveTransform`:
  - `hidden_features`: hidden layer sizes for the autoregressive network
  - `activation`: activation function
Usage Example
Conditional Flow
Training Example
Methods
forward(c=None)
Returns a normalizing flow distribution.
Arguments:
`c` (Tensor, optional): Context tensor of shape `(*, context)`
Returns:

NormalizingFlow: A distribution with:

- `sample(shape)`: Sample from the distribution
- `log_prob(x)`: Compute log probability of samples
- `rsample(shape)`: Reparameterized sampling
When to Use UNAF
Good for:
- Maximum flexibility in monotonic transformations
- Complex, highly nonlinear distributions
- Research and experimentation
- When architectural constraints are limiting
Avoid when:

- You need fast training (use NAF or NSF)
- You need fast sampling (use RealNVP)
- Your data is outside `[-10, 10]` and can't be standardized
- You want simpler, more interpretable models (use MAF or NSF)
Tips
- Standardize your data: UNAF requires features in `[-10, 10]`; always normalize inputs.
- Use ELU activation: UNAF uses ELU activation by default, which works well with integration.
- Tune the signal dimension: start with `signal=16`; increase for more complex distributions.
- Be patient: UNAF can be slow to train due to integration, but is very expressive.
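Standardizing to zero mean and unit variance, so that features land well inside `[-10, 10]`, can be done in plain PyTorch:

```python
import torch

x = torch.randn(1000, 5) * 7 + 3  # raw data, possibly outside [-10, 10]
mean, std = x.mean(0), x.std(0)
x_std = (x - mean) / std          # now roughly within a few units of 0
```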
Architecture Details
UNAF uses integration to create monotonic transformations:

- Base distribution: Diagonal Gaussian `N(0, I)`
- Transformation: Integration of positive integrand functions
- Signal network: Masked MLP predicts signal vectors and constants
- Integrand network: Unconstrained MLP whose integral defines the transformation
- Softclip layers: Inserted between transformations

Each feature is transformed as

    y_i = integral from 0 to x_i of g(t, signal_i) dt + constant_i

where g is the integrand network (kept strictly positive via an exp transformation) and signal_i, constant_i are predicted autoregressively.
Unconstrained Monotonic Networks
Key differences from NAF:

- No weight constraints: weights can take any value
- Integration-based: monotonicity comes from integration, not positive weights
- Integrand function: models `dy/dx` instead of `y` directly
- Numerical inversion: the integral equation is solved numerically (e.g. by root finding)
UNAF vs NAF
| Property | UNAF | NAF |
|---|---|---|
| Weights | Unconstrained | Positive only |
| Method | Integration | Direct evaluation |
| Flexibility | Higher | Lower |
| Training speed | Slower | Faster |
| Inversion | Numerical | Numerical |
| Stability | Good | Good |
Integration Details
UNAF uses numerical integration:

- Forward pass: integrate the integrand from 0 to x
- Inverse pass: root finding to solve the integral equation
- Log determinant: computed from integrand evaluations

The integrand values g(t) are kept in the range [1e-3, 1e3] for numerical stability.
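The forward pass above can be sketched in plain PyTorch with a toy trapezoidal rule (the actual implementation may use a more accurate quadrature scheme and clamping):

```python
import torch
import torch.nn as nn

def monotone_transform(net, x, steps=65):
    """Map x to y = integral_0^x g(t) dt with g = exp(net) > 0."""
    t = torch.linspace(0.0, 1.0, steps)                 # quadrature nodes in [0, 1]
    grid = x.unsqueeze(-1) * t                          # rescaled to [0, x] per element
    g = torch.exp(net(grid.unsqueeze(-1))).squeeze(-1)  # strictly positive integrand
    return torch.trapz(g, grid, dim=-1)                 # monotone in x since g > 0

# toy unconstrained integrand network
net = nn.Sequential(nn.Linear(1, 16), nn.ELU(), nn.Linear(16, 1))
y = monotone_transform(net, torch.tensor([-1.0, 0.0, 1.0]))
# y is strictly increasing and y[1] == 0
```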
Advanced Usage
Custom Integrand Network
High-Dimensional Data
Manual Construction
Computational Considerations
UNAF is computationally intensive:

- Training: slower than NAF due to integration
- Memory: higher due to integration state
- Inversion: requires numerical root finding
- Gradients: computed via the adjoint method

To reduce cost:

- Use smaller networks for the integrand
- Reduce the signal dimension
- Use coupling layers for high-dimensional data
- Adjust integration tolerances
Comparison with Other Flows
| Property | UNAF | NAF | NSF | MAF |
|---|---|---|---|---|
| Expressivity | Very high | Very high | High | Medium |
| Flexibility | Highest | High | Medium | Low |
| Training speed | Very slow | Slow | Medium | Fast |
| Sampling speed | Slow | Slow | Slow | Slow |
| Implementation | Complex | Medium | Medium | Simple |
Research Applications
UNAF is particularly useful for:

- Research: exploring the limits of flow expressivity
- Benchmarking: Comparing against other architectures
- Complex distributions: Multi-modal, long-tailed, irregular
- Ablation studies: Understanding monotonic transformations
Related
- NAF - Constrained monotonic neural networks
- NSF - Spline-based alternative
- UnconstrainedMonotonicTransform - The underlying transformation
- MAF - Simpler baseline
