Dedalus uses HDF5 (Hierarchical Data Format 5) for efficient storage of simulation data. The framework provides flexible file handlers for saving field data, scalars, and analysis tasks during simulation runs.

File Handlers

File handlers manage when and how simulation data is written to disk. They are added to the solver’s evaluator and triggered based on time or iteration criteria.

Creating a File Handler

import dedalus.public as d3
import numpy as np

# After building your solver...
solver = problem.build_solver(d3.RK222)

# Add a file handler
snapshots = solver.evaluator.add_file_handler(
    'snapshots',           # Base path
    sim_dt=0.25,          # Output every 0.25 simulation time units
    max_writes=50         # Maximum writes per file
)

# Add tasks to save
snapshots.add_task(u, name='velocity')
snapshots.add_task(p, name='pressure')
snapshots.add_task(d3.curl(u), name='vorticity')

Scheduling Options

File handlers support three scheduling modes, selected by keyword:
# Output every 0.1 simulation time units
handler = solver.evaluator.add_file_handler(
    'output',
    sim_dt=0.1
)

# Output every 100 iterations
handler = solver.evaluator.add_file_handler(
    'output',
    iter=100
)

# Output every hour of wall-clock time
handler = solver.evaluator.add_file_handler(
    'output',
    wall_dt=3600
)
sim_dt triggers on simulation time (useful for maintaining uniform temporal resolution), iter triggers on iteration count, and wall_dt triggers on elapsed wall-clock time in seconds (useful for checkpointing before a job's time limit).

Adding Tasks

Tasks define what data to save:
import dedalus.public as d3
import numpy as np

# Basic field output
handler.add_task(u, name='velocity')

# Derived quantities
handler.add_task(d3.div(u), name='divergence')
handler.add_task(-d3.div(d3.skew(u)), name='vorticity')
handler.add_task(np.sqrt(u@u), name='speed')

# Scalar diagnostics
handler.add_task(d3.integ(0.5*u@u), name='kinetic_energy')
handler.add_task(d3.integ(p), name='mean_pressure')

# Specify layout and scales
handler.add_task(u, name='velocity_grid', layout='g', scales=2)  # Grid space, 2x resolution
handler.add_task(u, name='velocity_coeff', layout='c')  # Coefficient space
By default, tasks are output in grid space (layout='g') at native resolution (scales=1).

Parallel I/O Modes

Dedalus provides three parallel I/O strategies:

gather

Serial write: all processes send data to rank 0, which writes a single HDF5 file.
  • Simple structure
  • Can be slow for large parallel runs
  • Default for serial runs
handler = solver.evaluator.add_file_handler(
    'output',
    sim_dt=0.1,
    parallel='gather'
)

virtual

Virtual dataset: each process writes its local data to a separate file, and an HDF5 virtual dataset provides a unified view.
  • Fast parallel writes
  • Files can be merged post-simulation
  • Default for parallel runs
  • Recommended
handler = solver.evaluator.add_file_handler(
    'output',
    sim_dt=0.1,
    parallel='virtual'
)

mpio

MPI-IO: uses collective MPI-IO for parallel writes to a single file.
  • Requires parallel HDF5
  • Performance varies by filesystem
  • Best on some HPC systems
handler = solver.evaluator.add_file_handler(
    'output',
    sim_dt=0.1,
    parallel='mpio'
)

HDF5 File Structure

Dedalus HDF5 files have a standardized structure:
snapshots_s1.h5
├── scales/
│   ├── sim_time          # Simulation time for each write
│   ├── wall_time         # Wall time for each write
│   ├── iteration         # Iteration number for each write
│   ├── timestep          # Timestep for each write
│   ├── write_number      # Global write number
│   ├── x/                # Coordinate grids
│   │   └── 1.0           # Grid at scale=1.0
│   ├── y/
│   │   └── 1.0
│   └── z/
│       └── 1.0
├── tasks/
│   ├── velocity          # Shape: (writes, [components,] Nx, Ny, Nz)
│   ├── pressure          # Shape: (writes, Nx, Ny, Nz)
│   └── vorticity
└── (attributes)
    ├── writes            # Number of writes in file
    ├── set_number        # Set number
    └── version           # Dedalus version
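Given this layout, the contents of an output file can be enumerated generically with h5py's visititems (a sketch; the summarize helper is illustrative, not part of Dedalus):

```python
import h5py

def summarize(path):
    """Return (name, shape) for every dataset in an HDF5 output file."""
    entries = []
    def visit(name, obj):
        if isinstance(obj, h5py.Dataset):
            entries.append((name, obj.shape))
    with h5py.File(path, 'r') as f:
        f.visititems(visit)
    return sorted(entries)

# summarize('snapshots/snapshots_s1.h5') lists the scales/* and tasks/* datasets
```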

Multiple Sets

When max_writes is reached, a new set is created:
snapshots/
├── snapshots_s1.h5       # First 50 writes
├── snapshots_s2.h5       # Next 50 writes
├── snapshots_s3.h5       # Next 50 writes
└── ...
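The mapping from a global write number to its set file follows directly from max_writes. A minimal sketch, assuming 1-based write numbers and 50 writes per set (locate_write is an illustrative helper, not a Dedalus function):

```python
def locate_write(write_number, max_writes=50, base='snapshots'):
    """Map a 1-based global write number to (set filename, index within set)."""
    set_number = (write_number - 1) // max_writes + 1
    local_index = (write_number - 1) % max_writes
    return f'{base}/{base}_s{set_number}.h5', local_index

print(locate_write(50))  # → ('snapshots/snapshots_s1.h5', 49)
print(locate_write(51))  # → ('snapshots/snapshots_s2.h5', 0)
```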

Reading Output Data

Dedalus output can be read using standard HDF5 tools or the Dedalus post-processing utilities.

Using h5py Directly

import h5py
import numpy as np
import matplotlib.pyplot as plt

# Open file
with h5py.File('snapshots/snapshots_s1.h5', 'r') as f:
    # Read time data
    sim_time = f['scales/sim_time'][:]
    
    # Read field data
    velocity = f['tasks/velocity'][:]  # Shape: (writes, 3, Nx, Ny, Nz)
    pressure = f['tasks/pressure'][:]  # Shape: (writes, Nx, Ny, Nz)
    
    # Read coordinate grids
    x = f['scales/x/1.0'][:]
    y = f['scales/y/1.0'][:]
    z = f['scales/z/1.0'][:]
    
    # Plot final snapshot
    plt.contourf(x, z, pressure[-1, :, 0, :].T)
    plt.colorbar(label='Pressure')
    plt.xlabel('x')
    plt.ylabel('z')
    plt.show()
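The slicing above leans on the (writes, Nx, Ny, Nz) dataset layout. A synthetic numpy check of that indexing convention (the grid sizes are illustrative, not from a real run):

```python
import numpy as np

writes, Nx, Ny, Nz = 4, 8, 6, 10
pressure = np.zeros((writes, Nx, Ny, Nz))  # stand-in for f['tasks/pressure'][:]

# Final write at y-index 0: an x-z slice of shape (Nx, Nz)
final_slice = pressure[-1, :, 0, :]
print(final_slice.shape)    # → (8, 10)

# contourf(x, z, C) expects C with shape (len(z), len(x)), hence the .T
print(final_slice.T.shape)  # → (10, 8)
```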

Using xarray

Dedalus provides an xarray backend for convenient data analysis:
import xarray as xr
import numpy as np

# Open dataset
ds = xr.open_dataset('snapshots/snapshots_s1.h5', engine='dedalus')

# Examine structure
print(ds)

# Access data with labeled dimensions
velocity = ds['velocity']  # DataArray with coordinates
time = ds['sim_time']

# Select specific time
velocity_t10 = velocity.sel(sim_time=10.0, method='nearest')

# Slice in space
velocity_slice = velocity.isel(y=0)  # x-z slice at y=0

# Compute derived quantities
speed = np.sqrt((velocity**2).sum(dim='component'))

# Time average
velocity_mean = velocity.mean(dim='sim_time')

# Plot with automatic labeling
velocity_slice.isel(component=0, sim_time=-1).plot()
The xarray interface provides labeled arrays, automatic interpolation, and seamless integration with matplotlib and other analysis tools.

Merging Multiple Sets

Merge multiple sets into a continuous dataset:
import xarray as xr
from glob import glob

# Open all sets
files = sorted(glob('snapshots/snapshots_s*.h5'))
ds = xr.open_mfdataset(files, engine='dedalus', combine='by_coords')

# Now ds contains all writes from all sets
velocity = ds['velocity']  # Full time series

Post-Processing Virtual Files

After a parallel simulation with parallel='virtual', merge the distributed files:

Using Python

from dedalus.tools import post

# Merge virtual files into single HDF5 files
post.merge_virtual_analysis('snapshots', cleanup=False)

# Or merge with cleanup (removes distributed files)
post.merge_virtual_analysis('snapshots', cleanup=True)

Using Command Line

# Merge virtual files
python3 -m dedalus merge snapshots

# Merge specific sets
python3 -m dedalus merge snapshots/snapshots_s1.h5

# Parallel merging with MPI
mpiexec -n 4 python3 -m dedalus merge snapshots
Do not delete the distributed files (in snapshots_s1/ directories) until after merging!

Example: Complete Analysis Workflow

import numpy as np
import dedalus.public as d3
import xarray as xr
import matplotlib.pyplot as plt

# === During simulation ===
solver = problem.build_solver(d3.RK222)

# Field snapshots
snapshots = solver.evaluator.add_file_handler(
    'snapshots',
    sim_dt=0.25,
    max_writes=50,
    parallel='virtual'
)
snapshots.add_task(u, name='velocity')
snapshots.add_task(b, name='buoyancy')

# Scalar diagnostics
scalars = solver.evaluator.add_file_handler(
    'scalars',
    sim_dt=0.01,
    max_writes=1000
)
scalars.add_task(d3.integ(0.5*u@u), name='kinetic_energy')
scalars.add_task(d3.integ(0.5*b**2), name='potential_energy')

# Run simulation...

# === After simulation ===
# Merge virtual files
from dedalus.tools import post
post.merge_virtual_analysis('snapshots')

# Load and analyze
ds_snap = xr.open_mfdataset('snapshots/*.h5', engine='dedalus')
ds_scalar = xr.open_mfdataset('scalars/*.h5', engine='dedalus')

# Plot energy evolution
fig, ax = plt.subplots()
ax.plot(ds_scalar.sim_time, ds_scalar.kinetic_energy, label='Kinetic')
ax.plot(ds_scalar.sim_time, ds_scalar.potential_energy, label='Potential')
ax.set_xlabel('Time')
ax.set_ylabel('Energy')
ax.legend()
plt.savefig('energy.png')

# Create movie of buoyancy field
from matplotlib.animation import FuncAnimation

fig, ax = plt.subplots()
buoyancy = ds_snap['buoyancy'].isel(y=0)  # x-z slice

def animate(i):
    ax.clear()
    buoyancy.isel(sim_time=i).plot(ax=ax)
    ax.set_title(f"t = {float(buoyancy.sim_time[i]):.2f}")

anim = FuncAnimation(fig, animate, frames=len(buoyancy.sim_time), interval=100)
anim.save('buoyancy_evolution.mp4')

Configuration Options

Configure output behavior in dedalus.cfg:
[analysis]
# Default filehandler mode (overwrite, append)
FILEHANDLER_MODE_DEFAULT = overwrite

# Default filehandler parallel output method (gather, virtual, mpio)
FILEHANDLER_PARALLEL_DEFAULT = virtual

# Force filehandlers to touch a tmp file on each node
# Works around NFS caching issues
FILEHANDLER_TOUCH_TMPFILE = False
Or set at runtime:
from dedalus.tools.config import config
config['analysis']['FILEHANDLER_MODE_DEFAULT'] = 'append'
config['analysis']['FILEHANDLER_PARALLEL_DEFAULT'] = 'mpio'

Best Practices

Use Virtual I/O

For parallel runs, use parallel='virtual' for fast writes, then merge post-simulation.

Separate Handlers

Use different handlers for snapshots (infrequent, large) and diagnostics (frequent, small).

Limit max_writes

Set max_writes to roughly 50–100 to keep individual files a manageable size; multiple sets are easy to open together or merge.
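The trade-off is easy to quantify with a rough per-set size estimate (pure arithmetic, assuming float64 data and ignoring HDF5 overhead; estimate_file_size_gb is an illustrative helper):

```python
def estimate_file_size_gb(max_writes, n_fields, shape, components=1, bytes_per_value=8):
    """Rough per-set file size in GB: writes x fields x components x grid points x bytes."""
    values_per_write = n_fields * components
    for n in shape:
        values_per_write *= n
    return max_writes * values_per_write * bytes_per_value / 1e9

# e.g. 50 writes of one 3-component velocity field on a 256^3 grid:
print(f"{estimate_file_size_gb(50, 1, (256, 256, 256), components=3):.1f} GB")  # → 20.1 GB
```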

Use xarray

The xarray interface provides powerful labeled arrays for analysis.
