The lerobot-dataset-viz command visualizes all frames in a dataset episode using Rerun.
Command
lerobot-dataset-viz [OPTIONS]
Location: src/lerobot/scripts/lerobot_dataset_viz.py
Overview
The visualization script:
- Displays all data modalities (images, states, actions)
- Shows how the data evolves over time within the selected episode
- Helps with dataset inspection and debugging
- Supports local and remote viewing
- Can save visualization recordings
Key Options
--repo-id
Dataset repository ID (e.g., lerobot/pusht).
--episode-index
Episode index to visualize (0-based).
--batch-size
Batch size for data loading.
--num-workers
Number of dataloader workers.
--mode
Viewing mode: local or distant.
--web-port
Web viewer port for distant mode.
--grpc-port
gRPC port for distant mode.
--save
Save visualization to .rrd file.
--output-dir
Directory for saving .rrd files.
--display-compressed-images
Compress images in Rerun (reduces bandwidth).
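When the command is launched from a Python script rather than a shell, the options can be assembled into an argument list. The following is a minimal sketch, not part of LeRobot: build_viz_command is a hypothetical helper, and it assumes every option is passed in the --flag=value form used in the examples on this page.

```python
import shlex


def build_viz_command(repo_id, episode_index, **options):
    """Build an argv list for lerobot-dataset-viz (hypothetical helper).

    Extra keyword options become --flag=value pairs, with underscores
    in the keyword replaced by hyphens (grpc_port -> --grpc-port).
    """
    argv = [
        "lerobot-dataset-viz",
        f"--repo-id={repo_id}",
        f"--episode-index={episode_index}",
    ]
    for key, value in options.items():
        flag = "--" + key.replace("_", "-")
        argv.append(f"{flag}={value}")
    return argv


cmd = build_viz_command("lerobot/pusht", 0, mode="distant", grpc_port=9876)
# Print a shell-safe version of the command for inspection
print(shlex.join(cmd))
```

The returned list can be handed to subprocess.run(cmd, check=True) to start the viewer; the helper does not validate flag names, so a typo is only reported by the command itself.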
Usage Examples
Local Visualization
lerobot-dataset-viz \
--repo-id=lerobot/pusht \
--episode-index=0
This opens the Rerun viewer locally, showing:
- Camera images
- Action trajectories
- State values
- Rewards
- Episode metadata
Visualize Specific Episode
lerobot-dataset-viz \
--repo-id=lerobot/aloha_sim_insertion_human \
--episode-index=42
Save Visualization to File
lerobot-dataset-viz \
--repo-id=lerobot/pusht \
--episode-index=0 \
--save=1 \
--output-dir=./visualizations
# View saved file
rerun ./visualizations/lerobot_pusht_episode_0.rrd
Remote Visualization (Server)
On the remote machine:
lerobot-dataset-viz \
--repo-id=lerobot/pusht \
--episode-index=0 \
--mode=distant \
--grpc-port=9876
On the local machine:
rerun rerun+http://<SERVER_IP>:9876/proxy
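The address passed to rerun on the local machine follows a fixed pattern, so it can be generated rather than typed by hand. This is a small sketch under the assumption that the URL always has the form shown in the example; rerun_proxy_url is a hypothetical helper, and the default port is simply the one used in the example.

```python
def rerun_proxy_url(server_ip, grpc_port=9876):
    """Return the URL to pass to the rerun CLI (hypothetical helper).

    The port must match the --grpc-port value used on the remote machine.
    """
    if not 0 < int(grpc_port) < 65536:
        raise ValueError(f"invalid port: {grpc_port}")
    return f"rerun+http://{server_ip}:{grpc_port}/proxy"


# 192.0.2.10 is a documentation-only address; substitute the real server IP
url = rerun_proxy_url("192.0.2.10")
print(url)
```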
Local Dataset Visualization
lerobot-dataset-viz \
--repo-id=myuser/my_dataset \
--root=./local_dataset \
--episode-index=0
Custom Batch Size and Workers
lerobot-dataset-viz \
--repo-id=lerobot/pusht \
--episode-index=0 \
--batch-size=64 \
--num-workers=4
Compressed Images (Low Bandwidth)
lerobot-dataset-viz \
--repo-id=lerobot/pusht \
--episode-index=0 \
--display-compressed-images=true
Visualization Features
The Rerun viewer shows:
Camera Views
- All camera streams from the episode
- Synchronized playback
- Pan/zoom controls
Action Space
- Action values over time
- Multi-dimensional action plots
- Action bounds visualization
State Space
- Robot state evolution
- Joint positions/velocities
- End-effector positions
Metadata
- Frame indices
- Timestamps
- Episode information
- Task descriptions
Programmatic Usage
from lerobot.scripts.lerobot_dataset_viz import visualize_dataset
from lerobot.datasets import LeRobotDataset

# Load only the episode that will be visualized
dataset = LeRobotDataset(
    "lerobot/pusht",
    episodes=[0],
)
visualize_dataset(
    dataset=dataset,
    episode_index=0,
    batch_size=32,
    num_workers=0,
    mode="local",
    save=False,
    output_dir=None,
)
Custom Visualization
import numpy as np
import rerun as rr
from lerobot.datasets import LeRobotDataset

dataset = LeRobotDataset("lerobot/pusht", episodes=[0])
# Initialize Rerun and spawn a local viewer
rr.init("my_visualization", spawn=True)
# Visualize frames
for idx in range(len(dataset)):
    frame = dataset[idx]
    if frame["episode_index"].item() != 0:
        break
    # Set timelines (rr.set_time takes the time value as a keyword argument)
    rr.set_time("frame", sequence=idx)
    rr.set_time("timestamp", timestamp=frame["timestamp"].item())
    # Log camera images
    for cam_key in dataset.meta.camera_keys:
        img = frame[cam_key].permute(1, 2, 0).numpy()  # CHW -> HWC
        img = (img * 255).astype(np.uint8)  # float [0, 1] -> uint8 [0, 255]
        rr.log(cam_key, rr.Image(img))
    # Log actions, one scalar series per action dimension
    # (rr.Scalars replaces the deprecated rr.Scalar in recent Rerun releases)
    action = frame["action"].numpy()
    for i, name in enumerate(dataset.features["action"]["names"]):
        rr.log(f"action/{name}", rr.Scalars(action[i]))
    # Log state
    if "observation.state" in frame:
        state = frame["observation.state"].numpy()
        for i, name in enumerate(dataset.features["observation.state"]["names"]):
            rr.log(f"state/{name}", rr.Scalars(state[i]))
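The action and state loops above assume that the names entry of a feature is a flat list with one label per dimension. Depending on the dataset, names may instead be missing or grouped in a dictionary of lists; that layout is an assumption here, so check it against your own dataset metadata. The sketch below (flatten_feature_names is a hypothetical helper, not a LeRobot function) normalizes these cases into one label per dimension.

```python
def flatten_feature_names(names, dim):
    """Return one label per dimension (hypothetical helper).

    names may be None, a flat list of labels, or a dict mapping a group
    name to a list of labels. Falls back to index labels ("0", "1", ...)
    when names is missing or its length does not match dim.
    """
    fallback = [str(i) for i in range(dim)]
    if names is None:
        return fallback
    if isinstance(names, dict):
        # Concatenate the groups in insertion order
        flat = [label for group in names.values() for label in group]
    else:
        flat = list(names)
    return flat if len(flat) == dim else fallback


labels = flatten_feature_names({"motors": ["motor_0", "motor_1"]}, 2)
print(labels)
```

In the loop above, the result would replace the direct use of names, for example: for i, name in enumerate(flatten_feature_names(dataset.features["action"]["names"], len(action))).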
Compare Multiple Episodes
import numpy as np
import rerun as rr
from lerobot.datasets import LeRobotDataset

rr.init("multi_episode_comparison", spawn=True)
episode_indices = [0, 1, 2]
for ep_idx in episode_indices:
    # Load only this episode instead of scanning the whole dataset
    dataset = LeRobotDataset("lerobot/pusht", episodes=[ep_idx])
    cam_key = dataset.meta.camera_keys[0]
    for frame_idx in range(len(dataset)):
        frame = dataset[frame_idx]
        rr.set_time("episode", sequence=ep_idx)
        rr.set_time("frame", sequence=frame_idx)
        # Log data under an episode-specific path
        img = frame[cam_key].permute(1, 2, 0).numpy()  # CHW -> HWC
        img = (img * 255).astype(np.uint8)
        rr.log(f"episode_{ep_idx}/{cam_key}", rr.Image(img))
Rerun Viewer Controls
- Play/Pause: Spacebar
- Step Forward/Back: Arrow keys
- Zoom: Mouse wheel
- Pan: Click and drag
- Select View: Click panel names
- Timeline: Drag timeline slider
Output Files
With --save=1, the script writes one .rrd file per episode (the "/" in the repository ID is replaced by "_"):
output-dir/
└── {repo_id}_episode_{episode_index}.rrd
These files can be:
- Shared with collaborators
- Viewed offline with rerun <file>.rrd
- Archived for later inspection
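To locate a saved recording from a script, the file name can be derived from the repository ID and episode index. This sketch infers the naming pattern from the example file lerobot_pusht_episode_0.rrd in the save example on this page; treat the pattern as an assumption and verify it against the files your version actually writes. expected_rrd_path is a hypothetical helper.

```python
from pathlib import Path


def expected_rrd_path(output_dir, repo_id, episode_index):
    """Guess where a saved recording lives (hypothetical helper).

    Assumes the file is named <repo id with "/" -> "_">_episode_<index>.rrd
    inside output_dir.
    """
    repo_id_str = repo_id.replace("/", "_")
    return Path(output_dir) / f"{repo_id_str}_episode_{episode_index}.rrd"


path = expected_rrd_path("./visualizations", "lerobot/pusht", 0)
if path.exists():
    print(f"Found recording: {path}")
else:
    print(f"No recording at {path} yet")
```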
Tips
- Large Episodes: Use a higher --batch-size for faster loading
- Remote Viewing: Use distant mode for headless servers
- Bandwidth: Enable --display-compressed-images for remote viewing
- Debugging: Useful for verifying dataset quality before training
- Comparison: Save multiple episodes and compare offline