This guide provides solutions to common problems you might encounter while using LeRobot.

Installation Issues

Problem: Import errors or syntax errors after installation.
Solution: LeRobot requires Python ≥3.12. Check your version:
python --version
# Should show Python 3.12.x or higher
If you have an older version, create a new environment:
conda create -y -n lerobot python=3.12
conda activate lerobot
pip install lerobot
Problem: Errors like ffmpeg not found or Encoder 'libsvtav1' not found.
Solution: Install ffmpeg with libsvtav1 support:
# With conda (recommended)
conda install ffmpeg=7.1.1 -c conda-forge

# Verify installation
ffmpeg -version
ffmpeg -encoders | grep svt
ffmpeg 8.X is not yet supported. Use version 7.X.
Problem: Errors related to evdev or input devices on Windows Subsystem for Linux.
Solution: Install evdev explicitly:
conda install evdev -c conda-forge
Problem: Permission errors when installing packages.
Solution: Don’t use sudo with pip. Instead:
# Use virtual environment (recommended)
conda create -n lerobot python=3.12
conda activate lerobot
pip install lerobot

# Or install for user only
pip install --user lerobot

GPU and CUDA Issues

Problem: RuntimeError: CUDA out of memory during training or inference.
Solutions:
  1. Reduce the batch size:
lerobot-train \
  --policy=act \
  --dataset.repo_id=lerobot/pusht \
  --training.batch_size=8  # Try smaller values
  2. Enable gradient accumulation:
lerobot-train \
  --policy=act \
  --dataset.repo_id=lerobot/pusht \
  --training.batch_size=4 \
  --training.gradient_accumulation_steps=4
  3. Use mixed precision (AMP):
policy.config.use_amp = True
  4. Clear the CUDA cache:
import torch
torch.cuda.empty_cache()
  5. Use a smaller model variant or reduce the sequence length
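Gradient accumulation (step 2) trades memory for iteration speed without changing the optimizer-level batch size. A minimal sketch of the arithmetic, with a hypothetical helper that is not part of LeRobot:

```python
def effective_batch_size(per_step_batch: int, accumulation_steps: int) -> int:
    """Effective (optimizer-level) batch size when gradients are
    accumulated over several forward/backward passes before each
    optimizer step."""
    return per_step_batch * accumulation_steps

# batch_size=4 with 4 accumulation steps matches batch_size=16 in one step
print(effective_batch_size(4, 4))  # → 16
```

So the `--training.batch_size=4 --training.gradient_accumulation_steps=4` combination above should train like a batch size of 16 while only holding 4 samples in GPU memory at a time.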
Problem: torch.cuda.is_available() returns False.
Solutions:
  1. Check the NVIDIA driver:
nvidia-smi
  2. Reinstall PyTorch with CUDA support:
# For CUDA 11.8
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118

# For CUDA 12.1
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
  3. Verify the installation:
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
print(f"GPU: {torch.cuda.get_device_name(0)}")
Problem: Errors when using multiple GPUs with DDP.
Solution: Use torchrun with the correct configuration:
# For 4 GPUs
torchrun --nproc_per_node=4 -m lerobot.scripts.train \
  --policy=act \
  --dataset.repo_id=lerobot/pusht

# Ensure consistent batch size across GPUs
# Total batch size = batch_size * num_gpus
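Since every DDP rank sees the same per-process batch size, the total batch is batch_size × num_gpus. A small helper sketching the inverse calculation (hypothetical, not part of LeRobot):

```python
def per_gpu_batch(global_batch: int, num_gpus: int) -> int:
    """Per-GPU batch size for DDP, where every rank must use the same
    batch size; raises if the global batch doesn't divide evenly."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if global_batch % num_gpus != 0:
        raise ValueError(
            f"global batch {global_batch} is not divisible across {num_gpus} GPUs"
        )
    return global_batch // num_gpus

# A global batch of 32 on 4 GPUs means --training.batch_size=8 per process
print(per_gpu_batch(32, 4))  # → 8
```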

Dataset Issues

Problem: FileNotFoundError or DatasetNotFoundError when loading a dataset.
Solutions:
  1. Verify the dataset exists:
from huggingface_hub import list_datasets

datasets = [d.id for d in list_datasets(task_categories="robotics", tags=["LeRobot"])]
print("Available datasets:", datasets)
  2. Check authentication (for private datasets):
huggingface-cli login
  3. Use the correct repo_id format:
# Correct
dataset = LeRobotDataset("lerobot/pusht")

# Incorrect
dataset = LeRobotDataset("pusht")  # Missing namespace
Problem: Errors when loading video frames from a dataset.
Solutions:
  1. Verify the ffmpeg installation:
ffmpeg -version
ffmpeg -decoders | grep h264
  2. Clear the dataset cache and re-download:
rm -rf ~/.cache/huggingface/lerobot/<dataset-name>
  3. Check disk space:
df -h ~/.cache/huggingface
Problem: Dataset loading takes too long.
Solutions:
  1. Use streaming for large datasets:
dataset = LeRobotDataset(
    "lerobot/aloha_mobile_cabinet",
    streaming=True  # Don't download the entire dataset
)
  2. Increase the number of workers:
from torch.utils.data import DataLoader

dataloader = DataLoader(
    dataset,
    batch_size=32,
    num_workers=8  # Increase for faster loading
)
  3. Cache the dataset locally for repeated use
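A reasonable starting point for num_workers is the number of CPU cores minus a couple reserved for the main process; tune from there. A sketch (the helper name is our own, not a LeRobot function):

```python
import os

def suggested_num_workers(reserve: int = 2) -> int:
    """A common starting point for DataLoader num_workers: all CPU
    cores minus a few reserved for the main training process."""
    cores = os.cpu_count() or 1  # os.cpu_count() can return None
    return max(1, cores - reserve)

print(f"Try num_workers={suggested_num_workers()}")
```

More workers is not always faster: past a point, workers contend for disk and memory bandwidth, so benchmark a few values on your machine.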
Problem: Inconsistent data or errors after dataset updates.
Solution: Clear the dataset cache:
# Remove specific dataset
rm -rf ~/.cache/huggingface/lerobot/<dataset-name>

# Or clear all cached datasets
rm -rf ~/.cache/huggingface/lerobot/*

Training Issues

Problem: Loss plateaus or doesn’t decrease during training.
Solutions:
  1. Check the learning rate:
# Try different learning rates
config.training.lr = 1e-4  # Default
config.training.lr = 1e-3  # Higher for faster learning
config.training.lr = 1e-5  # Lower for stability
  2. Verify data normalization:
# Check dataset statistics
print(dataset.meta.stats)
  3. Increase the number of training steps:
lerobot-train \
  --policy=act \
  --dataset.repo_id=lerobot/pusht \
  --training.num_steps=200000  # More steps
  4. Check for data issues (e.g., all actions nearly identical)
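One quick way to spot the "all actions similar" failure mode is to look at the per-dimension spread of the recorded actions. A simplified sketch on plain Python lists (with real tensors, the per-dimension std in dataset.meta.stats serves the same purpose):

```python
import statistics

def action_spread(actions):
    """Per-dimension standard deviation of a list of action vectors.
    Near-zero spread in every dimension means the demonstrations
    barely move, which makes the loss plateau almost immediately."""
    dims = zip(*actions)  # transpose: one sequence per action dimension
    return [statistics.pstdev(d) for d in dims]

# Example: a 2-D action stream where the second dimension never changes
spread = action_spread([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
print(spread)  # second entry is 0.0: that dimension carries no signal
```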
Problem: Loss becomes NaN or Inf during training.
Solutions:
  1. Reduce the learning rate:
config.training.lr = 1e-5  # Lower LR
  2. Enable gradient clipping:
config.training.grad_clip_norm = 1.0
  3. Check for numerical instability in custom code
  4. Verify the dataset doesn’t contain NaN values:
import torch
batch = next(iter(dataloader))
print("NaN in batch:", torch.isnan(batch['action']).any())
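The check above covers only the 'action' key; NaNs can also hide in states or images. A simplified sketch that scans every key of a batch dict, written here for plain nested lists (with tensors, torch.isnan(t).any() per key does the same job):

```python
import math

def find_nans(batch):
    """Return the keys of a batch dict whose (possibly nested) float
    values contain NaN. Simplified stand-in for a per-key
    torch.isnan(...).any() scan over a real tensor batch."""
    def has_nan(x):
        if isinstance(x, float):
            return math.isnan(x)
        if isinstance(x, (list, tuple)):
            return any(has_nan(v) for v in x)
        return False
    return [k for k, v in batch.items() if has_nan(v)]

bad = find_nans({"action": [1.0, float("nan")], "state": [[0.0, 2.0]]})
print("Keys with NaN:", bad)  # → Keys with NaN: ['action']
```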
Problem: Cannot resume training from a checkpoint.
Solutions:
  1. Verify the checkpoint path:
ls outputs/train/my_checkpoint/
# Should contain: config.yaml, checkpoint_*.pth
  2. Check version compatibility:
# A model from an old version may not be compatible
# Try loading with strict=False
policy.load_state_dict(checkpoint, strict=False)
  3. Ensure configs match: the checkpoint config must match your current training config

Robot Hardware Issues

Problem: Cannot connect to the robot.
Solutions:
  1. Check device permissions:
# For USB devices
sudo chmod 666 /dev/ttyUSB0  # Or your device

# Add user to the dialout group (permanent)
sudo usermod -a -G dialout $USER
# Log out and back in for the change to take effect
  2. Verify the device path:
# List USB devices
ls /dev/tty*

# Use the correct path in the config
robot = Robot(port="/dev/ttyUSB0")
  3. Check cable connections and the power supply
Problem: High latency causes jerky or delayed robot motion.
Solutions:
  1. Run inference on the GPU:
policy = policy.to("cuda")
  2. Enable async inference: see examples/tutorial/async-inf/ for the policy server/client pattern
  3. Optimize observation processing:
  • Reduce image resolution
  • Use hardware video encoding
  • Minimize preprocessing steps
  4. Use action chunking (ACT-style policies reduce inference frequency)
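Before optimizing, measure: time the full observation-to-action path and compare it against your control period. A minimal sketch (the helper is our own; pass in your actual inference callable):

```python
import time

def mean_latency_ms(fn, n: int = 50) -> float:
    """Average wall-clock time of fn() in milliseconds. Run this on the
    exact observation-to-action path; at a 30 Hz control rate you have
    roughly 33 ms per step before motion gets jerky."""
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return (time.perf_counter() - start) / n * 1000.0

# Example with a dummy workload standing in for policy inference
latency = mean_latency_ms(lambda: sum(range(10_000)))
print(f"mean latency: {latency:.3f} ms")
```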
Problem: Robot movements are offset or incorrect.
Solutions:
  1. Re-run calibration: Follow your robot’s specific calibration procedure
  2. Check for breaking changes: See Backward Compatibility for migration guides
  3. Verify joint limits in robot config
  4. Test with known-good trajectory to isolate issue

Performance Optimization

Problem: Training is slow.
Solutions:
  1. Use GPU acceleration
  2. Increase the batch size (if memory allows)
  3. Use more DataLoader workers:
dataloader = DataLoader(dataset, num_workers=8)
  4. Enable AMP (automatic mixed precision):
config.use_amp = True
  5. Use multi-GPU training with DDP

Problem: High memory usage.
Solutions:
  1. Reduce the batch size
  2. Use gradient checkpointing:
config.use_gradient_checkpointing = True
  3. Delete unused tensors:
del large_tensor
torch.cuda.empty_cache()
  4. Use streaming datasets for large data

Error Messages Reference

Cause: Loading a model trained before PR #1452 with new code.
Solution: Migrate the model using the normalization migration script:
python src/lerobot/processor/migrate_policy_normalization.py \
    --pretrained-path your/model/path
See Backward Compatibility for details.
Cause: ffmpeg was built without the libsvtav1 encoder.
Solution: Install the correct ffmpeg version:
conda install ffmpeg=7.1.1 -c conda-forge
ffmpeg -encoders | grep svt  # Verify
Cause: Version mismatch between the installed LeRobot package and your code.
Solution:
# Reinstall LeRobot
pip uninstall lerobot
pip install lerobot --upgrade

# Or reinstall from source
cd lerobot
pip install -e . --force-reinstall

Getting Help

If you can’t find a solution here:
  • Search GitHub Issues: check whether your issue has already been reported
  • Ask on Discord: get help from the community
  • Open an Issue: report a new bug with details
  • Discussions: ask questions and share ideas

Reporting Bugs

When reporting an issue, please include:
  1. Environment information:
lerobot-info
python --version
nvidia-smi  # If using GPU
  2. Minimal reproduction: provide the smallest code snippet that reproduces the issue
  3. Error traceback: include the full error message and stack trace
  4. Expected vs. actual behavior: describe what you expected to happen and what actually happened

The more details you provide, the faster we can help you!
