Installation Issues
Python version incompatibility
Problem: Import errors or syntax errors after installation.
Solution: LeRobot requires Python ≥3.12. Check your Python version; if it is older, create a new environment with a supported version.
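A minimal sketch of the version check, assuming the ≥3.12 requirement stated above. The function name `python_is_supported` is illustrative and not part of LeRobot.

```python
import sys

# Minimum version taken from the requirement stated in this section.
MIN_PYTHON = (3, 12)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_is_supported() else "too old"
    print(f"Python {found}: {status}")
```

Run it with the interpreter you intend to use for LeRobot, since several Python versions can be installed side by side.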
ffmpeg not found or missing libsvtav1
Problem: Errors like "ffmpeg not found" or "Encoder 'libsvtav1' not found".
Solution: Install ffmpeg with libsvtav1 support.
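To tell the two failure modes apart, you can inspect the encoder list that ffmpeg prints. The sketch below uses only the standard library; the helper names are illustrative, and the parsing assumes the usual `ffmpeg -encoders` row layout (flags column, then encoder name).

```python
import shutil
import subprocess


def encoder_listed(encoders_output, encoder="libsvtav1"):
    """Return True if `encoder` appears as a name in `ffmpeg -encoders` output."""
    for line in encoders_output.splitlines():
        fields = line.split()
        # Encoder rows look like: " V....D libsvtav1  SVT-AV1 ..."
        if len(fields) >= 2 and fields[1] == encoder:
            return True
    return False


def check_ffmpeg(encoder="libsvtav1"):
    """Return 'missing', 'no-encoder', or 'ok' for the ffmpeg found on PATH."""
    if shutil.which("ffmpeg") is None:
        return "missing"
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-encoders"],
        capture_output=True, text=True, check=False,
    )
    return "ok" if encoder_listed(result.stdout, encoder) else "no-encoder"
```

A result of 'missing' points to a PATH or installation problem, while 'no-encoder' means ffmpeg is present but was built without libsvtav1.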
WSL (Windows) installation issues
Problem: Errors related to evdev or input devices on Windows Subsystem for Linux.
Solution: Install evdev explicitly in your environment.
Permission denied errors
Problem: Permission errors when installing packages.
Solution: Don’t use sudo with pip. Instead, install into an environment that your user owns, such as a virtual environment.
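One way to confirm that pip will install into a user-owned environment is to check whether the interpreter is running inside a virtual environment. This sketch detects venv/virtualenv environments only; conda environments are not detected this way. The function name is illustrative.

```python
import sys


def in_virtual_environment(prefix=None, base_prefix=None):
    """Return True when running inside a venv-style virtual environment."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    # In a venv, sys.prefix points at the environment and differs from base_prefix.
    return prefix != base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
```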
GPU and CUDA Issues
CUDA out of memory errors
Problem: RuntimeError: CUDA out of memory during training or inference.
Solutions:
- Reduce batch size
- Enable gradient accumulation
- Use mixed precision (AMP)
- Clear CUDA cache
- Use a smaller model variant or reduce sequence length
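Reducing the batch size and enabling gradient accumulation work together: the effective batch size is the micro-batch size multiplied by the number of accumulation steps. The sketch below shows only that arithmetic; `accumulation_plan` is a hypothetical helper, not a LeRobot or PyTorch function.

```python
import math


def accumulation_plan(target_batch_size, max_micro_batch_size):
    """Split a target batch into micro-batches that fit in memory.

    Returns (micro_batch_size, accumulation_steps, effective_batch_size).
    """
    if target_batch_size < 1 or max_micro_batch_size < 1:
        raise ValueError("batch sizes must be positive")
    micro = min(target_batch_size, max_micro_batch_size)
    steps = math.ceil(target_batch_size / micro)
    return micro, steps, micro * steps
```

For example, a target batch of 64 with at most 16 samples fitting in memory gives 4 accumulation steps of 16 samples each. When the target is not a multiple of the micro-batch size, the effective batch is rounded up.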
CUDA not available
Problem: torch.cuda.is_available() returns False.
Solutions:
- Check the NVIDIA driver
- Reinstall PyTorch with CUDA
- Verify the installation
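The verification step can be sketched as a small report that distinguishes a missing PyTorch install, a CPU-only build, and a CUDA build that cannot see a device. The dictionary keys are illustrative; the PyTorch attributes used (`torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.device_count()`) are standard.

```python
def cuda_report():
    """Summarize the local PyTorch/CUDA setup without raising if torch is absent."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None here means a CPU-only PyTorch build.
        "cuda_build": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `cuda_build` is None, reinstalling PyTorch with CUDA is the likely fix. If it is set but `cuda_available` is False, the driver is the more likely culprit.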
Multi-GPU training issues
Problem: Errors when using multiple GPUs with DDP.
Solution: Use torchrun with the correct configuration.
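torchrun exports RANK, LOCAL_RANK, and WORLD_SIZE to every worker process, so a quick sanity check at the start of a script can catch a launch that bypassed it. This is a sketch; `read_torchrun_env` is a hypothetical helper name.

```python
import os

RANK_VARS = ("RANK", "LOCAL_RANK", "WORLD_SIZE")


def read_torchrun_env(env=None):
    """Read the rank variables torchrun exports to each worker process."""
    env = os.environ if env is None else env
    missing = [name for name in RANK_VARS if name not in env]
    if missing:
        raise RuntimeError(f"not launched by torchrun? missing: {', '.join(missing)}")
    info = {name.lower(): int(env[name]) for name in RANK_VARS}
    if not 0 <= info["rank"] < info["world_size"]:
        raise RuntimeError("RANK must be in [0, WORLD_SIZE)")
    return info
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise without starting a distributed job.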
Dataset Issues
Dataset not found on Hub
Problem: FileNotFoundError or DatasetNotFoundError when loading a dataset.
Solutions:
- Verify that the dataset exists
- Check authentication (for private datasets)
- Use the correct repo_id format
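A common cause of the last item is passing a full URL or a bare dataset name instead of a "namespace/name" identifier. The check below is a simplified sketch: the pattern is an assumption about the typical shape of a repo_id and is stricter and simpler than the Hub's actual naming rules.

```python
import re

# Simplified pattern: "<namespace>/<name>" with ASCII letters, digits, '-', '_' and '.'.
REPO_ID_PATTERN = re.compile(r"^[A-Za-z0-9][\w.-]*/[A-Za-z0-9][\w.-]*$", re.ASCII)


def looks_like_repo_id(repo_id):
    """Return True if `repo_id` has the 'namespace/name' shape."""
    return bool(REPO_ID_PATTERN.match(repo_id))
```

A string that fails this check is almost certainly malformed; a string that passes may still not exist on the Hub.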
Video decoding errors
Problem: Errors when loading video frames from a dataset.
Solutions:
- Verify ffmpeg installation
- Clear dataset cache and re-download
- Check disk space
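The disk space check can be done from Python with the standard library. This is a sketch; the 10 GiB default is an arbitrary placeholder, not a LeRobot requirement.

```python
import shutil


def free_gib(path="."):
    """Return free space on the filesystem containing `path`, in GiB."""
    return shutil.disk_usage(path).free / 1024**3


def has_room(path=".", required_gib=10.0):
    """Return True if at least `required_gib` GiB are free at `path`."""
    return free_gib(path) >= required_gib


if __name__ == "__main__":
    print(f"free space here: {free_gib():.1f} GiB")
```

Point `path` at the directory where datasets are cached, since that may be on a different filesystem than your working directory.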
Slow dataset loading
Problem: Dataset loading takes too long.
Solutions:
- Use streaming for large datasets
- Increase number of workers
- Cache dataset locally for repeated use
Corrupted dataset cache
Problem: Inconsistent data or errors after dataset updates.
Solution: Clear the dataset cache.
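Clearing the cache amounts to deleting the dataset's local directory so it is downloaded again on next use. The sketch below takes the cache root as a parameter because its location depends on your setup; `clear_dataset_cache` is a hypothetical helper, and the directory layout (one subdirectory per repo_id under the root) is an assumption.

```python
import shutil
from pathlib import Path


def clear_dataset_cache(cache_root, repo_id):
    """Delete the cached copy of one dataset; return True if something was removed."""
    root = Path(cache_root).expanduser().resolve()
    target = (root / repo_id).resolve()
    # Refuse to delete the root itself or anything outside it.
    if target == root or root not in target.parents:
        raise ValueError(f"{target} is not inside {root}")
    if not target.exists():
        return False
    shutil.rmtree(target)
    return True
```

The guard against paths outside the root is deliberate: `shutil.rmtree` is irreversible, so a mistyped repo_id should fail rather than delete something unrelated.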
Training Issues
Training loss not decreasing
Problem: Loss plateaus or doesn’t decrease during training.
Solutions:
- Check learning rate
- Verify data normalization
- Increase training steps
- Check for data issues (e.g., all actions similar)
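The last check (actions that are all similar) can be approximated by looking for action dimensions with almost no variation. This is a sketch on plain Python lists; `low_variance_dims` and the `min_std` threshold are illustrative choices.

```python
import statistics


def low_variance_dims(actions, min_std=1e-3):
    """Return indices of action dimensions whose std is below `min_std`.

    `actions` is a sequence of equal-length rows (one row per timestep).
    """
    columns = list(zip(*actions))
    return [i for i, col in enumerate(columns) if statistics.pstdev(col) < min_std]
```

A dimension that never changes carries no learning signal, and it can also produce a near-zero standard deviation during normalization.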
NaN or Inf in loss
Problem: Loss becomes NaN or Inf during training.
Solutions:
- Reduce learning rate
- Enable gradient clipping
- Check for numerical instability in custom code
- Verify dataset doesn’t contain NaN values
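The dataset check in the last item can be sketched with `math.isfinite`, which rejects both NaN and infinite values. The function name is illustrative, and the input is assumed to be rows of plain floats.

```python
import math


def find_non_finite(rows):
    """Return (row, column) positions whose value is NaN or +/-Inf."""
    return [
        (r, c)
        for r, row in enumerate(rows)
        for c, value in enumerate(row)
        if not math.isfinite(value)
    ]
```

Returning positions rather than a single boolean makes it easier to trace a bad value back to a specific frame or episode.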
Checkpoint loading errors
Problem: Cannot resume training from checkpoint.
Solutions:
- Verify checkpoint path
- Check version compatibility
- Ensure config matches: The checkpoint config must match your current training config
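To find where a checkpoint config and the current config disagree, a recursive comparison of the two dictionaries is usually enough. This is a sketch on plain nested dicts; `config_diff` is a hypothetical helper, and the keys in the usage example are made up for illustration.

```python
def config_diff(saved, current, prefix=""):
    """Return {dotted_key: (saved_value, current_value)} for every mismatch."""
    diff = {}
    for key in sorted(set(saved) | set(current)):
        path = f"{prefix}{key}"
        a, b = saved.get(key), current.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            # Recurse into nested sections and keep the dotted path.
            diff.update(config_diff(a, b, prefix=f"{path}."))
        elif a != b:
            diff[path] = (a, b)
    return diff


if __name__ == "__main__":
    saved = {"policy": {"type": "act", "chunk_size": 100}, "batch_size": 8}
    current = {"policy": {"type": "act", "chunk_size": 50}, "batch_size": 8}
    print(config_diff(saved, current))
```

A key present on only one side is reported with None for the missing value.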
Robot Hardware Issues
Robot connection failed
Problem: Cannot connect to robot.
Solutions:
- Check device permissions
- Verify device path
- Check cable connections and power supply
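The first two checks can be combined into one helper that reports whether a device node exists and whether the current user can read and write it. This is a sketch; the path `/dev/ttyACM0` in the usage example is only an illustration, and your robot may appear under a different path.

```python
import os


def check_device(path):
    """Return 'missing', 'no-permission', or 'ok' for a device node."""
    if not os.path.exists(path):
        return "missing"
    if not os.access(path, os.R_OK | os.W_OK):
        return "no-permission"
    return "ok"


if __name__ == "__main__":
    print(check_device("/dev/ttyACM0"))  # illustrative path
```

A 'missing' result usually points to cabling, power, or a wrong path; 'no-permission' points to user or group permissions on the device.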
Latency in real-time control
Problem: High latency causes jerky or delayed robot motion.
Solutions:
- Use GPU inference
- Enable async inference. See examples/tutorial/async-inf/ for the policy server/client pattern
- Optimize observation processing:
  - Reduce image resolution
  - Use hardware video encoding
  - Minimize preprocessing steps
- Use action chunking (ACT-style policies reduce inference frequency)
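Before changing anything, it helps to measure inference latency against the control period. The sketch below times an arbitrary callable and compares the result with the time budget at a given control rate; with action chunking, one inference covers `chunk_size` control ticks, which is why it relaxes the budget. Both function names are illustrative.

```python
import time


def mean_latency(step, repeats=50):
    """Average wall-clock seconds per call of `step()`."""
    start = time.perf_counter()
    for _ in range(repeats):
        step()
    return (time.perf_counter() - start) / repeats


def fits_control_rate(latency_s, fps, chunk_size=1):
    """True if one inference per `chunk_size` control ticks keeps up with `fps`."""
    return latency_s <= chunk_size / fps
```

At 30 fps the budget is about 33 ms per step, so a 50 ms inference does not fit on its own but does fit if each inference yields a chunk of 10 actions. For GPU policies, make sure `step` waits for the device to finish, otherwise the measurement only captures the kernel launch.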
Calibration issues
Problem: Robot movements are offset or incorrect.
Solutions:
- Re-run calibration: Follow your robot’s specific calibration procedure
- Check for breaking changes: See Backward Compatibility for migration guides
- Verify joint limits in robot config
- Test with known-good trajectory to isolate issue
Performance Optimization
Slow training
Solutions:
- Use GPU acceleration
- Increase batch size (if memory allows)
- Use more DataLoader workers
- Enable AMP (automatic mixed precision)
- Use multi-GPU training with DDP
High memory usage
Solutions:
- Reduce batch size
- Use gradient checkpointing
- Clear unused tensors
- Use streaming datasets for large data
Error Messages Reference
'normalize_inputs' not found in state_dict
Cause: Loading a model trained before PR #1452 with new code.
Solution: Migrate the model using the normalization migration script. See Backward Compatibility for details.
'Encoder libsvtav1 not found'
Cause: ffmpeg doesn’t have the libsvtav1 encoder compiled in.
Solution: Install an ffmpeg version built with libsvtav1.
'ImportError: cannot import name X'
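Cause: Version mismatch between the installed LeRobot and the code you are running.
Solution: Reinstall or upgrade LeRobot so that the installed version matches the code.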
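To see which LeRobot version is actually installed in the active environment, the standard library's `importlib.metadata` can be queried. This is a sketch; it assumes the distribution is named `lerobot`, and the helper name is illustrative.

```python
from importlib import metadata


def installed_version(distribution="lerobot"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("lerobot version:", installed_version())
```

Compare the result with the version that the code or example you are running was written for.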
Getting Help
If you can’t find a solution here:
- Search GitHub Issues: check if your issue has been reported
- Ask on Discord: get help from the community
- Open an Issue: report a new bug with details
- Discussions: ask questions and share ideas