## Overview
RL training in OpenSandbox offers:

- Isolated training runs - Each experiment runs in a clean container
- Reproducible environments - Consistent package versions and configurations
- Resource control - CPU, memory, and GPU allocation per training job
- Full observability - Capture logs, metrics, and checkpoints
- Distributed training - Scale across multiple sandboxes
- Dependency isolation - No conflicts between different RL frameworks
## Quick Start

### 1. Start the OpenSandbox Server

### 2. Run the RL Training Example

The complete example is in `examples/rl-training/`.
## Training Script Example
The example uses Stable-Baselines3 to train a DQN agent on CartPole.

## Use Cases
### Hyperparameter Tuning
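One pattern is to fan out one run per configuration. The sketch below uses local threads and a placeholder `train_one` function; in OpenSandbox each call would instead create its own sandbox and execute the training script there:

```python
import random
from concurrent.futures import ThreadPoolExecutor

def train_one(config):
    """Placeholder for a full training run.

    In a real sweep this would create a sandbox, install dependencies,
    and execute the training script with the given hyperparameters;
    here it just returns a deterministic mock score.
    """
    rng = random.Random(config["seed"])
    score = config["learning_rate"] * sum(rng.random() for _ in range(100))
    return config, score

grid = [{"learning_rate": lr, "seed": 0} for lr in (1e-4, 1e-3, 1e-2)]

# One worker per configuration; with one sandbox per run these could be
# separate containers instead of local threads.
with ThreadPoolExecutor(max_workers=len(grid)) as pool:
    results = list(pool.map(train_one, grid))

best_config, best_score = max(results, key=lambda r: r[1])
print("best config:", best_config)
```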
Run parallel experiments with different hyperparameters and compare the results.

### Multi-Environment Training
Train on multiple environments simultaneously.

### Checkpoint Management
Save and restore training checkpoints.

### TensorBoard Monitoring
Visualize training metrics with TensorBoard.

### Custom RL Frameworks
Train with different RL libraries.

#### Ray RLlib
#### CleanRL
## Environment Configuration
### Environment Variables
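A common pattern is to pass hyperparameters into the training script through environment variables set on the sandbox; the variable names below are illustrative, not a fixed OpenSandbox convention:

```python
import os

# Read hyperparameters from the sandbox environment, with safe defaults
learning_rate = float(os.environ.get("LEARNING_RATE", "1e-3"))
total_timesteps = int(os.environ.get("TOTAL_TIMESTEPS", "10000"))
seed = int(os.environ.get("SEED", "0"))

print(f"lr={learning_rate} timesteps={total_timesteps} seed={seed}")
```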
### Resource Allocation

### GPU Support
## Supported RL Frameworks

- Stable-Baselines3
- Ray RLlib
- TF-Agents
- CleanRL
- Gymnasium Environments
## Performance Optimization

### Vectorized Environments
### Parallel Training

### Checkpointing Strategy
## Troubleshooting

### Dependency Installation Failed
Use the Python environment helper.

### Out of Memory
Increase the sandbox memory limit.

### Training Timeout
Increase the timeout or reduce the number of training timesteps.

### GPU Not Available
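A dependency-free way to check from inside the sandbox whether an NVIDIA GPU is visible (this only queries `nvidia-smi`; it does not prove your framework build has CUDA support):

```python
import shutil
import subprocess

def gpu_available() -> bool:
    """Return True if nvidia-smi is on PATH and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        out = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return out.returncode == 0 and "GPU" in out.stdout

print("GPU visible:", gpu_available())
```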
Verify GPU support.

## Best Practices
1. Use Ephemeral Sandboxes
2. Log Everything
3. Save Artifacts
4. Set Reproducible Seeds
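Practice 4 can be sketched as a small helper that seeds the usual sources of randomness at the top of a training script (PyTorch seeding is attempted only if the library happens to be installed):

```python
import os
import random

import numpy as np

def set_seed(seed: int) -> None:
    """Seed the common sources of randomness for a training run."""
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    try:  # seed PyTorch too, if it is present
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass

set_seed(42)
```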
## Related Resources

- RL Training Example - Complete RL training example with DQN
- AI Coding Agents - AI agents for code generation
- Python SDK - SDK reference documentation
- API Reference - Full API documentation