Overview
The tracking controller:

- Uses reinforcement learning (PPO) to train a policy
- Tracks reference motions from your dataset
- Handles multiple terrains in parallel environments
- Outputs a physics-based motion controller
Quick Start
Configuration Guide
Main Training Config
Environment Configuration
The environment config (`dm_env_default.yaml`) controls the simulation:
Key settings:
- Motion sampling strategy
- Terrain loading and generation
- Episode length and termination conditions
- Observation and action spaces
- Reward function weights
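The exact schema of `dm_env_default.yaml` is not documented here; as an illustrative sketch only (all key names are hypothetical, not the repository's actual schema), an environment config covering the settings above might look like:

```yaml
# Hypothetical fragment of dm_env_default.yaml -- key names are illustrative.
motion_sampling:
  strategy: uniform        # how reference clips and frames are sampled
terrain:
  source: dataset          # load terrains stored alongside the motions
episode:
  max_length_s: 10.0
  early_termination: true  # end episode when tracking error grows too large
rewards:
  pose_w: 0.5              # weights on the individual tracking-error terms
  velocity_w: 0.1
  root_w: 0.2
```

Check the shipped default config for the real key names before editing.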
Agent Configuration
The agent config (`dm_agent_default.yaml`) defines the RL algorithm:
Key settings:
- Neural network architecture
- PPO hyperparameters
- Learning rates
- Normalization settings
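Again as a hedged sketch (key names hypothetical, values typical for PPO rather than taken from the repository), an agent config might contain:

```yaml
# Hypothetical fragment of dm_agent_default.yaml -- key names are illustrative.
model:
  actor_hidden: [1024, 512]    # network architecture
  critic_hidden: [1024, 512]
ppo:
  clip_ratio: 0.2              # PPO hyperparameters
  gamma: 0.99
  gae_lambda: 0.95
learning_rate: 3.0e-5
normalizer_samples: 100000     # set to 0 when resuming from a checkpoint
```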
Training Process
The training script (`parc_3_tracker.py`):
- Loads or creates the dataset configuration
- Modifies environment config with dataset path
- Saves modified configs to output directory
- Calls the tracker training with appropriate arguments
- Saves checkpoints and final model
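The config-preparation steps above can be sketched as follows. This is not the launcher's actual code: the function and key names are hypothetical, and JSON is used instead of the repository's YAML so the sketch stays dependency-free.

```python
import json
from pathlib import Path

def prepare_configs(env_config: dict, dataset_file: str, output_dir: str) -> Path:
    """Sketch of the launcher's setup: inject the dataset path into the
    environment config, then save the resolved config to the output
    directory before training starts. Names here are illustrative."""
    env_config = dict(env_config)               # don't mutate the caller's config
    env_config["dataset_file"] = dataset_file   # point env config at the dataset
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    cfg_path = out / "dm_env.json"              # the real repo saves YAML
    cfg_path.write_text(json.dumps(env_config, indent=2))
    return cfg_path

# The remaining steps (calling tracker training, saving checkpoints and the
# final model) would follow, using the saved config path.
path = prepare_configs({"episode_length": 300}, "data/motions.pkl", "output/tracker")
print(path.name)  # dm_env.json
```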
Output Structure
Understanding the Training Environment
The training environment (`parc/motion_tracker/envs/ig_parkour/dm_env.py`):
- Loads reference motions and terrains in parallel
- Samples random starting frames for each episode
- Computes rewards based on tracking error
- Handles early termination on tracking failure
- Arranges terrains in a grid to avoid numerical issues
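Tracking rewards in environments like this are commonly exponentiated negative errors (the DeepMimic-style formulation), so the reward is 1 for perfect tracking and decays smoothly as the simulated pose drifts from the reference. A minimal sketch, not the exact reward in `dm_env.py` (the sensitivity `sigma` and the set of error terms are illustrative):

```python
import math

def tracking_reward(sim_pos, ref_pos, sigma=0.25):
    """Exponentiated negative squared error between simulated and
    reference joint positions. Illustrative only: the environment's
    real reward combines several weighted terms like this one."""
    err = sum((s - r) ** 2 for s, r in zip(sim_pos, ref_pos))
    return math.exp(-err / (2.0 * sigma ** 2))

# Perfect tracking gives reward 1.0; reward falls toward 0 as error grows.
print(tracking_reward([0.0, 1.0], [0.0, 1.0]))  # 1.0
```

Early termination then triggers when such a term (or the raw error) crosses a threshold, which keeps the policy from accumulating useless low-reward samples.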
Continuing Training
To continue from a checkpoint, set `normalizer_samples: 0` in your agent config, as the normalization statistics are already computed.
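The effect of `normalizer_samples` can be shown with a toy running normalizer (this is an illustration of the idea, not the repository's implementation): statistics update only for the first `normalizer_samples` observations and are frozen afterwards, so resuming with 0 preserves the statistics restored from the checkpoint.

```python
class ObsNormalizer:
    """Toy observation normalizer: updates a running mean only while a
    sample budget remains, then freezes. Illustrative sketch only."""
    def __init__(self, normalizer_samples):
        self.budget = normalizer_samples
        self.count, self.mean = 0, 0.0
    def update(self, x):
        if self.count < self.budget:        # stop adapting once budget is spent
            self.count += 1
            self.mean += (x - self.mean) / self.count
    def normalize(self, x):
        return x - self.mean

norm = ObsNormalizer(normalizer_samples=0)  # resuming: never overwrite stats
norm.mean = 5.0                             # pretend this was restored
norm.update(100.0)                          # ignored, budget is 0
print(norm.normalize(6.0))  # 1.0
```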
Important Files
Implementation details:

- `scripts/parc_3_tracker.py` - Training launcher script
- `parc/motion_tracker/run_tracker.py` - Main training loop
- `parc/motion_tracker/envs/ig_parkour/dm_env.py` - Tracking environment
- `parc/motion_tracker/envs/ig_parkour/ig_parkour_env.py` - Parent environment wrapper
Troubleshooting
Out of Memory
Symptoms: CUDA out of memory errors

Solutions:

- Reduce `num_envs` (try 1024 or 512)
- Reduce network size in agent config
- Reduce observation history length
- Use a GPU with more memory
Training Not Converging
Symptoms: Reward not improving, high termination rate

Solutions:

- Check reward weights in environment config
- Reduce `learning_rate` in agent config
- Increase episode length for more exploration
- Verify reference motions are physically plausible
- Start with simpler terrains before complex ones
Checkpoints Not Saving
Symptoms: No files in `checkpoints/` directory

Solutions:

- Check `output_dir` path is writable
- Verify checkpoint saving frequency in agent config
- Ensure training runs long enough to trigger first checkpoint
Agent Falls Immediately
Symptoms: Very short episodes, low reward from start

Solutions:

- Check initial pose matches reference motion
- Verify terrain is loaded correctly
- Increase exploration early in training
- Check PD controller gains in environment config
Cannot Load Dataset
Symptoms: Error opening dataset file

Solutions:

- Verify `dataset_file` path is correct
- If using `create_dataset_config`, ensure that config is valid
- Check that motion `.pkl` files exist and are accessible
- Use absolute paths instead of relative paths
Advanced: Multi-Iteration Training
In the PARC loop, each iteration:

- Trains a new generator on the expanded dataset
- Generates new motions
- Trains tracker on new + old motions
- Records physics-based motions
- Adds recorded motions to dataset
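The data flow of one iteration can be sketched as a function over pluggable stages. The stage callables below are placeholders for the repository's actual generator/tracker/recording steps; only the ordering and dataset growth are taken from the description above.

```python
def parc_iteration(dataset, train_generator, generate_motions,
                   train_tracker, record_physics):
    """One PARC iteration: the five stages listed above, with each
    stage passed in as a callable. Illustrative data-flow sketch only."""
    generator = train_generator(dataset)             # 1. train generator
    new_motions = generate_motions(generator)        # 2. generate new motions
    tracker = train_tracker(dataset + new_motions)   # 3. track new + old motions
    recorded = record_physics(tracker, new_motions)  # 4. record physics-based motions
    return dataset + recorded                        # 5. expand the dataset

# Toy run with stub stages, showing the dataset growing by the recorded clips:
data = ["clip0"]
data = parc_iteration(
    data,
    train_generator=lambda d: "gen",
    generate_motions=lambda g: ["clip1"],
    train_tracker=lambda d: "tracker",
    record_physics=lambda t, m: ["clip1_physics"],
)
print(data)  # ['clip0', 'clip1_physics']
```

Because only physics-validated (recorded) motions re-enter the dataset, each iteration expands the data without accumulating physically implausible clips.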
Performance Tips
- GPU memory: More `num_envs` = faster training but more memory
- CPU cores: Isaac Gym uses CPU for environment stepping
- Checkpoint frequency: Save every 10K-50K samples to balance disk usage and recovery
- Terrain complexity: Start simple, gradually increase difficulty