The first step in the PARC pipeline is training a motion diffusion model (MDM) that generates kinematic motions conditioned on heightmaps and target directions.

Overview

The motion generator is a transformer-based diffusion model that learns to generate character motions based on:
  • Local heightmaps: Terrain geometry around the character
  • Target directions: Where the character should move
  • Previous states: For autoregressive generation
  • Contact labels: Ground contact information
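For concreteness, the four conditioning inputs could be grouped as follows (a hypothetical sketch; the names and shapes are assumptions, not PARC's actual API):

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class MDMCondition:
    """Hypothetical grouping of the MDM conditioning inputs listed above."""
    local_heightmap: List[List[float]]  # terrain heights on a grid around the character
    target_dir: Tuple[float, float]     # desired XY movement direction
    prev_states: List[List[float]]      # recent frames for autoregressive generation
    contact_labels: List[int]           # per-body ground-contact flags
```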

Quick Start

1. Prepare your configuration

Create a YAML config file or use the default:
python scripts/parc_1_train_gen.py --config data/configs/parc_1_train_gen_default.yaml
If no config is provided, it defaults to data/configs/parc_1_train_gen.yaml.
2. Run training

python scripts/parc_1_train_gen.py --config path/to/your/config.yaml
The script will:
  • Create or load a motion sampler (cached as .pkl for faster loading)
  • Initialize the MDM model
  • Train using wandb for tracking (if enabled)
  • Save checkpoints periodically
  • Save the final model
3. Monitor training

If use_wandb: True in your config, view training progress at wandb.ai under project “train-mdm”.

Configuration Guide

Core Training Parameters

use_wandb: True              # Enable Weights & Biases logging
device: "cuda:0"             # GPU device
epochs: 100000               # Total training epochs
epochs_per_checkpoint: 2500  # Save checkpoint every N epochs
batch_size: 64               # Training batch size
iters_per_epoch: 50          # Iterations per epoch
lr: 0.00001                  # Learning rate
weight_decay: 0.01           # Weight decay for optimizer
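With these defaults, the overall training schedule works out as follows (a quick sanity check, assuming one checkpoint every epochs_per_checkpoint epochs):

```python
epochs = 100_000
iters_per_epoch = 50
epochs_per_checkpoint = 2_500

# total gradient updates over the whole run
total_steps = epochs * iters_per_epoch          # 5,000,000
# number of periodic checkpoints written
num_checkpoints = epochs // epochs_per_checkpoint  # 40
```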

Model Architecture

# Transformer configuration
d_model: 256        # Model dimension
num_heads: 8        # Attention heads
d_hid: 256          # Hidden dimension
num_layers: 4       # Number of transformer layers

# MLP layers
target_mlp_layers: [512]
in_mlp_layers: [256]
out_mlp_layers: [256]

Diffusion Settings

diffusion_timesteps: 21      # Number of diffusion steps
test_ddim_stride: 50         # DDIM stride for testing
test_mode: "MODE_DDIM"       # Sampling mode
predict_mode: "PREDICT_X0"   # Prediction mode
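PREDICT_X0 means the network outputs the clean sample x0 directly rather than the added noise. In the standard DDPM parameterization (a sketch of the general relation, not necessarily PARC's exact code), the two predictions are interchangeable:

```python
import math

def eps_from_x0(x_t, x0, alpha_bar_t):
    # Standard DDPM forward process: x_t = sqrt(a_bar)*x0 + sqrt(1 - a_bar)*eps,
    # so a predicted x0 implies a predicted noise eps (and vice versa).
    return (x_t - math.sqrt(alpha_bar_t) * x0) / math.sqrt(1.0 - alpha_bar_t)

# Tiny scalar sanity check: noise a sample forward, then recover eps.
x0, eps, a_bar = 2.0, 0.5, 0.9
x_t = math.sqrt(a_bar) * x0 + math.sqrt(1.0 - a_bar) * eps
recovered = eps_from_x0(x_t, x0, a_bar)  # ≈ 0.5
```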

Dropout Rates

dropout: 0.1                              # General dropout
obs_dropout: 0.025                        # Observation dropout
target_dropout: 0.025                     # Target dropout
feature_vector_attention_dropout: 0.05    # Feature attention dropout
prev_state_attention_dropout: 0.15        # Previous state dropout

Motion Data Configuration

motion_lib_file: "path/to/motions.yaml"   # Dataset of motions
char_file: "data/assets/humanoid.xml"     # Character model
sequence_duration: 0.5                     # Window duration (seconds)
sequence_fps: 30                           # Frames per second

# Frame components to use
features:
  frame_components: ["ROOT_POS", "ROOT_ROT", "JOINT_POS", "JOINT_ROT", "CONTACTS"]
  rot_type: "DEFAULT"
  canonicalize_samples: True
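The window length in frames follows directly from these two values (a quick check; whether the sampler counts an extra endpoint frame depends on the implementation):

```python
sequence_duration = 0.5   # seconds per training window
sequence_fps = 30         # frames per second

# frame intervals covered by one training window
frames_per_window = int(sequence_duration * sequence_fps)  # 15
```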

Heightmap Configuration

use_heightmap_obs: True
use_hf_augmentation: True

heightmap:
  horizontal_scale: 0.2
  local_grid:
    num_x_neg: 10
    num_x_pos: 20
    num_y_neg: 15
    num_y_pos: 15
  max_h: 3.0

cnn:
  net_name: "cnn_31xy_4layer_c64_out64"
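The local_grid counts appear to determine the CNN input resolution: negative side + origin + positive side per axis. Assuming the grid includes the origin sample, the defaults above yield a 31x31 grid, which matches the "31xy" in the cnn net_name. A quick check of the grid size and its physical extent:

```python
grid = {"num_x_neg": 10, "num_x_pos": 20, "num_y_neg": 15, "num_y_pos": 15}
horizontal_scale = 0.2  # meters between grid points

# grid points per axis: negative side + origin + positive side
nx = grid["num_x_neg"] + 1 + grid["num_x_pos"]   # 31
ny = grid["num_y_neg"] + 1 + grid["num_y_pos"]   # 31

# physical extent of the grid around the character, in meters
x_extent = (-grid["num_x_neg"] * horizontal_scale,
            grid["num_x_pos"] * horizontal_scale)   # (-2.0, 4.0): biased forward
y_extent = (-grid["num_y_neg"] * horizontal_scale,
            grid["num_y_pos"] * horizontal_scale)   # (-3.0, 3.0): symmetric
```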

Target Direction Settings

use_target_obs: True
target_type: "XY_DIR"               # XY direction target
target_dir_len_eps: 0.1             # Length epsilon
target_dir_heading_eps: 0.5         # Heading epsilon
future_pos_noise_scale: 0.05        # Noise for augmentation
future_window_min: 0.4              # Min future window (seconds)
future_window_max: 1.5              # Max future window (seconds)
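A hypothetical sketch of how an XY_DIR target might be formed from a sampled future root position (the function name and the treatment of short displacements are assumptions, with target_dir_len_eps read as the minimum meaningful displacement):

```python
import math

def xy_target_dir(cur_pos, future_pos, len_eps=0.1):
    """Unit XY direction toward a future root position.
    Displacements shorter than len_eps are treated as 'stay in place'."""
    dx = future_pos[0] - cur_pos[0]
    dy = future_pos[1] - cur_pos[1]
    dist = math.hypot(dx, dy)
    if dist < len_eps:
        return (0.0, 0.0)
    return (dx / dist, dy / dist)

xy_target_dir((0.0, 0.0), (3.0, 4.0))   # (0.6, 0.8)
xy_target_dir((0.0, 0.0), (0.01, 0.0))  # (0.0, 0.0): below len_eps
```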

Loss Weights

train_loss_fn: "squared_l2"
test_loss_fn: "squared_l2"

# Geometric loss weights
w_simple_root_pos: 1.0
w_simple_root_rot: 1.0
w_simple_joint_rot: 0.5
w_simple_contacts: 1.0
w_simple_body_pos: 0.5
w_body_pos_consistency: 1.0
w_vel_root_pos: 0.5
w_vel_root_rot: 0.02
w_vel_joint_rot: 0.01
w_body_pos: 0.5
w_body_rot: 0.1
w_body_vel: 0.03
w_target: 0.02
w_hf: 15.0
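A minimal sketch of how per-term losses are typically combined with these w_* weights (the actual PARC loss code may differ):

```python
def total_loss(losses, weights):
    """Scale each per-term loss by its w_* weight and sum."""
    return sum(weights[name] * value for name, value in losses.items())

# toy values for three of the terms above
weights = {"simple_root_pos": 1.0, "simple_contacts": 1.0, "hf": 15.0}
losses = {"simple_root_pos": 0.2, "simple_contacts": 0.1, "hf": 0.01}
total_loss(losses, weights)  # 0.2*1.0 + 0.1*1.0 + 0.01*15.0 ≈ 0.45
```

Note the large w_hf: the heightfield penalty (e.g. the motion penetrating terrain) is weighted far above the other terms.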

Data Augmentation

# Heightfield augmentation
hf_augmentation_mode: "MAXPOOL_AND_BOXES"
max_num_boxes: 4
box_min_len: 2
box_max_len: 12
hf_maxpool_chance: 0.15
hf_max_maxpool_size: 10
hf_change_height_chance: 0.1

# Noise augmentation
angle_noise_scale: 0.01
pos_noise_scale: 0.01
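A minimal sketch of the maxpool part of the heightfield augmentation, assuming it coarsens terrain by taking a local maximum (in training it would be applied with probability hf_maxpool_chance and a pool size up to hf_max_maxpool_size):

```python
def maxpool_hf(hf, pool):
    """Replace each cell with the max over the pool x pool window
    anchored at that cell (clipped at the grid edge)."""
    h, w = len(hf), len(hf[0])
    return [
        [
            max(hf[ii][jj]
                for ii in range(i, min(h, i + pool))
                for jj in range(j, min(w, j + pool)))
            for j in range(w)
        ]
        for i in range(h)
    ]

maxpool_hf([[0.0, 1.0], [2.0, 0.0]], 2)
# → [[2.0, 1.0], [2.0, 0.0]]
```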

Output Paths

sampler_save_filepath: "output/parc/sampler.pkl"
output_dir: "output/parc/train_gen/"

Important Files

  • parc/motion_generator/mdm.py - Main MDM class with heightmap and target conditions
  • parc/motion_generator/mdm_transformer.py - Transformer module implementation
  • parc/motion_generator/mdm_heightfield_contact_motion_sampler.py - Weighted dataset sampler

Continuing Training

To continue training from a checkpoint:
input_model_path: "path/to/checkpoint.ckpt"
The script will load the checkpoint and continue training with your current config settings.

Output Structure

After training, you’ll find:
output_dir/
├── checkpoints/
│   ├── model_2500.ckpt
│   ├── model_5000.ckpt
│   └── ...
└── final_model.ckpt

Troubleshooting

Sampler Loading is Slow

The first time you run training, creating the sampler can take several minutes. The sampler is saved to sampler_save_filepath and will be loaded much faster on subsequent runs.
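The caching behavior described above can be sketched as follows (function names are hypothetical):

```python
import os
import pickle

def load_or_build_sampler(cache_path, build_fn):
    """Build the sampler once, pickle it to disk, and reuse it on later runs."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)       # fast path on subsequent runs
    sampler = build_fn()                # slow: processes the whole motion dataset
    os.makedirs(os.path.dirname(cache_path) or ".", exist_ok=True)
    with open(cache_path, "wb") as f:
        pickle.dump(sampler, f)
    return sampler
```

If you change the motion dataset or sampler-related config, delete the cached .pkl so it is rebuilt.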

Out of Memory

Reduce these parameters:
  • batch_size - Try 32 or 16
  • num_heads - Try 4
  • d_model - Try 128

Training Not Improving

Check:
  • Loss weights - Ensure w_hf and other weights are balanced
  • Learning rate - Try 0.00005 or 0.000005
  • Data augmentation - May need adjustment for your dataset

Next Steps

Once training is complete, use the trained model for motion synthesis.
