🤗 PEFT (Parameter-Efficient Fine-Tuning) enables efficient adaptation of large pre-trained models by training only a small number of additional parameters. This is especially useful for fine-tuning large Vision-Language-Action (VLA) models like SmolVLA, π₀, and GR00T.

What is PEFT?

PEFT methods add trainable adapter modules to a frozen pre-trained model. Instead of updating all of the model's parameters (often billions), you train only a small set of adapter parameters (typically millions):
  • Full Fine-tuning: Update all 7B parameters
  • LoRA (rank=64): Update only ~100M adapter parameters (1.4% of total)
  • Result: Similar performance with much less compute and memory
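The savings can be estimated directly: for a linear layer with input size d_in and output size d_out, LoRA adds r * (d_in + d_out) trainable parameters. A back-of-the-envelope sketch (the 4096x4096 layer shape is illustrative, not SmolVLA's actual dimensions):

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    # LoRA replaces the (d_out x d_in) weight update with two
    # low-rank factors: A (r x d_in) and B (d_out x r).
    return r * (d_in + d_out)

# Illustrative: one 4096x4096 attention projection
full = 4096 * 4096                     # 16,777,216 params updated by full fine-tuning
lora = lora_params(4096, 4096, r=64)   # 524,288 adapter params
print(f"LoRA trains {lora / full:.1%} of this layer's parameters")  # → 3.1%
```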

Installation

Install LeRobot with PEFT support:
pip install lerobot[peft]
Or install PEFT separately:
pip install peft

Quick Start

Fine-tune SmolVLA with LoRA:
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --policy.repo_id=your_username/smolvla_pickplace \
  --dataset.repo_id=lerobot/aloha_sim_insertion_human \
  --env.type=aloha \
  --env.task=AlohaInsertion-v0 \
  --steps=50000 \
  --batch_size=32 \
  --peft.method_type=LORA \
  --peft.r=64 \
  --policy.optimizer_lr=1e-3 \
  --policy.scheduler_decay_lr=1e-4
Key differences from full fine-tuning:
  • --policy.path: Load pre-trained model
  • --peft.method_type=LORA: Use LoRA adapters
  • --peft.r=64: LoRA rank (higher = more parameters)
  • Higher learning rate (1e-3 vs 1e-4 for full fine-tuning)

Supported Methods

LoRA (Low-Rank Adaptation)

LoRA is the most popular PEFT method. It adds low-rank matrices to attention layers:
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=LORA \
  --peft.r=64 \
  --peft.lora_alpha=16 \
  --peft.lora_dropout=0.1
Parameters:
  • r: Rank of adapter matrices (higher = more capacity)
    • r=8: Very lightweight (~25M params)
    • r=32: Balanced (~50M params)
    • r=64: High capacity (~100M params)
  • lora_alpha: Scaling factor (typically r/2 or r/4)
  • lora_dropout: Dropout rate for adapters
When to use: General purpose fine-tuning, good balance of efficiency and performance.
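How these parameters interact can be sketched in a few lines of NumPy (a toy illustration of the LoRA forward pass, not LeRobot's implementation): the adapter output is scaled by lora_alpha / r before being added to the frozen layer's output.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, lora_alpha = 16, 16, 4, 2

W = rng.normal(size=(d_out, d_in))      # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-initialized
                                        # so the adapter starts as a no-op

def lora_forward(x):
    # y = Wx + (alpha / r) * B(Ax)
    return W @ x + (lora_alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B zero-initialized, the adapted layer matches the frozen one exactly.
assert np.allclose(lora_forward(x), W @ x)
```

Once B receives gradient updates, the low-rank product B @ A acts as a learned correction to W, with lora_dropout applied to the adapter path during training.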

IA³ (Infused Adapter by Inhibiting and Amplifying Inner Activations)

IA³ uses even fewer parameters by learning scaling factors:
lerobot-train \
  --policy.path=lerobot/pi0_base \
  --peft.method_type=IA3
When to use: When you have very limited compute or want the smallest possible adapter.
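Conceptually, IA³ rescales activations element-wise with learned vectors instead of adding low-rank matrices, which is why its adapters are so small. A toy sketch (not LeRobot's implementation):

```python
import numpy as np

d = 8
k = np.ones(d)      # a key activation from the frozen model
l_k = np.ones(d)    # learned IA³ scaling vector, initialized to ones
                    # so the adapter starts as a no-op
scaled = l_k * k    # IA³: element-wise rescaling, only d trainable params

# Compare trainable parameter counts for one d-dimensional projection:
ia3_params = d               # one scaling vector
lora_r4_params = 2 * 4 * d   # LoRA with r=4: A (r x d) + B (d x r)
assert ia3_params < lora_r4_params
```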

AdaLoRA (Adaptive LoRA)

Adaptively allocates rank across different layers:
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=ADALORA \
  --peft.target_r=8 \
  --peft.init_r=12
When to use: When you want to automatically find the optimal rank distribution.

Targeting Modules

Default Targets

By default, LoRA targets attention projection layers and task-specific heads:
# For SmolVLA
default_targets = [
    "q_proj",  # Query projection
    "v_proj",  # Value projection  
    "state_proj",  # State encoder
    "action_in_proj",  # Action encoder
    "action_out_proj",  # Action decoder
]

Custom Targets

Specify custom modules to adapt:
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=LORA \
  --peft.r=64 \
  --peft.target_modules='["q_proj","v_proj","k_proj","o_proj"]'

Using Regex

Target modules with regex patterns:
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=LORA \
  --peft.target_modules='(model\.vlm_with_expert\.lm_expert\..*\.(down|gate|up)_proj|.*\.(state_proj|action_in_proj|action_out_proj))'
This targets:
  • All MLP layers in the language model expert
  • State and action projection layers
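You can sanity-check a pattern against module names before launching a run (PEFT matches a string target_modules as a regex against the full module name; the names below are illustrative — print your own model's names as shown in the next section):

```python
import re

pattern = re.compile(
    r"(model\.vlm_with_expert\.lm_expert\..*\.(down|gate|up)_proj"
    r"|.*\.(state_proj|action_in_proj|action_out_proj))"
)

# Hypothetical module names for illustration
names = [
    "model.vlm_with_expert.lm_expert.layers.0.mlp.gate_proj",
    "model.vlm_with_expert.lm_expert.layers.0.self_attn.q_proj",
    "model.state_proj",
]

matched = [n for n in names if pattern.fullmatch(n)]
print(matched)  # the gate_proj and state_proj entries match; q_proj does not
```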

Finding Module Names

Print model architecture to find module names:
from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

policy = SmolVLAPolicy.from_pretrained("lerobot/smolvla_base")

# Print all module names
for name, module in policy.named_modules():
    print(name)

Full Fine-tuning Specific Modules

For some modules, you may want full fine-tuning instead of adapters:
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=LORA \
  --peft.r=64 \
  --peft.target_modules='["q_proj","v_proj"]' \
  --peft.full_training_modules='["state_proj","action_out_proj"]'
This:
  • Adds LoRA adapters to attention layers
  • Fully fine-tunes state and action projections

Fine-tuning SmolVLA

Complete example for fine-tuning SmolVLA on a manipulation task:
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --policy.repo_id=your_username/smolvla_libero_spatial \
  --dataset.repo_id=HuggingFaceVLA/libero \
  --policy.output_features=null \
  --policy.input_features=null \
  --policy.optimizer_lr=1e-3 \
  --policy.scheduler_decay_lr=1e-4 \
  --env.type=libero \
  --env.task=libero_spatial \
  --steps=100000 \
  --batch_size=32 \
  --peft.method_type=LORA \
  --peft.r=64 \
  --peft.lora_alpha=16 \
  --peft.lora_dropout=0.1 \
  --eval_freq=10000 \
  --save_freq=10000 \
  --log_freq=100
Key settings:
  • output_features=null, input_features=null: Auto-infer from dataset
  • Learning rate 10x higher than full fine-tuning
  • Batch size 32 (adjust based on GPU memory)
  • Evaluate every 10k steps

Fine-tuning π₀

Fine-tune Physical Intelligence’s π₀ policy:
lerobot-train \
  --policy.path=lerobot/pi0_base \
  --policy.repo_id=your_username/pi0_aloha_insertion \
  --dataset.repo_id=lerobot/aloha_sim_insertion_human \
  --env.type=aloha \
  --env.task=AlohaInsertion-v0 \
  --steps=50000 \
  --batch_size=16 \
  --peft.method_type=LORA \
  --peft.r=32 \
  --policy.optimizer_lr=5e-4

Memory and Speed Benefits

Memory Usage

PEFT drastically reduces memory requirements:
| Method           | Trainable Params | Memory (fp16) | Speedup |
|------------------|------------------|---------------|---------|
| Full Fine-tuning | 7B               | ~28 GB        | 1.0x    |
| LoRA (r=64)      | 100M             | ~16 GB        | 1.8x    |
| LoRA (r=32)      | 50M              | ~14 GB        | 2.0x    |
| LoRA (r=8)       | 25M              | ~12 GB        | 2.2x    |

Training Speed

PEFT training is faster because:
  • Fewer gradients to compute
  • Less memory movement
  • Faster optimizer updates
Typical speedup: 1.5-2x compared to full fine-tuning.

Hyperparameter Tuning

Learning Rate

PEFT typically uses higher learning rates:
# Full fine-tuning
--policy.optimizer_lr=1e-4

# LoRA fine-tuning
--policy.optimizer_lr=1e-3  # 10x higher
Start with 5-10x the full fine-tuning learning rate.

LoRA Rank

Balance between capacity and efficiency:
# Lightweight (good for small datasets)
--peft.r=8

# Balanced (recommended default)
--peft.r=32

# High capacity (for complex tasks)
--peft.r=64

# Very high capacity (approaching full fine-tuning)
--peft.r=128

LoRA Alpha

Scaling factor for adapter outputs:
# Conservative (less adapter influence)
--peft.lora_alpha=8

# Balanced (recommended: r/2)
--peft.r=64 --peft.lora_alpha=32

# Aggressive (more adapter influence)
--peft.lora_alpha=64
Rule of thumb: Set lora_alpha = r/2 or r/4.
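The two hyperparameters interact through a single effective scale, lora_alpha / r, applied to the adapter output, so keeping that ratio fixed while changing r keeps the adapter's influence comparable:

```python
configs = [(64, 16), (64, 32), (32, 16), (8, 4)]  # (r, lora_alpha) pairs
for r, alpha in configs:
    # alpha / r is the multiplier on the adapter output
    print(f"r={r:<3} alpha={alpha:<3} effective scale = {alpha / r}")
```

Doubling r without adjusting lora_alpha halves the effective scale, which can mask the extra capacity; raise lora_alpha alongside r to keep the ratio constant.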

Loading PEFT Models

Load fine-tuned PEFT models:
from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

# Load base model with adapters
policy = SmolVLAPolicy.from_pretrained(
    "your_username/smolvla_finetuned",
    use_peft=True
)

policy.eval()
action = policy.select_action(observation)
PEFT adapters are stored alongside the base model weights.

Merging Adapters

Merge adapters into base model for deployment:
from peft import PeftModel
from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

# Load model with adapter
policy = SmolVLAPolicy.from_pretrained(
    "lerobot/smolvla_base",
    use_peft=True
)
policy = PeftModel.from_pretrained(policy, "your_username/adapter")

# Merge adapter weights into base model
policy = policy.merge_and_unload()

# Save merged model
policy.save_pretrained("merged_model")
Merged models:
  • Load faster (no adapter overhead)
  • Use slightly less memory
  • Cannot be “un-merged”

Multi-GPU PEFT Training

Scale PEFT training across GPUs:
accelerate launch --num_processes=4 \
  -m lerobot.scripts.lerobot_train \
  --policy.path=lerobot/smolvla_base \
  --dataset.repo_id=HuggingFaceVLA/libero \
  --peft.method_type=LORA \
  --peft.r=64 \
  --batch_size=32 \
  --steps=100000
PEFT is very efficient for multi-GPU training due to reduced memory and communication overhead.

Best Practices

1. Start with default settings. Use the recommended defaults:
3
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=LORA \
  --peft.r=64 \
  --peft.lora_alpha=32 \
  --policy.optimizer_lr=1e-3
2. Use higher learning rates. PEFT converges faster with higher learning rates:
6
# Full fine-tuning: 1e-4
# PEFT: 5e-4 to 1e-3
--policy.optimizer_lr=1e-3
3. Monitor validation loss. PEFT can overfit more easily:
9
lerobot-train \
  --peft.method_type=LORA \
  --dataset.train_fraction=0.9 \
  --eval_freq=5000
4. Start with a smaller rank. Begin with r=32 and increase if needed:
12
# Try r=32 first
--peft.r=32

# If underfitting, increase
--peft.r=64
5. Match training data scale to rank. Smaller datasets need smaller ranks:
  • < 100 episodes: r=8-16
  • 100-500 episodes: r=32
  • 500+ episodes: r=64
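These heuristics can be encoded as a starting-point helper (thresholds taken from the bullets above; treat them as rules of thumb, not hard limits):

```python
def suggest_lora_rank(num_episodes: int) -> int:
    # Rule-of-thumb mapping from dataset size to LoRA rank
    if num_episodes < 100:
        return 16
    if num_episodes < 500:
        return 32
    return 64

print(suggest_lora_rank(50), suggest_lora_rank(250), suggest_lora_rank(1000))
# → 16 32 64
```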

Troubleshooting

Model not learning

Increase the learning rate or rank:
--policy.optimizer_lr=5e-3 --peft.r=128

Out of memory

Reduce the rank or batch size:
--peft.r=16 --batch_size=16

Overfitting

Reduce the rank or add dropout:
--peft.r=32 --peft.lora_dropout=0.2
