🤗 PEFT (Parameter-Efficient Fine-Tuning) enables efficient adaptation of large pre-trained models by training only a small number of additional parameters. This is especially useful for fine-tuning large Vision-Language-Action (VLA) models like SmolVLA, π₀, and GR00T.

What is PEFT?

PEFT methods add trainable adapter modules to a frozen pre-trained model. Instead of updating all of the model's parameters (often billions), you train only a small set of adapter parameters (typically millions):
  • Full Fine-tuning: Update all 7B parameters
  • LoRA (rank=64): Update only ~100M adapter parameters (1.4% of total)
  • Result: Similar performance with much less compute and memory
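The savings can be estimated directly: for a linear layer with input size d_in and output size d_out, LoRA adds r * (d_in + d_out) trainable parameters. A back-of-the-envelope sketch (the 4096x4096 layer shape is illustrative, not SmolVLA's actual dimensions):

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    # LoRA replaces the (d_out x d_in) weight update with two
    # low-rank factors: A (r x d_in) and B (d_out x r).
    return r * (d_in + d_out)

# Illustrative: one 4096x4096 attention projection
full = 4096 * 4096                     # 16,777,216 params updated by full fine-tuning
lora = lora_params(4096, 4096, r=64)   # 524,288 adapter params
print(f"LoRA trains {lora / full:.1%} of this layer's parameters")  # → 3.1%
```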

Installation

Install LeRobot with PEFT support:
pip install lerobot[peft]
Or install PEFT separately:
pip install peft

Quick Start

Fine-tune SmolVLA with LoRA:
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --policy.repo_id=your_username/smolvla_pickplace \
  --dataset.repo_id=lerobot/aloha_sim_insertion_human \
  --env.type=aloha \
  --env.task=AlohaInsertion-v0 \
  --steps=50000 \
  --batch_size=32 \
  --peft.method_type=LORA \
  --peft.r=64 \
  --policy.optimizer_lr=1e-3 \
  --policy.scheduler_decay_lr=1e-4
Key differences from full fine-tuning:
  • --policy.path: Load pre-trained model
  • --peft.method_type=LORA: Use LoRA adapters
  • --peft.r=64: LoRA rank (higher = more parameters)
  • Higher learning rate (1e-3 vs 1e-4 for full fine-tuning)

Supported Methods

LoRA (Low-Rank Adaptation)

LoRA is the most popular PEFT method. It adds low-rank matrices to attention layers:
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=LORA \
  --peft.r=64 \
  --peft.lora_alpha=16 \
  --peft.lora_dropout=0.1
Parameters:
  • r: Rank of adapter matrices (higher = more capacity)
    • r=8: Very lightweight (~25M params)
    • r=32: Balanced (~50M params)
    • r=64: High capacity (~100M params)
  • lora_alpha: Scaling factor (typically r/2 or r/4)
  • lora_dropout: Dropout rate for adapters
When to use: General purpose fine-tuning, good balance of efficiency and performance.
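How these parameters interact can be sketched in a few lines of NumPy (a toy illustration of the LoRA forward pass, not LeRobot's implementation): the adapter output is scaled by lora_alpha / r before being added to the frozen layer's output.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, lora_alpha = 16, 16, 4, 2

W = rng.normal(size=(d_out, d_in))      # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-initialized
                                        # so the adapter starts as a no-op

def lora_forward(x):
    # y = Wx + (alpha / r) * B(Ax)
    return W @ x + (lora_alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B zero-initialized, the adapted layer matches the frozen one exactly.
assert np.allclose(lora_forward(x), W @ x)
```

Once B receives gradient updates, the low-rank product B @ A acts as a learned correction to W, with lora_dropout applied to the adapter path during training.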

IA³ (Infused Adapter by Inhibiting and Amplifying Inner Activations)

IA³ uses even fewer parameters by learning scaling factors:
lerobot-train \
  --policy.path=lerobot/pi0_base \
  --peft.method_type=IA3
When to use: When you have very limited compute or want the smallest possible adapter.
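Conceptually, IA³ rescales activations element-wise with learned vectors instead of adding low-rank matrices, which is why its adapters are so small. A toy sketch (not LeRobot's implementation):

```python
import numpy as np

d = 8
k = np.ones(d)      # a key activation from the frozen model
l_k = np.ones(d)    # learned IA³ scaling vector, initialized to ones
                    # so the adapter starts as a no-op
scaled = l_k * k    # IA³: element-wise rescaling, only d trainable params

# Compare trainable parameter counts for one d-dimensional projection:
ia3_params = d               # one scaling vector
lora_r4_params = 2 * 4 * d   # LoRA with r=4: A (r x d) + B (d x r)
assert ia3_params < lora_r4_params
```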

AdaLoRA (Adaptive LoRA)

Adaptively allocates rank across different layers:
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=ADALORA \
  --peft.target_r=8 \
  --peft.init_r=12
When to use: When you want to automatically find the optimal rank distribution.

Targeting Modules

Default Targets

By default, LoRA targets attention projection layers and task-specific heads:
# For SmolVLA
default_targets = [
    "q_proj",  # Query projection
    "v_proj",  # Value projection  
    "state_proj",  # State encoder
    "action_in_proj",  # Action encoder
    "action_out_proj",  # Action decoder
]

Custom Targets

Specify custom modules to adapt:
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=LORA \
  --peft.r=64 \
  --peft.target_modules='["q_proj","v_proj","k_proj","o_proj"]'

Using Regex

Target modules with regex patterns:
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=LORA \
  --peft.target_modules='(model\.vlm_with_expert\.lm_expert\..*\.(down|gate|up)_proj|.*\.(state_proj|action_in_proj|action_out_proj))'
This targets:
  • All MLP layers in the language model expert
  • State and action projection layers
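You can sanity-check a pattern against module names before launching a run (PEFT matches a string target_modules as a regex against the full module name; the names below are illustrative — print your own model's names as shown in the next section):

```python
import re

pattern = re.compile(
    r"(model\.vlm_with_expert\.lm_expert\..*\.(down|gate|up)_proj"
    r"|.*\.(state_proj|action_in_proj|action_out_proj))"
)

# Hypothetical module names for illustration
names = [
    "model.vlm_with_expert.lm_expert.layers.0.mlp.gate_proj",
    "model.vlm_with_expert.lm_expert.layers.0.self_attn.q_proj",
    "model.state_proj",
]

matched = [n for n in names if pattern.fullmatch(n)]
print(matched)  # the gate_proj and state_proj entries match; q_proj does not
```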

Finding Module Names

Print model architecture to find module names:
from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

policy = SmolVLAPolicy.from_pretrained("lerobot/smolvla_base")

# Print all module names
for name, module in policy.named_modules():
    print(name)

Full Fine-tuning Specific Modules

For some modules, you may want full fine-tuning instead of adapters:
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=LORA \
  --peft.r=64 \
  --peft.target_modules='["q_proj","v_proj"]' \
  --peft.full_training_modules='["state_proj","action_out_proj"]'
This:
  • Adds LoRA adapters to attention layers
  • Fully fine-tunes state and action projections

Fine-tuning SmolVLA

Complete example for fine-tuning SmolVLA on a manipulation task:
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --policy.repo_id=your_username/smolvla_libero_spatial \
  --dataset.repo_id=HuggingFaceVLA/libero \
  --policy.output_features=null \
  --policy.input_features=null \
  --policy.optimizer_lr=1e-3 \
  --policy.scheduler_decay_lr=1e-4 \
  --env.type=libero \
  --env.task=libero_spatial \
  --steps=100000 \
  --batch_size=32 \
  --peft.method_type=LORA \
  --peft.r=64 \
  --peft.lora_alpha=16 \
  --peft.lora_dropout=0.1 \
  --eval_freq=10000 \
  --save_freq=10000 \
  --log_freq=100
Key settings:
  • output_features=null, input_features=null: Auto-infer from dataset
  • Learning rate 10x higher than full fine-tuning
  • Batch size 32 (adjust based on GPU memory)
  • Evaluate every 10k steps

Fine-tuning π₀

Fine-tune Physical Intelligence’s π₀ policy:
lerobot-train \
  --policy.path=lerobot/pi0_base \
  --policy.repo_id=your_username/pi0_aloha_insertion \
  --dataset.repo_id=lerobot/aloha_sim_insertion_human \
  --env.type=aloha \
  --env.task=AlohaInsertion-v0 \
  --steps=50000 \
  --batch_size=16 \
  --peft.method_type=LORA \
  --peft.r=32 \
  --policy.optimizer_lr=5e-4

Memory and Speed Benefits

Memory Usage

PEFT drastically reduces memory requirements:
| Method           | Trainable Params | Memory (fp16) | Speedup |
|------------------|------------------|---------------|---------|
| Full Fine-tuning | 7B               | ~28 GB        | 1.0x    |
| LoRA (r=64)      | 100M             | ~16 GB        | 1.8x    |
| LoRA (r=32)      | 50M              | ~14 GB        | 2.0x    |
| LoRA (r=8)       | 25M              | ~12 GB        | 2.2x    |

Training Speed

PEFT training is faster because:
  • Fewer gradients to compute
  • Less memory movement
  • Faster optimizer updates
Typical speedup: 1.5-2x compared to full fine-tuning.

Hyperparameter Tuning

Learning Rate

PEFT typically uses higher learning rates:
# Full fine-tuning
--policy.optimizer_lr=1e-4

# LoRA fine-tuning
--policy.optimizer_lr=1e-3  # 10x higher
Start with 5-10x the full fine-tuning learning rate.

LoRA Rank

Balance between capacity and efficiency:
# Lightweight (good for small datasets)
--peft.r=8

# Balanced (recommended default)
--peft.r=32

# High capacity (for complex tasks)
--peft.r=64

# Very high capacity (approaching full fine-tuning)
--peft.r=128

LoRA Alpha

Scaling factor for adapter outputs:
# Conservative (less adapter influence)
--peft.lora_alpha=8

# Balanced (recommended: r/2)
--peft.r=64 --peft.lora_alpha=32

# Aggressive (more adapter influence)
--peft.lora_alpha=64
Rule of thumb: Set lora_alpha = r/2 or r/4.
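The two hyperparameters interact through a single effective scale, lora_alpha / r, applied to the adapter output, so keeping that ratio fixed while changing r keeps the adapter's influence comparable:

```python
configs = [(64, 16), (64, 32), (32, 16), (8, 4)]  # (r, lora_alpha) pairs
for r, alpha in configs:
    # alpha / r is the multiplier on the adapter output
    print(f"r={r:<3} alpha={alpha:<3} effective scale = {alpha / r}")
```

Doubling r without adjusting lora_alpha halves the effective scale, which can mask the extra capacity; raise lora_alpha alongside r to keep the ratio constant.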

Loading PEFT Models

Load fine-tuned PEFT models:
from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

# Load base model with adapters
policy = SmolVLAPolicy.from_pretrained(
    "your_username/smolvla_finetuned",
    use_peft=True
)

policy.eval()
action = policy.select_action(observation)
PEFT adapters are stored alongside the base model weights.

Merging Adapters

Merge adapters into base model for deployment:
from peft import PeftModel
from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

# Load model with adapter
policy = SmolVLAPolicy.from_pretrained(
    "lerobot/smolvla_base",
    use_peft=True
)
policy = PeftModel.from_pretrained(policy, "your_username/adapter")

# Merge adapter weights into base model
policy = policy.merge_and_unload()

# Save merged model
policy.save_pretrained("merged_model")
Merged models:
  • Load faster (no adapter overhead)
  • Use slightly less memory
  • Cannot be “un-merged”

Multi-GPU PEFT Training

Scale PEFT training across GPUs:
accelerate launch --num_processes=4 \
  -m lerobot.scripts.lerobot_train \
  --policy.path=lerobot/smolvla_base \
  --dataset.repo_id=HuggingFaceVLA/libero \
  --peft.method_type=LORA \
  --peft.r=64 \
  --batch_size=32 \
  --steps=100000
PEFT is very efficient for multi-GPU training due to reduced memory and communication overhead.

Best Practices

1. Start with default settings. Use the recommended defaults:
3
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=LORA \
  --peft.r=64 \
  --peft.lora_alpha=32 \
  --policy.optimizer_lr=1e-3
2. Use higher learning rates. PEFT converges faster with higher learning rates:
6
# Full fine-tuning: 1e-4
# PEFT: 5e-4 to 1e-3
--policy.optimizer_lr=1e-3
3. Monitor validation loss. PEFT can overfit more easily:
9
lerobot-train \
  --peft.method_type=LORA \
  --dataset.train_fraction=0.9 \
  --eval_freq=5000
4. Start with a smaller rank. Begin with r=32 and increase if needed:
12
# Try r=32 first
--peft.r=32

# If underfitting, increase
--peft.r=64
5. Match training data scale to rank. Smaller datasets need smaller ranks:
  • < 100 episodes: r=8-16
  • 100-500 episodes: r=32
  • 500+ episodes: r=64
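These heuristics can be encoded as a starting-point helper (thresholds taken from the bullets above; treat them as rules of thumb, not hard limits):

```python
def suggest_lora_rank(num_episodes: int) -> int:
    # Rule-of-thumb mapping from dataset size to LoRA rank
    if num_episodes < 100:
        return 16
    if num_episodes < 500:
        return 32
    return 64

print(suggest_lora_rank(50), suggest_lora_rank(250), suggest_lora_rank(1000))
# → 16 32 64
```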

Troubleshooting

Model not learning

Increase the learning rate or rank:
--policy.optimizer_lr=5e-3 --peft.r=128

Out of memory

Reduce the rank or batch size:
--peft.r=16 --batch_size=16

Overfitting

Reduce the rank or add dropout:
--peft.r=32 --peft.lora_dropout=0.2
