Overview
PufferLib provides multiple vectorization backends for running environments in parallel. The two main classes are Serial (single-process) and Multiprocessing (multi-process).
make()
The recommended way to create vectorized environments.
Function that creates a single environment instance. Can also be a list of callables (one per env).
Positional arguments to pass to env_creator. Can be a single list or a list of lists (one per env).
Keyword arguments to pass to env_creator. Can be a single dict or a list of dicts (one per env).
Total number of environment instances to create.
Number of worker processes (Multiprocessing only). Default is num_envs. Set to 'auto' for automatic selection.
Number of agents per batch. Default is num_envs. Must be divisible by (num_envs / num_workers).
Vectorization backend class: Serial, Multiprocessing, or Ray. Default is PufferEnv (native single env).
Use zero-copy shared memory (Multiprocessing only). Requires batch_size to divide num_envs evenly.
Synchronize trajectory collection across workers (Multiprocessing only).
Allow num_workers > CPU cores (Multiprocessing only). Not recommended.
Base random seed. Each worker gets seed + worker_id.
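The per-env broadcasting and per-worker seeding described above can be illustrated with a small pure-Python sketch. It does not call PufferLib; `expand_per_env` and `worker_seeds` are hypothetical helpers written only for illustration, and the rule that a single callable is replicated together with its arguments is an assumption about how the parameters combine.

```python
def expand_per_env(env_creator, env_args=None, env_kwargs=None, num_envs=1):
    """Hypothetical helper: normalize inputs to three lists of length num_envs."""
    if callable(env_creator):
        # Single creator: replicate it, with its args and kwargs, once per env.
        creators = [env_creator] * num_envs
        args = [list(env_args or []) for _ in range(num_envs)]
        kwargs = [dict(env_kwargs or {}) for _ in range(num_envs)]
    else:
        # List of creators: args and kwargs are taken to be per-env already.
        creators = list(env_creator)
        args = list(env_args) if env_args is not None else [[] for _ in creators]
        kwargs = list(env_kwargs) if env_kwargs is not None else [{} for _ in creators]
    if not (len(creators) == len(args) == len(kwargs) == num_envs):
        raise ValueError("creators, args, and kwargs must each have num_envs entries")
    return creators, args, kwargs


def worker_seeds(seed, num_workers):
    """Each worker gets seed + worker_id, as stated in the seed description."""
    return [seed + worker_id for worker_id in range(num_workers)]
```

For example, `worker_seeds(42, 3)` gives one distinct seed per worker, so workers do not replay identical random streams.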
Class: Serial
Single-process vectorization. Runs all environments sequentially on one CPU core.
When to use
- Debugging and development
- Very fast environments where multiprocessing overhead dominates
- Platforms without multiprocessing support
Initialization
List of environment creator functions, one per environment.
List of argument lists, one per environment.
List of keyword argument dicts, one per environment.
Number of environments to create.
Pre-allocated buffer dictionary (advanced usage).
Random seed for environments.
Properties
Total number of agents across all environments (same as agents_per_batch).
Number of agents returned per batch.
Total number of agents (same as agents_per_batch).
Observation space for a single agent.
Action space for a single agent.
Joint observation space for all agents in batch.
Joint action space for all agents in batch.
The first environment instance (useful for inspecting properties).
Whether environments use Gymnasium/PettingZoo emulation.
Methods
reset()
Random seed.
Initial observations for all agents.
Aggregated info dictionary.
step()
Actions for all agents.
Next observations.
Rewards.
Terminal flags.
Truncation flags.
Aggregated info dictionary.
async_reset()
send()
recv()
close()
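To show how the synchronous and asynchronous methods relate, here is a toy single-process vectorizer in plain Python. It is a sketch, not PufferLib's implementation: `ToyEnv` and `ToySerial` are hypothetical names, the assumption is that step() behaves like send() followed by recv(), and infos are returned as a plain per-env list rather than an aggregated dictionary.

```python
class ToyEnv:
    """Hypothetical stand-in env: a counter that terminates after `horizon` steps."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None):
        self.t = 0
        return self.t  # observation

    def step(self, action):
        self.t += 1
        terminal = self.t >= self.horizon
        # (observation, reward, terminal, truncated, info)
        return self.t, float(action), terminal, False, {}


class ToySerial:
    """Steps every env in turn on the calling thread."""

    def __init__(self, env_creators):
        self.envs = [creator() for creator in env_creators]
        self._pending = None

    def async_reset(self, seed=0):
        n = len(self.envs)
        obs = [env.reset(seed=seed) for env in self.envs]
        self._pending = (obs, [0.0] * n, [False] * n, [False] * n, [{} for _ in range(n)])

    def send(self, actions):
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        # Transpose per-env tuples into per-field lists.
        self._pending = tuple(list(column) for column in zip(*results))

    def recv(self):
        pending, self._pending = self._pending, None
        return pending

    def reset(self, seed=0):
        self.async_reset(seed)
        obs, _, _, _, infos = self.recv()
        return obs, infos

    def step(self, actions):
        self.send(actions)
        return self.recv()

    def close(self):
        self.envs.clear()
```

In a single-process backend send() does all the work immediately, so the split only pays off with a multi-process backend, where the caller can run a model forward pass between send() and recv().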
Class: Multiprocessing
Multi-process vectorization with optimized shared memory for maximum performance.
When to use
- Production training with multiple CPU cores
- CPU-intensive environments
- Maximum throughput requirements
Initialization
List of environment creator functions.
List of argument lists.
List of keyword argument dicts.
Total number of environments.
Number of worker processes. Default is num_envs. Must divide num_envs evenly.
Agents per batch. Default is num_envs.
Use zero-copy shared memory. Requires batch_size to divide num_envs evenly.
Synchronize trajectory collection.
Allow more workers than CPU cores.
Base random seed.
Properties
Total number of agents (same as agents_per_batch).
Total number of environment instances.
Number of worker processes.
Total number of agents across all environments.
Number of agents per batch.
Number of environments per worker process.
Number of workers per batch.
Observation space for a single agent.
Action space for a single agent.
Joint observation space for batch.
Joint action space for batch.
Environment instance for property inspection.
Whether environments use emulation.
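The arithmetic behind the per-worker and per-batch counts, and the divisibility rules listed under Initialization, can be checked with a short sketch. `derive_layout` is a hypothetical helper, not part of PufferLib; it assumes one agent per environment and assumes that workers per batch equals batch_size divided by environments per worker.

```python
def derive_layout(num_envs, num_workers=None, batch_size=None, zero_copy=True):
    """Hypothetical helper: validate the settings and derive per-worker counts."""
    # Both defaults are num_envs, per the parameter descriptions.
    num_workers = num_envs if num_workers is None else num_workers
    batch_size = num_envs if batch_size is None else batch_size

    if num_envs % num_workers != 0:
        raise ValueError("num_workers must divide num_envs evenly")
    envs_per_worker = num_envs // num_workers

    if batch_size % envs_per_worker != 0:
        raise ValueError("batch_size must be divisible by num_envs / num_workers")
    if zero_copy and num_envs % batch_size != 0:
        raise ValueError("zero-copy requires batch_size to divide num_envs evenly")

    return {
        "envs_per_worker": envs_per_worker,
        # Assumption: a batch is assembled from whole workers.
        "workers_per_batch": batch_size // envs_per_worker,
    }
```

With 8 environments, 4 workers, and a batch size of 4, each worker holds 2 environments and each batch is filled by 2 workers; a batch size of 6 passes the per-worker check but fails the zero-copy check because 6 does not divide 8.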
Methods
Same as Serial: reset(), step(), async_reset(), send(), recv(), close().
Additionally:
notify()
Usage examples
Basic usage
Async interface
Different environments per worker
Performance tuning
autotune()
Automatically find optimal vectorization parameters.
Function to create test environments.
Desired batch size.
Maximum number of environments to test.
Simulated model forward pass time.
Maximum RAM for environments.
Maximum VRAM per batch.
Seconds to run each test.