Overview

PufferEnv is the base class for creating native vectorized environments in PufferLib. Unlike traditional Gym/Gymnasium environments, which step a single agent per call, PufferEnv steps all agents at once through shared numpy buffers for maximum performance.

Class: PufferEnv

Required attributes

Before calling super().__init__(), your environment must define:
- single_observation_space (pufferlib.spaces.Box, required): The observation space for a single agent. Must be a Box space.
- single_action_space (pufferlib.spaces.Discrete | pufferlib.spaces.MultiDiscrete | pufferlib.spaces.Box, required): The action space for a single agent. Must be Discrete, MultiDiscrete, or Box.
- num_agents (int, required): Number of agents in the environment. Must be >= 1.

Initialization

class PufferEnv:
    def __init__(self, buf=None):
- buf (dict | None, optional): Pre-allocated buffer dictionary containing numpy arrays for observations, rewards, terminals, truncations, masks, and actions. Used internally for zero-copy vectorization.
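The exact buffer layout is internal to PufferLib, but conceptually buf is a set of pre-allocated numpy arrays shared between the environment and the vectorizer. A numpy-only sketch of what such a dictionary might hold (the key names follow the description above; the dtypes are illustrative assumptions):

```python
import numpy as np

# Hypothetical pre-allocated buffer for 8 agents with (84, 84, 3) uint8
# observations and a single discrete action per agent
num_agents = 8
obs_shape = (84, 84, 3)

buf = {
    "observations": np.zeros((num_agents, *obs_shape), dtype=np.uint8),
    "rewards": np.zeros(num_agents, dtype=np.float32),
    "terminals": np.zeros(num_agents, dtype=bool),
    "truncations": np.zeros(num_agents, dtype=bool),
    "masks": np.ones(num_agents, dtype=bool),
    "actions": np.zeros(num_agents, dtype=np.int32),
}

# Zero-copy vectorization: the environment writes into these arrays in
# place, so the vectorizer observes updates without any copying
buf["rewards"][:] = 1.0
```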

Properties

After initialization, PufferEnv provides:
- observations (numpy.ndarray): Array of shape (num_agents, *obs_shape) containing the current observations.
- rewards (numpy.ndarray): Array of shape (num_agents,) containing rewards.
- terminals (numpy.ndarray): Boolean array of shape (num_agents,) indicating terminal states.
- truncations (numpy.ndarray): Boolean array of shape (num_agents,) indicating truncated episodes.
- masks (numpy.ndarray): Boolean array of shape (num_agents,) indicating active agents.
- actions (numpy.ndarray): Array of shape (num_agents, *action_shape) for storing actions.
- observation_space (gymnasium.spaces.Space): Joint observation space for all agents, created automatically from single_observation_space.
- action_space (gymnasium.spaces.Space): Joint action space for all agents, created automatically from single_action_space.
- agent_ids (numpy.ndarray): Array of agent IDs (0 to num_agents - 1).
- emulated (bool): Always False for native environments. Indicates whether the environment uses emulation.
- done (bool): Always False for native environments. Native envs handle resets internally.
- driver_env (PufferEnv): Returns self. Used for compatibility with the Multiprocessing vectorization backend.
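Because these arrays back the zero-copy vectorization, your reset() and step() should write into them in place (e.g. self.rewards[:] = ...) rather than rebinding the attribute to a new array. A numpy-only sketch of the difference:

```python
import numpy as np

num_agents = 4
rewards = np.zeros(num_agents, dtype=np.float32)

# `view` stands in for self.rewards, a view into a shared buffer
view = rewards

# Correct: write in place, so the shared buffer sees the update
view[:] = [1.0, 2.0, 3.0, 4.0]
assert rewards[1] == 2.0

# Incorrect: rebinding the name allocates a fresh array and silently
# disconnects it from the shared buffer
view = np.zeros(num_agents, dtype=np.float32)
view[:] = 5.0
assert rewards[0] == 1.0  # the shared buffer never saw the 5.0 writes
```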

Methods

reset()

def reset(self, seed=None) -> tuple[numpy.ndarray, list[dict]]
Reset the environment and return initial observations.
Parameters:
- seed (int | None, optional): Random seed for reproducibility.

Returns:
- observations (numpy.ndarray): Initial observations (written to self.observations).
- infos (list[dict]): List of info dicts, one per agent.

You must implement this method in your subclass.

step()

def step(self, actions) -> tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray, list[dict]]
Execute one environment step.
Parameters:
- actions (numpy.ndarray, required): Actions for all agents, with shape (num_agents,) or (num_agents, *action_shape).

Returns:
- observations (numpy.ndarray): Next observations (written to self.observations).
- rewards (numpy.ndarray): Rewards for each agent (written to self.rewards).
- terminals (numpy.ndarray): Terminal flags for each agent (written to self.terminals).
- truncations (numpy.ndarray): Truncation flags for each agent (written to self.truncations).
- infos (list[dict]): List of info dicts, one per agent.

You must implement this method in your subclass.
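Because done is always False for native environments, episode boundaries are handled per agent inside step(): finished agents are reset in place while the rest continue. A numpy-only sketch of that pattern (the environment internals here are invented for illustration):

```python
import numpy as np

num_agents = 4
positions = np.zeros(num_agents, dtype=np.int32)   # toy per-agent state
terminals = np.zeros(num_agents, dtype=bool)
rewards = np.zeros(num_agents, dtype=np.float32)

def step(actions):
    positions[:] += actions                  # advance every agent at once
    terminals[:] = positions >= 3            # toy termination condition
    rewards[:] = np.where(terminals, 1.0, 0.0)
    positions[terminals] = 0                 # reset finished agents in place
    return rewards.copy(), terminals.copy()

r, t = step(np.ones(num_agents, dtype=np.int32))  # all agents advance to 1
r, t = step(np.array([2, 0, 0, 0]))               # only agent 0 terminates
```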

close()

def close(self)
Clean up environment resources.
You must implement this method in your subclass.

Async interface

PufferEnv provides an async-style interface for advanced vectorization:

async_reset()

def async_reset(self, seed=None)
Asynchronously reset the environment. Calls reset() internally and stores infos.
Parameters:
- seed (int | None, optional): Random seed for reproducibility.

send()

def send(self, actions)
Send actions to the environment without waiting for the result.
Parameters:
- actions (numpy.ndarray, required): Actions for all agents.

recv()

def recv(self) -> tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray, list[dict], numpy.ndarray, numpy.ndarray]
Retrieve the results of the last step.

Returns:
- observations (numpy.ndarray): Current observations.
- rewards (numpy.ndarray): Current rewards.
- terminals (numpy.ndarray): Terminal flags.
- truncations (numpy.ndarray): Truncation flags.
- infos (list[dict]): Info dictionaries.
- agent_ids (numpy.ndarray): Agent IDs.
- masks (numpy.ndarray): Active agent masks.
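send() and recv() split a step into two halves so a vectorizer can overlap work across environments. A toy, self-contained sketch of how the async calls can relate to reset()/step() (the wiring shown is an assumption for illustration, not PufferLib's actual implementation):

```python
import numpy as np

class ToyAsyncEnv:
    """Toy illustration of the async interface layered over reset/step."""

    def __init__(self, num_agents=2):
        self.num_agents = num_agents
        self.agent_ids = np.arange(num_agents)
        self.masks = np.ones(num_agents, dtype=bool)
        self._result = None  # holds (obs, rewards, terminals, truncations, infos)

    def reset(self, seed=None):
        obs = np.zeros(self.num_agents, dtype=np.float32)
        return obs, [{} for _ in range(self.num_agents)]

    def step(self, actions):
        obs = np.asarray(actions, dtype=np.float32)  # toy dynamics: echo actions
        rew = np.ones(self.num_agents, dtype=np.float32)
        term = np.zeros(self.num_agents, dtype=bool)
        trunc = np.zeros(self.num_agents, dtype=bool)
        return obs, rew, term, trunc, [{} for _ in range(self.num_agents)]

    def async_reset(self, seed=None):
        # Calls reset() internally and stores the result for the next recv()
        obs, infos = self.reset(seed)
        zeros = np.zeros(self.num_agents, dtype=np.float32)
        falses = np.zeros(self.num_agents, dtype=bool)
        self._result = (obs, zeros, falses, falses, infos)

    def send(self, actions):
        # In this toy version the step runs eagerly; a real vectorizer
        # could defer it and do other work before recv()
        self._result = self.step(actions)

    def recv(self):
        obs, rew, term, trunc, infos = self._result
        return obs, rew, term, trunc, infos, self.agent_ids, self.masks

env = ToyAsyncEnv()
env.async_reset()
obs, *_ = env.recv()
env.send(np.array([1.0, 2.0]))
obs, rew, term, trunc, infos, ids, masks = env.recv()
```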

Example usage

import numpy as np
import pufferlib
import pufferlib.spaces

class MyEnv(pufferlib.PufferEnv):
    def __init__(self, buf=None):
        # Define required attributes BEFORE calling super()
        self.single_observation_space = pufferlib.spaces.Box(
            low=0, high=255, shape=(84, 84, 3), dtype=np.uint8
        )
        self.single_action_space = pufferlib.spaces.Discrete(4)
        self.num_agents = 8
        
        # Now call super() to initialize buffers
        super().__init__(buf)
    
    def reset(self, seed=None):
        # Reset environment state
        self.observations[:] = self.single_observation_space.sample()
        infos = [{} for _ in range(self.num_agents)]
        return self.observations, infos
    
    def step(self, actions):
        # Execute environment step
        self.observations[:] = self.single_observation_space.sample()
        self.rewards[:] = np.random.randn(self.num_agents)
        self.terminals[:] = False
        self.truncations[:] = False
        infos = [{} for _ in range(self.num_agents)]
        return self.observations, self.rewards, self.terminals, self.truncations, infos
    
    def close(self):
        pass

# Usage
env = MyEnv()
observations, infos = env.reset()

for _ in range(100):
    actions = env.action_space.sample()
    obs, rewards, terminals, truncations, infos = env.step(actions)

env.close()

Common errors

APIUsageError: Environment missing required attribute
This error occurs when you call super().__init__() before defining single_observation_space, single_action_space, or num_agents. Always define these attributes first.

APIUsageError: PufferEnvs must define single_observation_space, not observation_space
Do not define observation_space or action_space directly. PufferLib creates these automatically from your single-agent spaces.

APIUsageError: Native observation_space must be a Box
PufferEnv only supports Box observation spaces. If you need discrete observations, convert them to a Box representation.
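The first error comes from the base class validating the required attributes at construction time. A simplified, hypothetical sketch of that check (the real validation lives inside pufferlib and may differ in detail):

```python
class BasePufferEnv:
    # Hypothetical stand-in for PufferEnv's attribute validation
    REQUIRED = ("single_observation_space", "single_action_space", "num_agents")

    def __init__(self, buf=None):
        for attr in self.REQUIRED:
            if not hasattr(self, attr):
                raise RuntimeError(
                    f"Environment missing required attribute: {attr}"
                )

class BadEnv(BasePufferEnv):
    def __init__(self):
        super().__init__()   # attributes not defined yet -> raises
        self.num_agents = 1  # too late: validation already ran

try:
    BadEnv()
except RuntimeError as e:
    print(e)  # Environment missing required attribute: single_observation_space
```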
