Overview
PufferEnv is the base class for creating native vectorized environments in PufferLib. Unlike traditional Gym/Gymnasium environments that operate on single agents, PufferEnv handles multiple agents simultaneously for maximum performance.

Class: PufferEnv
Required attributes
Before calling super().__init__(), your environment must define:
single_observation_space
pufferlib.spaces.Box
required
The observation space for a single agent. Must be a Box space.
single_action_space
pufferlib.spaces.Discrete | pufferlib.spaces.MultiDiscrete | pufferlib.spaces.Box
required
The action space for a single agent. Must be Discrete, MultiDiscrete, or Box.
num_agents
int
required
Number of agents in the environment. Must be >= 1.
Initialization
Optional pre-allocated buffer dictionary containing numpy arrays for observations, rewards, terminals, truncations, masks, and actions. Used internally for zero-copy vectorization.
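As a sketch of how the required attributes and the buffer dictionary fit together, the skeleton below mimics the allocation behavior with plain NumPy so it runs without PufferLib installed. The class name `MyEnv`, the `buf` parameter name, the space stand-ins, and the dtypes are illustrative assumptions; a real environment would subclass `pufferlib.PufferEnv` and let `super().__init__()` do this work.

```python
import numpy as np

class MyEnv:  # illustrative; a real env would subclass pufferlib.PufferEnv
    def __init__(self, num_agents=4, buf=None):
        # Required attributes -- set before super().__init__() in real code.
        # Plain Python values stand in for pufferlib.spaces objects here.
        self.single_observation_space = (3, 3)  # stand-in for a Box space
        self.single_action_space = 4            # stand-in for Discrete(4)
        self.num_agents = num_agents

        # super().__init__() would adopt a pre-allocated buffer dict
        # (zero-copy) or allocate fresh arrays; this mimics that behavior.
        if buf is None:
            buf = {
                'observations': np.zeros((num_agents, 3, 3), dtype=np.uint8),
                'rewards': np.zeros(num_agents, dtype=np.float32),
                'terminals': np.zeros(num_agents, dtype=bool),
                'truncations': np.zeros(num_agents, dtype=bool),
                'masks': np.ones(num_agents, dtype=bool),
                'actions': np.zeros(num_agents, dtype=np.int32),
            }
        for name, array in buf.items():
            setattr(self, name, array)  # buffers become env attributes
```

Passing a pre-allocated buffer dict is what lets a vectorizer stack many environments into one batch without copying: the environment writes straight into memory the vectorizer owns.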
Properties
After initialization, PufferEnv provides:

observations
Array of shape (num_agents, *obs_shape) containing current observations

rewards
Array of shape (num_agents,) containing rewards

terminals
Boolean array of shape (num_agents,) indicating terminal states

truncations
Boolean array of shape (num_agents,) indicating truncated episodes

masks
Boolean array of shape (num_agents,) indicating active agents

actions
Array of shape (num_agents, *action_shape) for storing actions

observation_space
Joint observation space for all agents (automatically created from single_observation_space)

action_space
Joint action space for all agents (automatically created from single_action_space)

agent_ids
Array of agent IDs (0 to num_agents - 1)

emulated
Always False for native environments. Indicates whether the environment uses emulation.

done
Always False for native environments. Native environments handle resets internally.

driver_env
Returns self. Used for compatibility with Multiprocessing.
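The zero-copy design these array properties support can be illustrated with NumPy alone: a vectorizer can allocate one batched array and hand each environment a view into it, so writes through an environment's observation buffer land directly in the shared batch. The variable names and shapes below are illustrative.

```python
import numpy as np

# One batch for 2 envs x 4 agents; each env gets a view, not a copy.
batch_obs = np.zeros((8, 3, 3), dtype=np.uint8)
env0_observations = batch_obs[0:4]   # view for env 0's agents
env1_observations = batch_obs[4:8]   # view for env 1's agents

env0_observations[:] = 7             # env 0 writes its observations in place
print(batch_obs[0, 0, 0])            # the shared batch sees the write: 7
print(batch_obs[4, 0, 0])            # env 1's slice is untouched: 0
```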
Methods
reset()
Parameters:
seed: Random seed for reproducibility

Returns:
Initial observations (written to self.observations)
List of info dicts, one per agent
You must implement this method in your subclass.
step()
Parameters:
actions: Actions for all agents, shape (num_agents,) or (num_agents, *action_shape)

Returns:
Next observations (written to self.observations)
Rewards for each agent (written to self.rewards)
Terminal flags for each agent (written to self.terminals)
Truncation flags for each agent (written to self.truncations)
List of info dicts, one per agent
You must implement this method in your subclass.
close()
You must implement this method in your subclass.
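Putting reset(), step(), and close() together, here is a minimal self-contained sketch of the contract using plain NumPy. The class, its counting dynamics, and the buffer dtypes are invented for illustration; a real environment subclasses pufferlib.PufferEnv and writes into the buffers the base class allocated, following the same in-place convention.

```python
import numpy as np

class CountingEnv:
    """Sketch of the PufferEnv reset/step/close contract (NumPy only)."""

    def __init__(self, num_agents=2):
        self.num_agents = num_agents
        # In a real PufferEnv these buffers come from super().__init__().
        self.observations = np.zeros((num_agents, 1), dtype=np.float32)
        self.rewards = np.zeros(num_agents, dtype=np.float32)
        self.terminals = np.zeros(num_agents, dtype=bool)
        self.truncations = np.zeros(num_agents, dtype=bool)

    def reset(self, seed=None):
        self.observations[:] = 0  # write into the shared buffer in place
        return self.observations, [{} for _ in range(self.num_agents)]

    def step(self, actions):
        self.observations[:, 0] += actions          # in-place update
        self.rewards[:] = actions                   # in-place rewards
        self.terminals[:] = self.observations[:, 0] >= 10
        self.truncations[:] = False
        infos = [{} for _ in range(self.num_agents)]
        return (self.observations, self.rewards, self.terminals,
                self.truncations, infos)

    def close(self):
        pass  # release any native resources here
```

The important habit the sketch shows: reset() and step() mutate the pre-allocated arrays rather than allocating new ones, so the vectorizer's batch stays current without copies.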
Async interface
PufferEnv provides an async-style interface for advanced vectorization:

async_reset()
Calls reset() internally and stores infos.

Parameters:
seed: Random seed for reproducibility
send()
Parameters:
actions: Actions for all agents
recv()

Returns a tuple of:
Current observations
Current rewards
Terminal flags
Truncation flags
Info dictionaries
Agent IDs
Active agent masks
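To make the async_reset()/send()/recv() handshake concrete, the stand-in below implements the same surface on top of plain NumPy: send() stashes actions, and the next recv() applies them and returns the tuple in the order listed above. The class name and dynamics are invented for illustration and this is not PufferLib's implementation.

```python
import numpy as np

class AsyncShim:
    """Sketch of the async handshake a PufferEnv exposes to vectorizers."""

    def __init__(self, num_agents=2):
        self.num_agents = num_agents
        self.agent_ids = np.arange(num_agents)
        self.observations = np.zeros((num_agents, 1), dtype=np.float32)
        self.rewards = np.zeros(num_agents, dtype=np.float32)
        self.terminals = np.zeros(num_agents, dtype=bool)
        self.truncations = np.zeros(num_agents, dtype=bool)
        self.masks = np.ones(num_agents, dtype=bool)
        self._infos = []
        self._actions = None

    def async_reset(self, seed=None):
        # Resets state and stores infos for the next recv()
        self.observations[:] = 0
        self._infos = [{} for _ in range(self.num_agents)]

    def send(self, actions):
        self._actions = np.asarray(actions)  # stash until recv()

    def recv(self):
        if self._actions is not None:        # step with stashed actions
            self.observations[:, 0] += self._actions
            self.rewards[:] = self._actions
            self._actions = None
        return (self.observations, self.rewards, self.terminals,
                self.truncations, self._infos, self.agent_ids, self.masks)

# Driver loop: decompose step() into send()/recv() pairs
env = AsyncShim()
env.async_reset(seed=0)
obs, *_ = env.recv()
env.send(np.array([1, 2]))
obs, rewards, terminals, truncations, infos, ids, masks = env.recv()
```

Splitting step() in two this way lets a vectorizer send actions to many environments and then collect results as they finish, instead of blocking on each environment in turn.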