PufferEnv
Base class for native PufferLib environments that handle multiple agents with vectorized operations.

Constructor
Optional buffer dictionary containing pre-allocated arrays. If None, buffers are created automatically.
Subclasses must define single_observation_space, single_action_space, and num_agents before calling super().__init__().

Required attributes
Observation space for a single agent (must be Box).
Action space for a single agent (must be Discrete, MultiDiscrete, or Box).
Number of agents (must be >= 1).
Properties
Joint observation space for all agents.
Joint action space for all agents.
Buffer for observations, shape (num_agents, *obs_shape).
Buffer for rewards, shape (num_agents,), dtype float32.
Buffer for terminal flags, shape (num_agents,), dtype bool.
Buffer for truncation flags, shape (num_agents,), dtype bool.
Buffer for agent masks, shape (num_agents,), dtype bool.
Buffer for actions.
Array of agent IDs (0 to num_agents-1).
Always False for native environments.
Always False (native envs handle resets internally).
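The buffer layout documented above can be sketched with plain NumPy. This is an illustration of the shapes and dtypes only, not PufferLib's actual allocation code; the example values for num_agents and the observation shape are hypothetical:

```python
import numpy as np

num_agents = 4
obs_shape = (11,)  # hypothetical per-agent observation shape

# Pre-allocated buffers matching the documented shapes and dtypes
buffers = {
    "observations": np.zeros((num_agents, *obs_shape), dtype=np.float32),
    "rewards": np.zeros(num_agents, dtype=np.float32),
    "terminals": np.zeros(num_agents, dtype=bool),
    "truncations": np.zeros(num_agents, dtype=bool),
    "masks": np.ones(num_agents, dtype=bool),
}
agent_ids = np.arange(num_agents)  # agent IDs 0 to num_agents-1

print(buffers["observations"].shape)  # (4, 11)
```

Passing a dictionary like this to the constructor lets a vectorizer share memory with the environment instead of copying observations each step.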
Methods
reset
Optional random seed.
step
Actions for all agents.
close
async_reset
send
recv
ResizeObservation
Downscales image observations using fast strided indexing.

Constructor
The environment to wrap.
Downscale factor. Observation dimensions must be divisible by this value.
Example
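A sketch of the strided-indexing downscale the wrapper performs. This is illustrative, not the wrapper's source; it only shows why the observation dimensions must be divisible by the factor:

```python
import numpy as np

factor = 2
obs = np.arange(64, dtype=np.uint8).reshape(8, 8)  # toy 8x8 image observation

# Fast strided downscale: keep every `factor`-th pixel along each axis.
# Dimensions must be divisible by `factor` (8 % 2 == 0 here).
assert obs.shape[0] % factor == 0 and obs.shape[1] % factor == 0
small = obs[::factor, ::factor]

print(small.shape)  # (4, 4)
```

Strided indexing returns a view, so the downscale is essentially free compared to interpolation-based resizing.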
Methods
reset
step
ClipAction
Clips continuous actions to valid bounds for Box action spaces.

Constructor
Environment with Box action space.
This wrapper expands the action space bounds to dtype limits while clipping actual actions to the original bounds. Useful when your policy might output out-of-bounds values.
Example
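The clipping step itself can be sketched with NumPy. The bounds and the out-of-range action below are hypothetical; the wrapper applies this to the wrapped environment's original Box bounds:

```python
import numpy as np

# Original Box bounds of the wrapped env (hypothetical values)
low = np.array([-1.0, -1.0])
high = np.array([1.0, 1.0])

# A policy may emit values outside the valid range...
raw_action = np.array([1.7, -3.2])

# ...so the wrapper clips back to the original bounds before stepping
clipped = np.clip(raw_action, low, high)

print(clipped)  # [ 1. -1.]
```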
Methods
step
EpisodeStats
Tracks episodic returns and lengths for single-agent environments.

Constructor
The environment to wrap.
Behavior
- Accumulates rewards and step counts during episodes
- Adds episode_return and episode_length to info on episode end
- Aggregates nested info values (sums numeric values)
Example
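A minimal sketch of the documented behavior: accumulate reward and step count, then attach episode_return and episode_length to info when the episode ends. The helper function and variable names here are illustrative, not PufferLib internals:

```python
# Running episode statistics (reset after each completed episode)
episode_return, episode_length = 0.0, 0

def track(reward, done, info):
    """Update running stats; attach them to info at episode end."""
    global episode_return, episode_length
    episode_return += reward
    episode_length += 1
    if done:
        info["episode_return"] = episode_return
        info["episode_length"] = episode_length
        episode_return, episode_length = 0.0, 0
    return info

# Simulated three-step episode
for reward, done in [(1.0, False), (0.5, False), (2.0, True)]:
    info = track(reward, done, {})

print(info)  # {'episode_return': 3.5, 'episode_length': 3}
```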
Methods
reset
step
PettingZooWrapper
Base wrapper for PettingZoo parallel environments with proper attribute access.

Constructor
The PettingZoo parallel environment to wrap.
Methods
This wrapper forwards all methods to the wrapped environment:
- reset(seed=None, options=None)
- step(action)
- observation_space(agent)
- action_space(agent)
- observe(agent)
- state()
- render()
- close()
Properties
Currently active agents.
All possible agents.
The base unwrapped environment.
MeanOverAgents
Averages info values across agents in PettingZoo environments.

Constructor
The PettingZoo environment to wrap.
Behavior
Converts per-agent info dicts to a single dict with mean values across agents. Numeric values are averaged; non-numeric values are skipped.

Example
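The aggregation described above can be sketched in plain Python. The agent names and info keys are hypothetical; this illustrates the documented behavior, not the wrapper's source:

```python
# Per-agent infos as returned by a PettingZoo parallel env (toy values)
infos = {
    "agent_0": {"score": 2.0, "tag": "a"},
    "agent_1": {"score": 4.0, "tag": "b"},
}

# Collect numeric values per key; non-numeric values are skipped
collected = {}
for info in infos.values():
    for key, value in info.items():
        if isinstance(value, (int, float)):
            collected.setdefault(key, []).append(value)

# One dict of means across agents
means = {key: sum(vals) / len(vals) for key, vals in collected.items()}
print(means)  # {'score': 3.0}
```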
Methods
reset
step
MultiagentEpisodeStats
Tracks episodic statistics for each agent in PettingZoo environments.

Constructor
The PettingZoo environment to wrap.
Behavior
- Tracks episode_return and episode_length per agent
- Adds statistics to info when each agent terminates or truncates
- Aggregates nested info values for each agent
Example
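A sketch of per-agent tracking as documented: each agent accumulates its own return and length, flushed into its info dict when that agent finishes. Agent names and the helper function are illustrative, not library code:

```python
# Per-agent running statistics
returns = {"a0": 0.0, "a1": 0.0}
lengths = {"a0": 0, "a1": 0}

def step_stats(rewards, dones, infos):
    """Accumulate per-agent stats; attach them when an agent finishes."""
    for agent, reward in rewards.items():
        returns[agent] += reward
        lengths[agent] += 1
        if dones[agent]:
            infos[agent]["episode_return"] = returns[agent]
            infos[agent]["episode_length"] = lengths[agent]
            returns[agent], lengths[agent] = 0.0, 0
    return infos

# a0 finishes this step; a1 is still running, so its info stays empty
infos = step_stats(
    {"a0": 1.0, "a1": 2.0},
    {"a0": True, "a1": False},
    {"a0": {}, "a1": {}},
)
print(infos["a0"])  # {'episode_return': 1.0, 'episode_length': 1}
```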
Methods
reset
step
GymToGymnasium
Converts old Gym API environments to Gymnasium API.

Constructor
Old Gym environment (returns 4-tuple from step).
Behavior
- Converts the 4-tuple (obs, reward, done, info) to the 5-tuple (obs, reward, terminated, truncated, info)
- Sets truncated to always False
- Wraps reset() to return (obs, {})
Example
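The conversion above can be sketched as a small adapter function (illustrative only; the function name is hypothetical, not the wrapper's source):

```python
def convert_step(old_step_result):
    """Map an old Gym 4-tuple to the Gymnasium 5-tuple."""
    obs, reward, done, info = old_step_result  # old Gym API
    terminated = done
    truncated = False  # truncated is always False, as documented
    return obs, reward, terminated, truncated, info

result = convert_step(([0.0], 1.0, True, {}))
print(result)  # ([0.0], 1.0, True, False, {})
```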
PettingZooTruncatedWrapper
Adds proper truncation support to PettingZoo environments.

Constructor
The PettingZoo environment to wrap.
Behavior
- Ensures reset returns empty info dicts for all agents
- Properly forwards truncation flags from step
Methods
reset
step
Utility functions
set_buffers
Environment to set buffers on.
Optional dictionary with pre-allocated arrays. If None, creates new buffers.
unroll_nested_dict
Dictionary to flatten.
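A hedged sketch of flattening a nested dictionary. The real unroll_nested_dict may use a different key-joining convention and return type; the '/' separator and generator form below are assumptions for illustration:

```python
def flatten(d, prefix=""):
    """Yield (key, value) pairs from a nested dict.

    NOTE: the '/' key separator is an assumption, not confirmed
    to match PufferLib's unroll_nested_dict.
    """
    for key, value in d.items():
        full_key = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, full_key)  # recurse into sub-dicts
        else:
            yield full_key, value

flat = dict(flatten({"a": 1, "b": {"c": 2, "d": {"e": 3}}}))
print(flat)  # {'a': 1, 'b/c': 2, 'b/d/e': 3}
```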