Basic usage

PufferLib works with both legacy Gym and modern Gymnasium environments. This guide covers basic environment usage, vectorization, structured observations, and rendering.

Working with Gymnasium environments

Gymnasium is the modern successor to OpenAI Gym. PufferLib makes it easy to wrap Gymnasium environments into the PufferEnv format for high-performance training.

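import gymnasium
import pufferlib.emulation

class SampleGymnasiumEnv(gymnasium.Env):
    def __init__(self):
        self.observation_space = gymnasium.spaces.Box(low=-1, high=1, shape=(1,))
        self.action_space = gymnasium.spaces.Discrete(2)

    def reset(self, seed=None, options=None):
        # Gymnasium's reset returns (observation, info)
        return self.observation_space.sample(), {}

    def step(self, action):
        # Returns (observation, reward, terminated, truncated, info)
        return self.observation_space.sample(), 0.0, False, False, {}

# Wrap the environment instance so it exposes the PufferEnv interface
gymnasium_env = SampleGymnasiumEnv()
puffer_env = pufferlib.emulation.GymnasiumPufferEnv(gymnasium_env)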

The GymnasiumPufferEnv wrapper converts your Gymnasium environment into a PufferEnv that supports vectorized operations and efficient batch processing.
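
As a rough illustration of the five-value step convention that the wrapped environment keeps, the sketch below runs one episode against any object with Gymnasium-style reset and step methods. `CountdownEnv` and `run_episode` are hypothetical names introduced here, not part of PufferLib or Gymnasium; the stand-in environment exists only so the loop runs without extra dependencies.

```python
import random


class CountdownEnv:
    """Hypothetical stand-in that follows the Gymnasium reset/step convention."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.remaining = horizon

    def reset(self, seed=None, options=None):
        self.remaining = self.horizon
        return [0.0], {}

    def step(self, action):
        self.remaining -= 1
        terminated = self.remaining <= 0  # episode ended naturally
        truncated = False                 # no time limit in this sketch
        return [float(self.remaining)], 1.0, terminated, truncated, {}


def run_episode(env, max_steps=100):
    """Step until the episode terminates or is truncated; return total reward."""
    observation, info = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = random.choice([0, 1])  # random policy over Discrete(2)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        if terminated or truncated:
            break
    return total_reward


print(run_episode(CountdownEnv(horizon=5)))
```

The same loop shape should apply to a wrapped environment, since the reset and step calls return the same tuple layout.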

Working with legacy Gym environments

If you’re working with older codebases that use the legacy Gym API, PufferLib provides a compatibility layer.

Create your Gym environment

import gym
import pufferlib.emulation

class SampleGymEnv(gym.Env):
    def __init__(self):
        self.observation_space = gym.spaces.Box(low=-1, high=1, shape=(1,))
        self.action_space = gym.spaces.Discrete(2)

    def reset(self):
        return self.observation_space.sample()

    def step(self, action):
        return self.observation_space.sample(), 0.0, False, {}

Convert Gym to Gymnasium

gym_env = SampleGymEnv()
gymnasium_env = pufferlib.GymToGymnasium(gym_env)

Wrap in PufferEnv

puffer_env = pufferlib.emulation.GymnasiumPufferEnv(gymnasium_env)
observations, info = puffer_env.reset()
action = puffer_env.action_space.sample()
observation, reward, terminal, truncation, info = puffer_env.step(action)
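
To make the conversion step less opaque, here is a minimal sketch of what a Gym-to-Gymnasium adapter has to do: add an info dict to reset and split the single done flag into terminated and truncated. This is not PufferLib's implementation. `LegacyToModernAdapter` and `LegacyEnv` are hypothetical names, and reading `TimeLimit.truncated` from info is an assumption based on the convention used by legacy Gym's time-limit wrapper.

```python
class LegacyToModernAdapter:
    """Hypothetical adapter: legacy Gym API in, Gymnasium-style API out."""

    def __init__(self, legacy_env):
        self.env = legacy_env

    def reset(self, seed=None, options=None):
        # Legacy reset returns only the observation; add an empty info dict.
        return self.env.reset(), {}

    def step(self, action):
        # Legacy step returns (observation, reward, done, info).
        observation, reward, done, info = self.env.step(action)
        # Assumption: a "TimeLimit.truncated" flag in info marks a time-limit cutoff.
        truncated = bool(info.get("TimeLimit.truncated", False))
        terminated = done and not truncated
        return observation, reward, terminated, truncated, info


class LegacyEnv:
    """Minimal legacy-style environment used only to exercise the adapter."""

    def reset(self):
        return [0.0]

    def step(self, action):
        return [1.0], 0.5, True, {}


env = LegacyToModernAdapter(LegacyEnv())
print(env.reset())
print(env.step(0))
```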

Environment vectorization

PufferLib’s vectorization lets you step multiple environments through a single interface for faster data collection. This guide covers two backends: Serial, which is suited to debugging, and Multiprocessing, which is suited to training at scale.

Serial vectorization

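The serial backend runs environments sequentially in a single process. This is useful for debugging and small-scale experiments. The examples in this section assume that SamplePufferEnv is a PufferEnv-compatible environment class you have already defined.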

import pufferlib.vector

serial_vecenv = pufferlib.vector.make(
    SamplePufferEnv, 
    num_envs=2, 
    backend=pufferlib.vector.Serial
)

observations, infos = serial_vecenv.reset()
actions = serial_vecenv.action_space.sample()
o, r, d, t, i = serial_vecenv.step(actions)

print('Serial VecEnv:')
print('Observations:', o)
print('Rewards:', r)
print('Terminals:', d)
print('Truncations:', t)
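
Conceptually, a serial backend is a loop over environments that regroups the per-environment results by field. The sketch below shows that idea in plain Python. It is not PufferLib's Serial backend; `ConstantEnv` and `TinySerialVec` are hypothetical names used only for illustration.

```python
class ConstantEnv:
    """Hypothetical single environment with a Gymnasium-style API."""

    def __init__(self, value):
        self.value = value

    def reset(self):
        return [self.value], {}

    def step(self, action):
        return [self.value + action], float(action), False, False, {}


class TinySerialVec:
    """Hypothetical serial vectorizer: steps each env in turn, in one process."""

    def __init__(self, env_creators):
        self.envs = [create() for create in env_creators]

    def reset(self):
        results = [env.reset() for env in self.envs]
        observations = [obs for obs, _ in results]
        infos = [info for _, info in results]
        return observations, infos

    def step(self, actions):
        # One action per environment, applied sequentially.
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        # Transpose the per-env tuples into per-field lists.
        obs, rewards, terminals, truncations, infos = map(list, zip(*results))
        return obs, rewards, terminals, truncations, infos


vec = TinySerialVec([lambda: ConstantEnv(0), lambda: ConstantEnv(10)])
print(vec.reset())
print(vec.step([1, 0]))
```

Because everything runs in one process, exceptions and breakpoints inside an environment surface directly, which is why this style suits debugging.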

Multiprocessing vectorization

For production training, use the multiprocessing backend to run environments in parallel across multiple CPU cores.

vecenv = pufferlib.vector.make(
    SamplePufferEnv,
    num_envs=2, 
    num_workers=2, 
    batch_size=1, 
    backend=pufferlib.vector.Multiprocessing
)

# Asynchronous API for maximum throughput
vecenv.async_reset()
o, r, d, t, i, env_ids, masks = vecenv.recv()

actions = vecenv.action_space.sample()
print('Actions:', actions)
vecenv.send(actions)

# New observations are ready while other envs run in the background
o, r, d, t, i, env_ids, masks = vecenv.recv()
print('Observations:', o)

vecenv.close()

Make sure num_workers divides num_envs evenly, and that batch_size is consistent with the resulting number of environments per worker. PufferLib will raise an APIUsageError if these constraints are violated.
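
The single recv/send exchange shown earlier extends to a collection loop. The sketch below alternates the two calls for a fixed number of steps. `FakeAsyncVecEnv` and `collect` are hypothetical names; the stand-in only mimics the method names and the seven-value recv tuple from the example so that the loop runs without PufferLib installed.

```python
import random


class FakeAsyncVecEnv:
    """Hypothetical stand-in exposing async_reset/recv/send like the example."""

    def __init__(self, batch_size=1):
        self.batch_size = batch_size
        self.sent = 0

    def async_reset(self):
        self.sent = 0

    def recv(self):
        n = self.batch_size
        obs = [[0.0] for _ in range(n)]
        return obs, [0.0] * n, [False] * n, [False] * n, [], list(range(n)), [True] * n

    def send(self, actions):
        self.sent += 1


def collect(vecenv, num_steps):
    """Alternate recv and send so each batch gets actions as soon as it is ready."""
    vecenv.async_reset()
    batches = []
    for _ in range(num_steps):
        o, r, d, t, i, env_ids, masks = vecenv.recv()
        batches.append((env_ids, o, r))
        # Random Discrete(2) actions, one per environment in the batch.
        # With a real vecenv, the example samples from vecenv.action_space instead.
        actions = [random.randrange(2) for _ in env_ids]
        vecenv.send(actions)
    return batches


vecenv = FakeAsyncVecEnv(batch_size=1)
batches = collect(vecenv, num_steps=4)
print(len(batches), vecenv.sent)
```

With a real multiprocessing vecenv, remember to call close() once the loop finishes.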

Passing environment arguments

You can customize environment initialization by passing arguments and keyword arguments:

Same args for all environments

serial_vecenv = pufferlib.vector.make(
    SamplePufferEnv,
    num_envs=2,
    backend=pufferlib.vector.Serial,
    env_args=[3],
    env_kwargs={'bar': 4}
)
print('Foo:', [env.foo for env in serial_vecenv.envs])  # [3, 3]
print('Bar:', [env.bar for env in serial_vecenv.envs])  # [4, 4]

Different args per environment

serial_vecenv = pufferlib.vector.make(
    [SamplePufferEnv, SamplePufferEnv], 
    num_envs=2, 
    backend=pufferlib.vector.Serial,
    env_args=[[3], [4]], 
    env_kwargs=[{'bar': 4}, {'bar': 5}]
)
print('Foo:', [env.foo for env in serial_vecenv.envs])  # [3, 4]
print('Bar:', [env.bar for env in serial_vecenv.envs])  # [4, 5]
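
The snippets above assume an environment whose constructor accepts foo and bar. The sketch below shows, in plain Python, how positional and keyword arguments could map onto constructor calls in the shared and per-environment cases. `ConfigurableEnv` and `build_envs` are hypothetical names, and this is not how pufferlib.vector.make is implemented.

```python
class ConfigurableEnv:
    """Hypothetical environment whose constructor takes foo and bar."""

    def __init__(self, foo=0, bar=1):
        self.foo = foo
        self.bar = bar


def build_envs(creators, env_args, env_kwargs):
    """Call each creator with its own positional and keyword arguments."""
    return [
        create(*args, **kwargs)
        for create, args, kwargs in zip(creators, env_args, env_kwargs)
    ]


# Shared arguments: repeat the same args/kwargs for every environment.
shared = build_envs([ConfigurableEnv] * 2, [[3]] * 2, [{"bar": 4}] * 2)
print([env.foo for env in shared], [env.bar for env in shared])

# Per-environment arguments: one entry per environment.
varied = build_envs([ConfigurableEnv] * 2, [[3], [4]], [{"bar": 4}, {"bar": 5}])
print([env.foo for env in varied], [env.bar for env in varied])
```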

Handling structured observations

PufferLib automatically flattens structured observation spaces (Dict, Tuple, MultiDiscrete) for efficient neural network processing.

import gymnasium
import pufferlib.emulation

class SampleGymnasiumEnv(gymnasium.Env):
    def __init__(self):
        self.observation_space = gymnasium.spaces.Dict({
            'foo': gymnasium.spaces.Box(low=-1, high=1, shape=(2,)),
            'bar': gymnasium.spaces.Box(low=2, high=3, shape=(3,)),
        })
        self.action_space = gymnasium.spaces.MultiDiscrete([2, 5])

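    def reset(self, seed=None, options=None):
        # Gymnasium's reset returns (observation, info)
        return self.observation_space.sample(), {}

    def step(self, action):
        # Returns (observation, reward, terminated, truncated, info)
        return self.observation_space.sample(), 0.0, False, False, {}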

gymnasium_env = SampleGymnasiumEnv()
puffer_env = pufferlib.emulation.GymnasiumPufferEnv(gymnasium_env)
flat_observation, info = puffer_env.reset()
flat_action = puffer_env.action_space.sample()

print(f'PufferLib flattens observations and actions:\n{flat_observation}\n{flat_action}')

Unflattening observations

You can unflatten observations with NumPy or PyTorch. With NumPy, reinterpret the flat buffer using the environment’s structured dtype:

observation = flat_observation.view(puffer_env.obs_dtype)
print(f'Unflattened with numpy:\n{observation}')

We recommend unflattening observations with PyTorch in your model’s forward pass for better performance and easier integration with neural networks.
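
To see why a dtype view is enough to recover the structure, the standalone NumPy sketch below builds a structured dtype by hand, flattens one observation to raw bytes, and views it back. The field order, the float32 element type, and the uint8 byte buffer are assumptions made for this illustration; in practice the dtype comes from puffer_env.obs_dtype.

```python
import numpy as np

# Structured dtype mirroring the Dict space from the example above
# (field order and float32 are assumptions for this sketch).
obs_dtype = np.dtype([("bar", np.float32, (3,)), ("foo", np.float32, (2,))])

# One structured observation.
structured = np.zeros(1, dtype=obs_dtype)
structured["bar"] = [2.0, 2.5, 3.0]
structured["foo"] = [-1.0, 1.0]

# Flatten to raw bytes, as a flat buffer might store it.
flat = structured.view(np.uint8)

# Reinterpreting the same bytes with the structured dtype recovers the fields.
restored = flat.view(obs_dtype)

print(flat.shape)
print(restored["foo"], restored["bar"])
```

No data is copied in either direction: the view only changes how the same memory is interpreted.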

Rendering environments

PufferLib environments support rendering for visualization and debugging. Here’s an example using the built-in Breakout environment:

from pufferlib.ocean.breakout import breakout

env = breakout.Breakout()
env.reset()

while True:
    env.step(env.action_space.sample())
    frame = env.render()
    # Display or save the frame as needed

The render() method returns an RGB array that you can display using your preferred visualization library (matplotlib, OpenCV, etc.).
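
If you would rather save frames without adding a dependency, one option is a binary PPM file, which many image viewers can open. The sketch below assumes a frame is a height x width x 3 structure of integers in the range 0 to 255. `save_ppm` is a hypothetical helper, not part of PufferLib, and the output file name is arbitrary.

```python
import os
import tempfile


def save_ppm(frame, path):
    """Write an H x W x 3 frame of 0-255 integers as a binary PPM image."""
    height = len(frame)
    width = len(frame[0])
    with open(path, "wb") as f:
        # PPM header: magic number, dimensions, maximum channel value.
        f.write(f"P6\n{width} {height}\n255\n".encode("ascii"))
        for row in frame:
            for pixel in row:
                f.write(bytes(int(c) for c in pixel))


# A tiny 2 x 2 test frame: red, green / blue, white.
frame = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
path = os.path.join(tempfile.gettempdir(), "frame.ppm")
save_ppm(frame, path)
print(os.path.getsize(path))
```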
