PufferLib provides seamless integration with both legacy Gym and modern Gymnasium environments. This guide shows you how to get started with basic environment usage, vectorization, and rendering.

Working with Gymnasium environments

Gymnasium is the modern successor to OpenAI Gym. PufferLib makes it easy to wrap Gymnasium environments into the PufferEnv format for high-performance training.
import gymnasium
import pufferlib.emulation

class SampleGymnasiumEnv(gymnasium.Env):
    def __init__(self):
        self.observation_space = gymnasium.spaces.Box(low=-1, high=1, shape=(1,))
        self.action_space = gymnasium.spaces.Discrete(2)

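    def reset(self, seed=None, options=None):
        # Gymnasium's reset() accepts optional seed/options and returns (observation, info)
        return self.observation_space.sample(), {}

    def step(self, action):
        # Returns (observation, reward, terminated, truncated, info)
        return self.observation_space.sample(), 0.0, False, False, {}

# Wrap the Gymnasium environment so it exposes the PufferEnv interface
gymnasium_env = SampleGymnasiumEnv()
puffer_env = pufferlib.emulation.GymnasiumPufferEnv(gymnasium_env)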
The GymnasiumPufferEnv wrapper converts your Gymnasium environment into a PufferEnv that supports vectorized operations and efficient batch processing.

Working with legacy Gym environments

If you’re working with older codebases that use the legacy Gym API, PufferLib provides a compatibility layer.
1. Create your Gym environment

import gym
import pufferlib.emulation

class SampleGymEnv(gym.Env):
    def __init__(self):
        self.observation_space = gym.spaces.Box(low=-1, high=1, shape=(1,))
        self.action_space = gym.spaces.Discrete(2)

    def reset(self):
        return self.observation_space.sample()

    def step(self, action):
        return self.observation_space.sample(), 0.0, False, {}
2. Convert Gym to Gymnasium

gym_env = SampleGymEnv()
gymnasium_env = pufferlib.GymToGymnasium(gym_env)
3. Wrap in PufferEnv

puffer_env = pufferlib.emulation.GymnasiumPufferEnv(gymnasium_env)
observations, info = puffer_env.reset()
action = puffer_env.action_space.sample()
observation, reward, terminal, truncation, info = puffer_env.step(action)
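
To make the API gap that step 2 bridges concrete, here is a minimal pure-Python sketch of a legacy-to-modern adapter. LegacyEnv and LegacyToModernAdapter are hypothetical names used only for illustration, and this is not PufferLib's actual GymToGymnasium implementation. The sketch assumes the legacy done flag is reported as termination and that truncation is always False.

```python
class LegacyEnv:
    """Hypothetical stand-in for an old-style Gym environment."""

    def reset(self):
        # Legacy reset() returns only the observation
        return 0.0

    def step(self, action):
        # Legacy step() returns a 4-tuple with a single `done` flag
        return float(action), 1.0, action == 1, {}


class LegacyToModernAdapter:
    """Illustrative adapter; not PufferLib's actual implementation."""

    def __init__(self, env):
        self.env = env

    def reset(self, seed=None, options=None):
        # seed and options are accepted for API compatibility and ignored here
        return self.env.reset(), {}

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # Assumption: treat `done` as termination and never report truncation
        return obs, reward, done, False, info


adapter = LegacyToModernAdapter(LegacyEnv())
obs, info = adapter.reset()
obs, reward, terminated, truncated, info = adapter.step(1)
print(obs, reward, terminated, truncated, info)
```

The two differences to notice are the extra info dictionary returned by reset() and the split of done into separate terminated and truncated flags in step().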

Environment vectorization

PufferLib’s vectorization capabilities allow you to run multiple environments in parallel for faster data collection. The library supports multiple backends for different use cases.

Serial vectorization

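The serial backend runs environments sequentially in a single process. This is useful for debugging and small-scale experiments. The examples in this section assume SamplePufferEnv is a PufferEnv-compatible environment class defined elsewhere in your code.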
import pufferlib.vector

serial_vecenv = pufferlib.vector.make(
    SamplePufferEnv, 
    num_envs=2, 
    backend=pufferlib.vector.Serial
)

observations, infos = serial_vecenv.reset()
actions = serial_vecenv.action_space.sample()
# step() returns observations, rewards, terminals, truncations, infos
o, r, d, t, i = serial_vecenv.step(actions)

print('Serial VecEnv:')
print('Observations:', o)
print('Rewards:', r)
print('Terminals:', d)
print('Truncations:', t)
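
Conceptually, serial vectorization just steps each environment in turn and groups the results into batches. The following pure-Python sketch shows that idea. CoinFlipEnv and SerialVec are hypothetical names, and this is not how pufferlib.vector.Serial is implemented internally.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment used only for this illustration."""

    def __init__(self, seed):
        self.rng = random.Random(seed)

    def reset(self):
        return self.rng.random(), {}

    def step(self, action):
        obs = self.rng.random()
        reward = 1.0 if action == 1 else 0.0
        return obs, reward, False, False, {}


class SerialVec:
    """Steps each environment in turn and collects the results into batches."""

    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]

    def reset(self):
        results = [env.reset() for env in self.envs]
        observations = [obs for obs, _ in results]
        infos = [info for _, info in results]
        return observations, infos

    def step(self, actions):
        # One action per environment, applied sequentially in this process
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        obs, rewards, terminals, truncations, infos = map(list, zip(*results))
        return obs, rewards, terminals, truncations, infos


vec = SerialVec([lambda s=s: CoinFlipEnv(seed=s) for s in range(2)])
observations, infos = vec.reset()
o, r, d, t, i = vec.step([1, 0])
print('Rewards:', r)
print('Terminals:', d)
print('Truncations:', t)
```

Because everything runs in one process, a breakpoint or stack trace inside any environment is directly reachable, which is what makes this style convenient for debugging.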

Multiprocessing vectorization

For production training, use the multiprocessing backend to run environments in parallel across multiple CPU cores.
vecenv = pufferlib.vector.make(
    SamplePufferEnv,
    num_envs=2, 
    num_workers=2, 
    batch_size=1, 
    backend=pufferlib.vector.Multiprocessing
)

# Asynchronous API for maximum throughput
vecenv.async_reset()
o, r, d, t, i, env_ids, masks = vecenv.recv()

actions = vecenv.action_space.sample()
print('Actions:', actions)
vecenv.send(actions)

# New observations are ready while other envs run in the background
o, r, d, t, i, env_ids, masks = vecenv.recv()
print('Observations:', o)

vecenv.close()
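Make sure num_workers divides num_envs evenly, and that batch_size is a multiple of the number of environments per worker (num_envs / num_workers). The example above satisfies both: each of the 2 workers runs 1 environment, and batch_size=1 is a multiple of 1. PufferLib raises an APIUsageError if these constraints are violated.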
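
As a quick pre-flight check, the sketch below assumes the rule is that num_workers divides num_envs and that batch_size is a multiple of the resulting environments per worker, which is consistent with the example configuration (num_envs=2, num_workers=2, batch_size=1). check_vec_config is a hypothetical helper, not part of PufferLib, and it raises ValueError rather than PufferLib's own exception type.

```python
def check_vec_config(num_envs, num_workers, batch_size):
    """Hypothetical helper: validate a vectorization config before use."""
    if num_envs % num_workers != 0:
        raise ValueError('num_envs must be divisible by num_workers')

    envs_per_worker = num_envs // num_workers
    if batch_size % envs_per_worker != 0:
        raise ValueError('batch_size must be divisible by num_envs / num_workers')

    return envs_per_worker


# The configuration from the multiprocessing example passes the check
print('Envs per worker:', check_vec_config(num_envs=2, num_workers=2, batch_size=1))

# An uneven split of environments across workers is rejected
try:
    check_vec_config(num_envs=3, num_workers=2, batch_size=1)
except ValueError as error:
    print('Invalid config:', error)
```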

Passing environment arguments

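You can customize environment initialization by passing positional arguments (env_args) and keyword arguments (env_kwargs) to each environment's constructor. The example assumes SamplePufferEnv accepts foo and bar and stores them as attributes: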
serial_vecenv = pufferlib.vector.make(
    SamplePufferEnv, 
    num_envs=2, 
    backend=pufferlib.vector.Serial,
    env_args=[3], 
    env_kwargs={'bar': 4}
)
print('Foo:', [env.foo for env in serial_vecenv.envs])  # [3, 3]
print('Bar:', [env.bar for env in serial_vecenv.envs])  # [4, 4]

Handling structured observations

PufferLib automatically flattens structured observation and action spaces (such as Dict, Tuple, and MultiDiscrete) into flat arrays for efficient neural network processing.
import gymnasium
import pufferlib.emulation

class SampleGymnasiumEnv(gymnasium.Env):
    def __init__(self):
        self.observation_space = gymnasium.spaces.Dict({
            'foo': gymnasium.spaces.Box(low=-1, high=1, shape=(2,)),
            'bar': gymnasium.spaces.Box(low=2, high=3, shape=(3,)),
        })
        self.action_space = gymnasium.spaces.MultiDiscrete([2, 5])

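    def reset(self, seed=None, options=None):
        # Gymnasium's reset() accepts optional seed/options and returns (observation, info)
        return self.observation_space.sample(), {}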

    def step(self, action):
        return self.observation_space.sample(), 0.0, False, False, {}

gymnasium_env = SampleGymnasiumEnv()
puffer_env = pufferlib.emulation.GymnasiumPufferEnv(gymnasium_env)
flat_observation, info = puffer_env.reset()
flat_action = puffer_env.action_space.sample()

print(f'PufferLib flattens observations and actions:\n{flat_observation}\n{flat_action}')

Unflattening observations

You can unflatten observations using NumPy or PyTorch. With NumPy, view the flat array with the wrapper's structured dtype:
observation = flat_observation.view(puffer_env.obs_dtype)
print(f'Unflattened with numpy:\n{observation}')
We recommend unflattening observations with PyTorch in your model’s forward pass for better performance and easier integration with neural networks.
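
The NumPy view trick works because a structured dtype and a flat array can describe the same bytes. The sketch below illustrates the idea with NumPy alone. The obs_dtype here is a hypothetical layout for the Dict space above ('bar' with 3 floats, 'foo' with 2 floats); the exact field order and element type PufferLib uses are assumptions.

```python
import numpy as np

# Hypothetical structured dtype for the Dict space: 'bar' (3 floats), 'foo' (2 floats)
obs_dtype = np.dtype([('bar', np.float32, (3,)), ('foo', np.float32, (2,))])

# One structured record holding both fields
structured = np.zeros(1, dtype=obs_dtype)
structured['bar'] = [2.0, 2.5, 3.0]
structured['foo'] = [-1.0, 1.0]

# Flatten: reinterpret the same 20 bytes as five float32 values (no copy)
flat = structured.view(np.float32)
print('Flat:', flat)

# Unflatten: view the flat buffer with the structured dtype again (no copy)
restored = flat.view(obs_dtype)
print('foo:', restored['foo'][0])
print('bar:', restored['bar'][0])
```

Since view() reinterprets memory instead of copying it, flattening and unflattening this way adds essentially no overhead.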

Rendering environments

PufferLib environments support rendering for visualization and debugging. Here’s an example using the built-in Breakout environment:
from pufferlib.ocean.breakout import breakout

env = breakout.Breakout()
env.reset()

while True:
    env.step(env.action_space.sample())
    frame = env.render()
    # Display or save the frame as needed
The render() method returns an RGB array that you can display using your preferred visualization library (matplotlib, OpenCV, etc.).
