Installation

Requirements

Before installing CooperBench, ensure you have:

- Python 3.10+ (Python 3.12+ recommended)
- Execution backend (choose one):
  - Modal - Cloud-based, easiest to set up (default)
  - GCP - Google Cloud Platform for scale
  - Docker - Local execution for development
- Redis - For inter-agent communication in cooperative mode
- LLM API keys - From supported providers (Anthropic, OpenAI, Google Gemini, etc.)
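The requirements above can be checked with a short preflight script. This is a minimal sketch using only the Python standard library. The name `check_requirements` is illustrative and not part of CooperBench, and the script only checks that the `modal`, `gcloud`, and `docker` executables are on PATH, not that they are authenticated or running.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the requirements above
# Backend name -> CLI executable (an assumption based on the commands in this guide).
BACKEND_CLIS = {"modal": "modal", "gcp": "gcloud", "docker": "docker"}


def check_requirements(version_info=sys.version_info, which=shutil.which):
    """Return (python_ok, available_backends) for the current machine."""
    python_ok = tuple(version_info[:2]) >= MIN_PYTHON
    available = [name for name, exe in BACKEND_CLIS.items() if which(exe)]
    return python_ok, available


if __name__ == "__main__":
    python_ok, backends = check_requirements()
    status = "ok" if python_ok else "too old, need 3.10+"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
    print("Backend CLIs found on PATH:", ", ".join(backends) or "none")
```

Only one backend is needed, so an empty list is the case to act on.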

Basic installation

pip install cooperbench

The basic installation includes Modal as the default execution backend. You can add other backends using optional dependencies.

Backend setup

Choose one execution backend based on your needs:

- Modal (Default)
- GCP (Recommended for Scale)
- Docker (Local)

Modal (Default)

Modal provides cloud-based sandboxed execution with minimal setup.

Create Modal account

Authenticate Modal

modal setup

Follow the prompts to authenticate with your Modal account.

Verify installation

modal --help

You should see Modal’s help output.

Modal is the recommended backend for most users. It handles sandboxing, scaling, and infrastructure automatically.

GCP (Recommended for Scale)

Google Cloud Platform provides robust infrastructure for running experiments at scale.

Install gcloud CLI

# macOS with Homebrew; on other platforms, follow Google Cloud's gcloud CLI install instructions
brew install google-cloud-sdk

Install GCP dependencies

pip install 'cooperbench[gcp]'

Run configuration wizard

cooperbench config gcp

The wizard will:

Authenticate with your Google account
Set up a GCP project
Enable required APIs (Batch, Compute, Storage)
Configure credentials and permissions
Run validation tests

Use the --skip-tests flag to skip the validation tests for a faster setup:

cooperbench config gcp --skip-tests

Verify setup

cooperbench run --backend gcp -s lite --help

GCP requires a billing account. You’ll be charged for VM instances and storage used during experiments. See GCP pricing for details.

Docker (Local)

Docker enables local execution for development and testing.

Install Docker

Download and install Docker Desktop from docker.com.

Start Docker daemon

Ensure Docker is running:

docker --version
docker ps

Verify setup

cooperbench run --backend docker --help

The Docker backend is intended for local testing only. It may not provide the same isolation guarantees as Modal or GCP.

Redis setup

Redis is required for inter-agent communication in cooperative mode.

# Run Redis in a Docker container
docker run -d -p 6379:6379 redis:7

# Verify Redis is running
redis-cli ping
# Should return: PONG

For solo experiments, Redis is optional. You can skip this step if you only plan to run single-agent experiments.
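If `redis-cli` is not installed on the host (for example, when Redis runs only inside the Docker container), the same ping check can be done from Python. This is a minimal sketch using a raw socket. `redis_ping` is a hypothetical helper, not a CooperBench function, and it assumes a Redis server without a password or TLS.

```python
import socket


def redis_ping(host="localhost", port=6379, timeout=2.0):
    """Return True if a server at host:port answers PING with +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")  # Redis accepts this inline command form
            reply = conn.recv(64)
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
    return reply.startswith(b"+PONG")


if __name__ == "__main__":
    print("Redis reachable:", redis_ping())
```

A password-protected server replies with an error instead of `+PONG`, so the function returns False in that case even though the server is up.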

LLM API keys

CooperBench supports multiple LLM providers. Configure your API keys in a .env file:

Create .env file

In your project root, create a .env file:

touch .env

Add API keys

Add your provider keys to .env:

.env

# Anthropic (Claude models)
ANTHROPIC_API_KEY=sk-ant-...

# OpenAI (GPT models)
OPENAI_API_KEY=sk-...

# Google (Gemini models)
GEMINI_API_KEY=...

# Optional: Other providers
COHERE_API_KEY=...
TOGETHER_API_KEY=...

Verify configuration

The keys will be automatically loaded when running experiments.

# Test with a small experiment
cooperbench run -n test -r llama_index_task -m gpt-4o --setting solo
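To see which provider keys a .env file defines before spending API credits on a test run, a small check like the following can help. It is a sketch under stated assumptions: the parser handles only plain KEY=VALUE lines and # comments rather than full dotenv syntax, the function names are hypothetical, and it says nothing about how CooperBench itself loads the file.

```python
from pathlib import Path

# Key names taken from the example .env above.
PROVIDER_KEYS = ["ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GEMINI_API_KEY"]


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(path=".env", required=PROVIDER_KEYS):
    """Return the required key names that are absent or empty in the file."""
    values = read_env_file(path)
    return [name for name in required if not values.get(name)]


if __name__ == "__main__":
    if Path(".env").exists():
        # Only key names are printed, never the secret values.
        print("Missing or empty keys:", missing_keys() or "none")
    else:
        print("No .env file in the current directory")
```

A key reported as missing matters only if you plan to use that provider's models.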

Never commit your .env file to version control. Add it to .gitignore:

echo ".env" >> .gitignore

Dataset download

Download the CooperBench dataset from HuggingFace:

git clone https://huggingface.co/datasets/cooperbench/cooperbench dataset/

This will create a dataset/ directory containing all 652 tasks across 12 repositories.

Dataset structure

dataset/
  llama_index_task/
    task1/
      setup.sh
      run_tests.sh
      feature1/
        feature.md
        feature.patch
        tests.patch
      feature2/
        ...
    task2/
      ...
  dspy_task/
    ...
  # ... 12 repositories total
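The layout above can be walked programmatically to confirm the clone is complete. The sketch below counts task and feature directories per repository. It assumes directory names start with `task` and `feature` as in the tree shown, which may not hold for every entry in the real dataset, and `summarize_dataset` is an illustrative name.

```python
from pathlib import Path


def summarize_dataset(root="dataset"):
    """Map each repository directory to (task_count, feature_count)."""
    summary = {}
    for repo in sorted(Path(root).iterdir()):
        if not repo.is_dir() or repo.name.startswith("."):
            continue  # skip files and hidden folders such as .git
        tasks = [t for t in repo.iterdir()
                 if t.is_dir() and t.name.startswith("task")]
        features = [f for t in tasks for f in t.iterdir()
                    if f.is_dir() and f.name.startswith("feature")]
        summary[repo.name] = (len(tasks), len(features))
    return summary


if __name__ == "__main__":
    if Path("dataset").is_dir():
        for name, (n_tasks, n_features) in summarize_dataset().items():
            print(f"{name}: {n_tasks} task dirs, {n_features} feature dirs")
    else:
        print("dataset/ not found; run the git clone step first")
```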

Verify installation

Confirm everything is set up correctly:

# Check that the CooperBench CLI is installed
cooperbench --help

# Check Python version
python --version  # Should be 3.10+

# Check backend
modal --version  # or docker --version, or gcloud --version

# Check Redis connection
redis-cli ping  # Should return PONG

# List available tasks
ls dataset/  # Should show repository directories

If all commands succeed, you’re ready to run experiments!
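The checklist confirms that the CLI runs but does not print a version number. One way to read the installed version is from the package metadata, as sketched below with the standard library. It assumes the distribution name is `cooperbench`, matching the pip command in this guide; `installed_version` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="cooperbench"):
    """Return the installed version string, or None if not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print("cooperbench version:", found or "not installed in this environment")
```

Run it with the same Python interpreter you used for `pip install`, since each environment has its own metadata.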

Optional dependencies

CooperBench provides several optional dependency groups:

| Group | Install | Purpose |
| --- | --- | --- |
| `gcp` | `pip install 'cooperbench[gcp]'` | Google Cloud Platform backend |
| `swe-agent` | `pip install 'cooperbench[swe-agent]'` | SWE-agent framework support |
| `dev` | `pip install 'cooperbench[dev]'` | Development tools (pytest, mypy, ruff) |
| `all` | `pip install 'cooperbench[all]'` | All optional dependencies |
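To see which groups your installed release actually declares, the package metadata can be queried. This is a sketch, assuming the groups in the table are published as standard Python extras under the `cooperbench` distribution; `declared_extras` is an illustrative name.

```python
from importlib.metadata import PackageNotFoundError, metadata


def declared_extras(dist_name="cooperbench"):
    """Return the sorted extras a distribution declares, or [] if not installed."""
    try:
        meta = metadata(dist_name)
    except PackageNotFoundError:
        return []
    # Each extra appears as a "Provides-Extra" header in the metadata.
    return sorted(meta.get_all("Provides-Extra") or [])


if __name__ == "__main__":
    print("Declared extras:", declared_extras() or "none found")
```

This lists what the package offers, not which extras are currently installed.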

Troubleshooting

Python version errors

CooperBench requires Python 3.10 or higher. Check your version:

python --version

If needed, install a newer Python version:

# Using pyenv
pyenv install 3.12
pyenv global 3.12

# Or download from python.org

Modal authentication fails

Re-run the authentication step from the Modal setup section and follow the prompts:

modal setup

Redis connection errors

If agents can’t connect to Redis:

Verify Redis is running: redis-cli ping

Check the Redis URL in your command:

cooperbench run --redis redis://localhost:6379 ...

For cloud Redis, use the full connection string:

cooperbench run --redis redis://user:pass@host:port ...
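A malformed connection string is a common cause of these errors, so it can help to check the URL before passing it to `--redis`. The sketch below parses it with the standard library. `parse_redis_url` is a hypothetical helper, the default port 6379 is Redis's standard port, and accepting the `rediss` (TLS) scheme is an assumption that CooperBench may not share.

```python
from urllib.parse import urlparse


def parse_redis_url(url):
    """Split a Redis URL into its parts, raising ValueError if it is malformed."""
    parts = urlparse(url)
    if parts.scheme not in ("redis", "rediss"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("missing host")
    return {
        "host": parts.hostname,
        "port": parts.port or 6379,  # .port raises ValueError if not numeric
        "username": parts.username,
        "password_set": parts.password is not None,  # never echo the secret
        "tls": parts.scheme == "rediss",
    }


if __name__ == "__main__":
    print(parse_redis_url("redis://localhost:6379"))
```

Passing the placeholder string `redis://user:pass@host:port` unchanged raises ValueError because `port` is not a number, which catches an unedited copy of the example.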

LLM API key errors

If you get authentication errors:

Verify that the .env file exists and contains your keys
Check that each key's format matches the provider's requirements
Test the key directly against the provider's API
Ensure the .env file is in the working directory where you run commands

Next steps

Quick start

Run your first CooperBench experiment

CLI reference

Explore all available commands and options
