
Requirements

Before installing CooperBench, ensure you have:
  • Python 3.10+ (Python 3.12+ recommended)
  • Execution backend (choose one):
    • Modal - Cloud-based, easiest to set up (default)
    • GCP - Google Cloud Platform for scale
    • Docker - Local execution for development
  • Redis - For inter-agent communication in cooperative mode
  • LLM API keys - From supported providers (Anthropic, OpenAI, Google Gemini, etc.)

Basic installation

pip install cooperbench
The basic installation includes Modal as the default execution backend. You can add other backends using optional dependencies.

Backend setup

Choose one execution backend based on your needs:

Redis setup

Redis is required for inter-agent communication in cooperative mode.
# Run Redis in a Docker container
docker run -d -p 6379:6379 redis:7

# Verify Redis is running
redis-cli ping
# Should return: PONG
For solo experiments, Redis is optional. You can skip this step if you only plan to run single-agent experiments.
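If redis-cli is not installed on the machine that runs the agents, the same check can be done from Python with the standard library alone. The following is a minimal sketch, not part of CooperBench: the function name redis_is_up is hypothetical, and it assumes an unauthenticated Redis server (a server that requires a password replies with an error, which this check reports as False).

```python
import socket


def redis_is_up(host: str = "localhost", port: int = 6379, timeout: float = 2.0) -> bool:
    """Return True if a server at host:port answers PING with +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            # Redis accepts plain-text "inline" commands terminated by CRLF.
            conn.sendall(b"PING\r\n")
            reply = conn.recv(64)
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
    return reply.startswith(b"+PONG")
```

Calling redis_is_up() with no arguments probes localhost:6379, the port published by the docker run command above.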

LLM API keys

CooperBench supports multiple LLM providers. Configure your API keys in a .env file:
1. Create .env file

In your project root, create a .env file:
touch .env
2. Add API keys

Add your provider keys to .env:
.env
# Anthropic (Claude models)
ANTHROPIC_API_KEY=sk-ant-...

# OpenAI (GPT models)
OPENAI_API_KEY=sk-...

# Google (Gemini models)
GEMINI_API_KEY=...

# Optional: Other providers
COHERE_API_KEY=...
TOGETHER_API_KEY=...
3. Verify configuration

The keys are loaded automatically when you run an experiment. Run a small solo experiment to confirm they work:
# Test with a small experiment
cooperbench run -n test -r llama_index_task -m gpt-4o --setting solo
Never commit your .env file to version control. Add it to .gitignore:
echo ".env" >> .gitignore
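Before spending API credits on a test run, it can help to confirm that .env actually defines the keys you expect. The sketch below is a standalone check and does not reproduce how CooperBench loads keys; the helper names parse_env and missing_keys are hypothetical, and the key list is copied from the example .env above.

```python
from pathlib import Path

# Key names taken from the example .env above; adjust to the providers you use.
PROVIDER_KEYS = ["ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GEMINI_API_KEY"]


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(env_path: str = ".env", required=PROVIDER_KEYS) -> list[str]:
    """Return the required key names that are absent or empty in env_path."""
    path = Path(env_path)
    if not path.is_file():
        return list(required)
    values = parse_env(path.read_text())
    return [name for name in required if not values.get(name)]
```

Running missing_keys() from the project root returns an empty list when every listed key has a value. Only the keys for the providers you plan to use need to be present, so trim PROVIDER_KEYS accordingly.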

Dataset download

Download the CooperBench dataset from HuggingFace:
git clone https://huggingface.co/datasets/cooperbench/cooperbench dataset/
This creates a dataset/ directory containing all 652 tasks across 12 repositories:
dataset/
  llama_index_task/
    task1/
      setup.sh
      run_tests.sh
      feature1/
        feature.md
        feature.patch
        tests.patch
      feature2/
        ...
    task2/
      ...
  dspy_task/
    ...
  # ... 12 repositories total
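To sanity-check the clone, the layout above can be walked with a few lines of Python. This is a minimal sketch that assumes the naming shown in the tree (repository directories containing task* directories, each containing feature* directories); the function name summarize_dataset is hypothetical, and the sketch makes no claim about what the totals should be.

```python
from pathlib import Path


def summarize_dataset(root: str = "dataset") -> dict[str, dict[str, int]]:
    """Map each repository directory to its task and feature counts."""
    summary = {}
    for repo in sorted(Path(root).iterdir()):
        # Skip files and hidden directories such as .git from the clone.
        if not repo.is_dir() or repo.name.startswith("."):
            continue
        tasks = [t for t in repo.iterdir() if t.is_dir() and t.name.startswith("task")]
        features = sum(
            1
            for t in tasks
            for f in t.iterdir()
            if f.is_dir() and f.name.startswith("feature")
        )
        summary[repo.name] = {"tasks": len(tasks), "features": features}
    return summary
```

For example, `for name, counts in summarize_dataset().items(): print(name, counts)` prints one line per repository directory found under dataset/.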

Verify installation

Confirm everything is set up correctly:
# Check that the cooperbench CLI is installed
cooperbench --help

# Check Python version
python --version  # Should be 3.10+

# Check backend
modal --version  # or docker --version, or gcloud --version

# Check Redis connection
redis-cli ping  # Should return PONG

# List available tasks
ls dataset/  # Should show repository directories
If all commands succeed, you’re ready to run experiments!
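The first few checks can also be bundled into one script. The sketch below only tests that the Python version is new enough and that the executables named above are on PATH; it does not run them, so it cannot tell whether a backend is authenticated or whether Redis is listening. The function name preflight is hypothetical.

```python
import shutil
import sys

# Executables named in the checks above; only one backend CLI needs to be present.
BACKEND_CLIS = ["modal", "docker", "gcloud"]


def preflight(min_python=(3, 10)) -> dict[str, bool]:
    """Report which prerequisites are visible from the current environment."""
    return {
        "python": sys.version_info >= min_python,
        "cooperbench": shutil.which("cooperbench") is not None,
        "backend": any(shutil.which(cli) for cli in BACKEND_CLIS),
        "redis-cli": shutil.which("redis-cli") is not None,
    }
```

Any entry that comes back False points to the matching command in the list above as the one to investigate first. A False for redis-cli is harmless if you only run solo experiments.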

Optional dependencies

CooperBench provides several optional dependency groups:
Group      Install                               Purpose
gcp        pip install 'cooperbench[gcp]'        Google Cloud Platform backend
swe-agent  pip install 'cooperbench[swe-agent]'  SWE-agent framework support
dev        pip install 'cooperbench[dev]'        Development tools (pytest, mypy, ruff)
all        pip install 'cooperbench[all]'        All optional dependencies

Troubleshooting

CooperBench requires Python 3.10 or higher. Check your version:
python --version
If needed, install a newer Python version:
# Using pyenv
pyenv install 3.12
pyenv global 3.12

# Or download from python.org
If agents can’t connect to Redis:
  1. Verify Redis is running: redis-cli ping
  2. Check the Redis URL in your command:
    cooperbench run --redis redis://localhost:6379 ...
    
  3. For cloud Redis, use the full connection string:
    cooperbench run --redis redis://user:pass@host:port ...
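A malformed connection string is a common cause of connection failures, and it can be checked before it reaches the CLI. The sketch below uses the standard library's urlparse to split a Redis URL into its parts. The function name describe_redis_url is hypothetical, and accepting the rediss:// (TLS) scheme is an assumption based on common Redis client conventions, not something this guide states about CooperBench.

```python
from urllib.parse import urlparse


def describe_redis_url(url: str) -> dict:
    """Split a redis:// URL into the parts a client would connect with."""
    parts = urlparse(url)
    # Assumption: rediss:// is allowed here because many Redis clients use it for TLS.
    if parts.scheme not in ("redis", "rediss"):
        raise ValueError(f"expected redis:// or rediss://, got {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no host")
    return {
        "host": parts.hostname,
        # parts.port raises ValueError if the port is not a number.
        "port": parts.port or 6379,  # 6379 is the Redis default port
        "username": parts.username,
        "has_password": parts.password is not None,
    }
```

A ValueError here usually means a placeholder such as host or port was left in the string, or that the password contains characters that need URL encoding.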
    
If you get authentication errors:
  1. Verify .env file exists and contains keys
  2. Check key format matches provider requirements
  3. Test the key directly with the provider’s API
  4. Ensure .env is in the working directory where you run commands

Next steps

  • Quick start - Run your first CooperBench experiment
  • CLI reference - Explore all available commands and options