Overview
Backends determine where agent tasks and test evaluations run:
- Modal - Cloud execution (default, easiest)
- Docker - Local containers (no cloud account needed)
- GCP - Google Cloud Platform VMs (scalable, custom infrastructure)
Selecting a backend
Use the --backend flag with the run or eval commands.
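The original command examples are not preserved in this section, so the following is only a minimal sketch of how the flag combines with the two commands. The executable name cooperbench (inferred from the config path used by the GCP backend) and the lowercase backend values are assumptions; only --backend, -c, run, and eval come from this page.

```python
import shlex

# Assumed for illustration: the "cooperbench" executable name and the
# lowercase backend values are not confirmed by this page.
BACKENDS = ("modal", "docker", "gcp")


def build_command(subcommand, backend="modal", concurrency=None, program="cooperbench"):
    """Return the argv list for a run or eval call with an explicit backend."""
    if subcommand not in ("run", "eval"):
        raise ValueError(f"unsupported subcommand: {subcommand!r}")
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    argv = [program, subcommand, "--backend", backend]
    if concurrency is not None:
        # -c is the concurrency flag mentioned in the performance tips.
        argv += ["-c", str(concurrency)]
    return argv


if __name__ == "__main__":
    # Print the commands instead of executing them, so the sketch has no side effects.
    print(shlex.join(build_command("run", backend="docker", concurrency=2)))
    print(shlex.join(build_command("eval", backend="gcp")))
```

Building the argument list separately from executing it makes it easy to inspect a command before passing it to something like subprocess.run.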
Modal (default)
Overview
Modal is a serverless cloud platform that runs tasks in ephemeral containers.
Pros:
- Minimal setup
- Scales automatically
- Fast cold starts
- Pay only for usage
Cons:
- Requires Modal account
- Internet connection required
- Costs money (free tier available)
Setup
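- Create Modal account: Visit modal.com and sign up.
- Install Modal: Install the Modal client on your machine.
- Authenticate: Run Modal's authentication command. This opens your browser to authenticate.
- Run tasks: Run or evaluate tasks as usual; Modal is the default backend.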
Configuration
No additional configuration needed. Modal authenticates via ~/.modal.toml.
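Since authentication depends on ~/.modal.toml, a quick pre-flight check can tell you whether the authentication step has been completed. This is a rough sketch: the helper names are hypothetical, and the check only confirms that the file exists and is non-empty, not that the stored token is still valid.

```python
from pathlib import Path


def modal_config_path(home=None):
    """Return the path of Modal's config file under the given home directory."""
    base = Path(home) if home is not None else Path.home()
    return base / ".modal.toml"


def modal_looks_configured(home=None):
    """True if the config file exists and is non-empty (a heuristic, not a token check)."""
    path = modal_config_path(home)
    return path.is_file() and path.stat().st_size > 0


if __name__ == "__main__":
    if modal_looks_configured():
        print(f"Found {modal_config_path()}")
    else:
        print("No Modal credentials found; repeat the authentication step.")
```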
Example usage
Docker (local)
Overview
The Docker backend runs tasks in local containers using your machine’s resources.
Pros:
- No cloud account needed
- Works offline
- No usage costs
- Full control
Cons:
- Limited by local resources
- Must have Docker installed
- Slower than cloud for large workloads
Setup
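- Install Docker: Install Docker Desktop or Docker Engine for your platform.
- Start Docker daemon: Make sure Docker Desktop is running, or start the Docker service manually.
- Verify installation: Confirm that the docker command works from your terminal.
- Run tasks: Select the Docker backend with the --backend flag.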
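The install, start, and verify steps above can be checked from a script. The sketch below is one possible pre-flight check, not part of the tool itself: docker_status is a hypothetical helper that looks for the docker CLI on PATH and then calls docker info, which normally exits non-zero when the daemon cannot be reached.

```python
import shutil
import subprocess


def docker_status(which=shutil.which, run=subprocess.run):
    """Classify the local Docker setup as 'missing', 'stopped', or 'ready'.

    `which` and `run` are injectable so the check can be exercised without Docker.
    """
    if which("docker") is None:
        return "missing"  # docker CLI is not on PATH
    try:
        result = run(["docker", "info"], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return "stopped"  # CLI could not be executed or did not respond in time
    # A zero exit code means the CLI reached the daemon.
    return "ready" if result.returncode == 0 else "stopped"


if __name__ == "__main__":
    print(f"Docker status: {docker_status()}")
```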
Configuration
Optional environment variable:
Example usage
Performance tips
- Limit concurrency: Use -c 2 or -c 5 to avoid overwhelming your machine
- Increase Docker resources: In Docker Desktop, allocate more CPUs/memory
- Clean up containers: Run docker system prune periodically
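As a rough illustration of the concurrency tip, the value passed to -c could be derived from the machine's CPU count. The heuristic below (about half the CPUs, clamped to the 2 to 5 range suggested above) is an assumption made for this sketch, not a rule from the tool; adjust it to your workload and memory.

```python
import os


def suggested_concurrency(cpu_count=None, low=2, high=5):
    """Pick a -c value between low and high, roughly half the available CPUs."""
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(low, min(high, cpus // 2))


if __name__ == "__main__":
    print(f"Suggested value for -c: {suggested_concurrency()}")
```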
GCP (Google Cloud Platform)
Overview
The GCP backend runs tasks on Google Cloud VMs using Cloud Batch.
Pros:
- Highly scalable
- Custom machine types
- Integrate with GCP infrastructure
- More control than Modal
Cons:
- Requires GCP account and billing
- More complex setup
- Need to manage quotas and resources
Setup
- Run configuration wizard: This interactive wizard:
  - Checks for the gcloud CLI
  - Authenticates with GCP
  - Configures project, region, zone
  - Validates API access
- Enable required APIs: The APIs the backend relies on, including Cloud Batch, must be enabled in your GCP project.
- Set up billing: Ensure your project has billing enabled.
- Run tasks: Select the GCP backend with the --backend flag.
Configuration
Configuration is stored in ~/.config/cooperbench/config.json.
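The contents of the config file are not shown in this section, so the sketch below makes no assumption about its keys. It only loads the file from the path given above and lists whatever top-level keys the wizard wrote; load_config is a hypothetical helper, and the file is assumed to hold a JSON object.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".config" / "cooperbench" / "config.json"


def load_config(path=CONFIG_PATH):
    """Return the parsed config, or None if the file does not exist."""
    path = Path(path)
    if not path.is_file():
        return None
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    config = load_config()
    if config is None:
        print(f"No config found at {CONFIG_PATH}; run the configuration wizard first.")
    else:
        # Print key names only, since values may include project identifiers.
        print("Configured keys:", ", ".join(sorted(config)))
```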
Example usage
Performance tips
- Choose optimal region: Use regions close to your location or data
- Check quotas: GCP has default quotas; request increases if needed
- Monitor costs: Use GCP billing console to track spending
Choosing a backend
| Factor | Modal | Docker | GCP |
|---|---|---|---|
| Setup complexity | Easy | Easy | Medium |
| Cost | Pay-per-use | Free | Pay-per-use |
| Scalability | High | Low | High |
| Internet required | Yes | No | Yes |
| Requires account | Yes | No | Yes |
| Best for | Quick experiments | Local dev/testing | Production workloads |
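The table can be read as a small decision procedure. The sketch below encodes one such reading; choose_backend and its parameters are hypothetical, and the returned lowercase names are assumed values for the --backend flag.

```python
def choose_backend(has_internet=True, has_cloud_account=True, production=False):
    """Map the factors in the comparison table to a backend name."""
    # Docker is the only column that needs neither internet nor an account.
    if not has_internet or not has_cloud_account:
        return "docker"
    # The table lists GCP as best for production workloads.
    if production:
        return "gcp"
    # Modal is the default and the table's pick for quick experiments.
    return "modal"


if __name__ == "__main__":
    print(choose_backend())                    # quick experiment with a cloud account
    print(choose_backend(has_internet=False))  # offline development
    print(choose_backend(production=True))     # production workload
```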
Recommendations
For quick experiments: Use Modal. Minimal setup, fast, scales automatically.
Backend-specific features
Modal
- Automatic retries on failure
- Distributed tracing in Modal dashboard
- GPU support (if configured)
Docker
- Use custom Docker images
- Mount local volumes for debugging
- Offline operation
GCP
- Custom machine types
- Persistent disk support
- VPC networking
- Integrate with Cloud Storage, BigQuery, etc.
Troubleshooting
Modal issues
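“Not authenticated”: Repeat the authentication step from the Modal setup section.
For failures under heavy load, reduce concurrency with the -c flag.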