Setting comparison
Cooperative setting
2 agents collaborate on separate features with communication
Solo setting
1 agent implements both features sequentially
Quick comparison
| Aspect | Cooperative | Solo |
|---|---|---|
| Number of agents | 2 | 1 |
| Features per agent | 1 | 2 |
| Total workload | Same | Same |
| Communication | Redis messaging | None |
| Git collaboration | Optional | N/A |
| Concurrency | Parallel execution | Sequential |
| Complexity | High (coordination) | Low (isolation) |
Cooperative setting
In cooperative mode, two agents work simultaneously on separate features, simulating a team development scenario.
Architecture
How it works
Feature assignment
Each of the two features is assigned to a separate agent. Agents work in isolated sandboxes.
Parallel execution
Both agents start simultaneously and work in parallel, implementing their assigned features.
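The two steps above can be sketched as a pair of concurrent tasks. This is a minimal illustration, not the benchmark's actual runner: `run_agent`, `run_cooperative`, the agent IDs, and the feature names are all hypothetical, and the sleep stands in for real implementation work.

```python
import asyncio


async def run_agent(agent_id: str, feature: str) -> dict:
    """Hypothetical stand-in for one agent working in its own sandbox."""
    await asyncio.sleep(0.01)  # placeholder for the real implementation work
    return {"agent": agent_id, "feature": feature, "status": "done"}


async def run_cooperative(features: list[str]) -> list[dict]:
    """Assign one feature per agent, start all agents together, wait for all."""
    tasks = [
        run_agent(f"agent{i + 1}", feature)
        for i, feature in enumerate(features)
    ]
    # gather() runs the coroutines concurrently and keeps the input order
    return await asyncio.gather(*tasks)


results = asyncio.run(run_cooperative(["feature_a", "feature_b"]))
for result in results:
    print(result["agent"], "->", result["feature"], result["status"])
```

A solo run would instead await the two features one after the other inside a single agent, which is what makes it sequential.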
Running cooperative mode
Communication mechanisms
Agents in cooperative mode have two ways to collaborate:
- Redis messaging
- Git collaboration
Redis messaging is the default communication channel for inter-agent messages.
Features:
- Async message passing
- Namespaced by run ID
- Messages appear in agent context
- Tracked in conversation logs
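The properties listed above can be illustrated with a small in-memory stand-in. This sketch does not use Redis and is not the benchmark's implementation: the class name, the `run_id:inbox:agent_id` key layout, and the message fields are assumptions chosen only to show namespacing by run ID, asynchronous delivery, and logging.

```python
from collections import defaultdict, deque


class InMemoryMessageBus:
    """Toy stand-in for the messaging channel (hypothetical, no Redis involved)."""

    def __init__(self, run_id: str):
        self.run_id = run_id
        self._queues = defaultdict(deque)  # one inbox per namespaced key
        self.log = []                      # every message, for conversation logs

    def _key(self, agent_id: str) -> str:
        # Assumed key layout: prefixing with the run ID keeps runs isolated
        return f"{self.run_id}:inbox:{agent_id}"

    def send(self, sender: str, recipient: str, text: str) -> None:
        """Queue a message; the sender does not wait for the recipient."""
        message = {"from": sender, "to": recipient, "text": text}
        self._queues[self._key(recipient)].append(message)
        self.log.append(message)

    def receive(self, agent_id: str) -> list:
        """Drain pending messages so they can be added to the agent's context."""
        queue = self._queues[self._key(agent_id)]
        messages = list(queue)
        queue.clear()
        return messages


bus = InMemoryMessageBus("run-001")
bus.send("agent1", "agent2", "I am editing parser.py")
inbox = bus.receive("agent2")
print(inbox[0]["text"])
```

Draining the inbox on each read mirrors the idea that new messages are surfaced in the agent's context once, while the log keeps the full history.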
Research shows agents spend up to 20% of their budget on messaging, reducing conflicts but not significantly improving success rates.
Configuration options
Solo setting
In solo mode, a single agent implements both features sequentially, providing a baseline without coordination overhead.
Architecture
How it works
Sequential implementation
The agent implements both features in a single session, with full context of both requirements.
Running solo mode
Advantages
No coordination overhead
Agent doesn’t need to communicate or merge with others
Full context
Agent sees both feature requirements upfront
Simpler execution
No messaging, git servers, or merge conflicts
Baseline performance
Shows the maximum performance achievable without coordination
When to use each setting
- Use cooperative when...
- Use solo when...
Understanding the coordination deficit
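The performance gap between settings reveals coordination challenges.
Coordination deficit formula: deficit = (solo success rate - cooperative success rate) / solo success rate
Example: If solo achieves 50% and coop achieves 25%, the deficit is (50 - 25) / 50 = 50%.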
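The coordination deficit can be computed as the relative drop from the solo success rate to the cooperative one. The sketch below assumes that definition, which is inferred from the figures reported on this page (50% solo and 25% cooperative giving a 50% deficit); the function name is hypothetical.

```python
def coordination_deficit(solo_rate: float, coop_rate: float) -> float:
    """Relative performance lost when moving from solo to cooperative.

    Assumed definition: (solo - coop) / solo, with rates given as fractions.
    """
    if solo_rate <= 0:
        raise ValueError("solo_rate must be positive to compute a relative deficit")
    return (solo_rate - coop_rate) / solo_rate


deficit = coordination_deficit(0.50, 0.25)
print(f"Coordination deficit: {deficit:.0%}")  # prints "Coordination deficit: 50%"
```

Because the deficit is relative to the solo baseline, it can be compared across models whose absolute success rates differ.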
Research findings
GPT-4o performance
- Solo: ~50% success rate
- Cooperative: ~25% success rate
- Deficit: 50% performance loss due to coordination
Claude Sonnet 4.5 performance
- Solo: ~45% success rate
- Cooperative: ~22% success rate
- Deficit: 51% performance loss due to coordination
Communication impact
- Agents use 10-20% of budget on messaging
- Reduces merge conflicts by ~15%
- Does not improve overall success rates
- Indicates communication quality issues
Output structure
Results are organized differently per setting:
Comparing results
After running both settings, compare results:
What’s next?
System architecture
Learn how settings are executed under the hood
Run experiments
Start running benchmarks with different settings
CLI reference
Complete command options for both settings
Dataset overview
Explore the benchmark task structure