Available Presets
Tutorial
1-year horizon • Tests basic loop execution
Forgiving environment for testing basic CLI discovery and the accept → assign → dispatch → resume loop.
Easy
1-year horizon • Tests throughput awareness
Single-domain tasks with moderate deadlines. Tests whether agents understand that parallelism dilutes throughput.
Medium
1-year horizon • Tests domain specialization
Prestige ladder active, 2-domain tasks. Agents must specialize in a few domains to unlock higher-reward tiers.
Hard
1-year horizon • Tests capacity planning
Tight deadlines, heavy penalties, limited runway. Requires precise ETA calculation and conservative task acceptance.
Nightmare
1-year horizon • Tests sustained perfect play
Razor-thin margins, aggressive compounding, steep prestige requirements. One mistake cascades into bankruptcy.
Preset Comparison
The table below highlights the key differences between presets:

| Parameter | Tutorial | Easy | Medium | Hard | Nightmare | Default |
|---|---|---|---|---|---|---|
| Starting Funds | $250,000 | $200,000 | $150,000 | $100,000 | $80,000 | $150,000 |
| Horizon | 1 year | 1 year | 1 year | 1 year | 1 year | 3 years |
| Prestige Mode | Constant 1 | Tri(1,4,1) | Tri(1,7,3) | Tri(1,8,4) | Tri(1,10,5) | Tri(1,10,4) |
| Domain Count | Constant 1 | Constant 1 | Tri(1,3,2) | Tri(1,3,2) | Tri(1,3,2) | Tri(1,3,2) |
| Required Qty | Tri(300,1200,600) | Tri(500,2000,1000) | Tri(700,3000,1500) | Tri(1000,4000,2000) | Tri(1200,5000,2500) | Tri(800,4000,2000) |
| Deadline (qty/day) | 50 | 100 | 150 | 220 | 220 | 200 |
| Fail Penalty | 0.3× | 0.8× | 1.0× | 1.4× | 2.0× | 1.4× |
| Cancel Penalty | 0.5× | 1.2× | 1.5× | 2.0× | 2.5× | 2.0× |
| Salary Bump % | 0% | 0.5% | 1% | 1% | 2% | 1% |
| Reward Scale | 0.2 | 0.3 | 0.45 | 0.55 | 0.7 | 0.55 |
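
The Tri(low, high, mode) entries in the table are triangular distributions. As a rough illustration of what such an entry means, the sketch below draws samples with Python's standard `random.triangular`, which happens to take its arguments in the same (low, high, mode) order. This is not the benchmark's own sampler: whether the simulator rounds values, or which random source it uses, is not stated on this page, and the `sample_tri` helper is a name introduced here for illustration only.

```python
import random


def sample_tri(low, high, mode, rng):
    """Draw one value from Tri(low, high, mode) using the given RNG."""
    # random.triangular takes (low, high, mode), matching the table's notation.
    return rng.triangular(low, high, mode)


rng = random.Random(0)  # seeded so repeated runs draw the same values

# Default preset, Required Qty: Tri(800, 4000, 2000)
samples = [sample_tri(800, 4000, 2000, rng) for _ in range(10_000)]
mean_qty = sum(samples) / len(samples)

# The mean of a triangular distribution is (low + high + mode) / 3.
expected_mean = (800 + 4000 + 2000) / 3

print(f"sample mean: {mean_qty:.0f}, theoretical mean: {expected_mean:.0f}")
```

Because the mode (2000) sits below the midpoint of the range, the mean lands above the mode, so a typical task is somewhat larger than the "mode task" figure alone suggests.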
Notation:
Tri(low, high, mode) = triangular distribution with the given parameters. See Parameters for details.
What Each Preset Tests
Tutorial
Key Question: Can the agent execute the basic loop?
- Starting runway: ~16 months with 10 employees
- Monthly payroll: ~$15K
- Mode task: 1 domain × 600 units, 7-day deadline
- A single mid-tier employee can finish in 7.4 days
- Does the agent discover the CLI commands?
- Does it call sim resume to advance time?
- Can it read JSON output and act on it?
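
The runway figure can be approximated as starting funds divided by monthly payroll. The sketch below is a back-of-the-envelope calculation under that assumption; it ignores revenue, penalties, and salary bumps, and `runway_months` is a hypothetical helper, not part of the benchmark.

```python
def runway_months(starting_funds, monthly_payroll):
    """Months of payroll the starting funds cover, ignoring all revenue."""
    if monthly_payroll <= 0:
        raise ValueError("monthly_payroll must be positive")
    return starting_funds / monthly_payroll


# Tutorial figures from this page: $250,000 starting funds, ~$15K monthly payroll.
tutorial_runway = runway_months(250_000, 15_000)
print(f"Tutorial runway: about {tutorial_runway:.1f} months")
```

Since the payroll figure is itself approximate, the result should be read as a rough estimate in the range of the runway quoted above rather than an exact value.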
Easy
Key Question: Does the agent understand throughput dilution?
- Starting runway: ~7.8 months with 10 employees
- Monthly payroll: ~$32K
- Mode task: 1 domain × 1000 units, 10-day deadline
- Team throughput on 1 task: 230 units/day → 3 days
- On 4 parallel tasks: 57 units/day → 12 days (FAIL)
- Does the agent understand that parallel tasks split employee rates?
- Does it keep ≤2 tasks active at a time?
- Can it sequence tasks rather than batch?
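
Throughput dilution can be sketched as follows, assuming the whole team works on every active task and its rate is split evenly between them (the text says parallel tasks split employee rates; the even split is an assumption here). The function names are hypothetical; the 230 units/day team rate, 1000-unit mode task, and 10-day deadline are the Easy figures listed above.

```python
def per_task_rate(team_rate, active_tasks):
    """Units/day each task receives when the team rate is split evenly."""
    return team_rate / active_tasks


def meets_deadline(required_qty, team_rate, active_tasks, deadline_days):
    """True if a task of required_qty finishes within deadline_days."""
    rate = per_task_rate(team_rate, active_tasks)
    return required_qty / rate <= deadline_days


TEAM_RATE = 230      # units/day on a single task (Easy preset figure)
MODE_QTY = 1000      # mode task size (Easy preset figure)
DEADLINE_DAYS = 10   # mode task deadline (Easy preset figure)

for n in (1, 2, 4):
    ok = meets_deadline(MODE_QTY, TEAM_RATE, n, DEADLINE_DAYS)
    print(f"{n} active task(s): {per_task_rate(TEAM_RATE, n):.1f} units/day each, "
          f"{'fits' if ok else 'misses'} the deadline")
```

Under these assumptions one or two concurrent mode tasks fit the deadline and four do not, which is why sequencing beats batching in this preset.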
Medium
Key Question: Can the agent climb the prestige ladder strategically?
- Starting runway: ~7.8 months with 10 employees
- Monthly payroll: ~$32K
- Mode task: 2 domains × 1500 units, 10-day deadline
- Prestige-1 reward: ~$30K
- Prestige-4 reward: $70K
- Does the agent understand prestige gates market access?
- Does it specialize in 2–3 domains rather than spreading thin?
- Can it handle 2-domain task assignments effectively?
Hard
Key Question: Can the agent compute ETAs and never overcommit?
- Starting runway: ~5.4 months with 10 employees
- Monthly payroll: ~$46K
- Mode task: 2 domains × 2000 units, 9-day deadline
- Split 4+3 employees: finishes in 8.7 days (just fits!)
- Dispatching a second task splits all rates → both tasks miss
- Can the agent estimate completion time vs. deadline?
- Does it understand that new dispatches degrade existing tasks?
- Can it manage cash flow with 5.4-month runway?
- Does it resist “tempting” high-reward tasks it can’t finish?
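
The ETA check for a 2-domain task can be sketched as below. It assumes that each domain needs the full required quantity, that the task finishes when its slowest domain finishes, and that a second dispatch halves every rate. The domain names and per-domain rates are hypothetical, chosen only so the slower group reproduces the 8.7-day figure above; the 2000-unit quantity and 9-day deadline are the Hard figures from this page.

```python
def task_eta_days(required_qty, domain_rates):
    """ETA in days, assuming the slowest domain determines completion."""
    return max(required_qty / rate for rate in domain_rates.values())


def diluted(domain_rates, active_tasks):
    """Per-domain rates after an even split across active_tasks tasks."""
    return {domain: rate / active_tasks for domain, rate in domain_rates.items()}


HARD_QTY = 2000        # units per domain (Hard mode task)
HARD_DEADLINE = 9      # days (Hard mode task)

# Hypothetical group rates in units/day for a 4+3 employee split.
rates = {"domain_a": 300.0, "domain_b": 230.0}

eta_one = task_eta_days(HARD_QTY, rates)
eta_two = task_eta_days(HARD_QTY, diluted(rates, 2))

print(f"one task:  {eta_one:.1f} days, fits: {eta_one <= HARD_DEADLINE}")
print(f"two tasks: {eta_two:.1f} days each, fits: {eta_two <= HARD_DEADLINE}")
```

With so little slack on a single task, any extra dispatch pushes the ETA well past the deadline, so an agent would want to run this kind of check before accepting new work.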
Nightmare
Key Question: Can the agent sustain perfect play for an entire year?
- Starting runway: ~4.8 months with 10 employees
- Monthly payroll: ~$52K initially, grows 30–50% over the year
- Mode task: 2 domains × 2500 units, 11-day deadline
- Revenue at prestige-1: ~22K/month)
- Revenue at prestige-5: ~$114K (now profitable)
- The race: Climb to prestige 5 before month 5 or die
- Can the agent survive a 4.8-month clock to profitability?
- Does it plan a prestige climb path across 2–3 domains?
- Can it handle 3-domain assignments without throughput collapse?
- Does it account for salary growth in long-term planning?
- Can it resist every temptation to over-accept?
Specifying a Preset
Use the --config flag to specify a preset:
- tutorial
- easy
- medium
- hard
- nightmare
- default (the 3-year hardened benchmark)
Preset Inheritance
All presets use extends = "default" and override only specific parameters. This means:
- Every parameter not explicitly overridden inherits from default.toml
- Parameters like num_employees, num_market_tasks, and salary tier distributions are consistent across presets
- You can inspect the full effective configuration by examining both files
The tutorial preset, for example, overrides only 13 parameters and inherits 50+ others from default.
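
The inheritance rule can be illustrated with a small merge sketch. It is not the benchmark's loader: `resolve_preset` is a hypothetical helper, the keys `starting_funds` and `horizon_years` are placeholder names (only `num_employees` is named on this page), and the real configuration files may be nested, whereas this sketch merges a flat dictionary. The values are taken from the comparison table.

```python
def resolve_preset(name, presets):
    """Return the effective config for a preset by following 'extends'."""
    config = dict(presets[name])          # copy so the input is not mutated
    parent = config.pop("extends", None)
    if parent is None:
        return config                     # base preset: nothing to inherit
    effective = resolve_preset(parent, presets)
    effective.update(config)              # child overrides win over the parent
    return effective


# Flat stand-ins for default.toml and the tutorial preset (key names assumed).
presets = {
    "default": {"num_employees": 10, "starting_funds": 150_000, "horizon_years": 3},
    "tutorial": {"extends": "default", "starting_funds": 250_000, "horizon_years": 1},
}

effective = resolve_preset("tutorial", presets)
print(effective)
```

Here `num_employees` is inherited untouched while the two overridden keys take the tutorial values, mirroring how a preset changes only what it lists explicitly.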
Next Steps
Parameters
Complete reference of all tunable parameters in default.toml
Tuning
Learn how to create your own custom presets