Invocation
The --auto flag skips all interactive round gates and runs to max_rounds. Use it for unattended runs. There is no cost circuit-breaker in --auto mode, so set an appropriate max_rounds before starting.
Model tier strategy
Three tiers balance cost and quality. The coordinator (main session) always handles synthesis and gap detection; these are never delegated.
| Tier | Model | Used for | Est. cost / agent |
|---|---|---|---|
| Scout | haiku | Depth ≥ 2 directions, low priority, low-stakes verification | ~$0.05 |
| Researcher | sonnet | Depth 0–1 high/medium, all seed directions | ~$0.30–0.60 |
| Deep Dive | opus | Re-exploration only when exhaustion_score ≤ 2 | ~$3–5 |
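The tier table can be read as a small decision function. The sketch below is one possible reading, not the skill's actual selection logic: the `Direction` fields are hypothetical, seed directions are assumed to sit at depth 0, and the Scout row is interpreted as "anything that does not qualify for the other two tiers".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Direction:
    # Hypothetical fields, chosen for illustration only.
    depth: int                              # 0 is assumed to mean a seed direction
    priority: str                           # "high", "medium", or "low"
    times_explored: int = 0                 # completed explorations so far
    exhaustion_score: Optional[int] = None  # set after a first exploration


def select_tier(d: Direction) -> str:
    """Return a model name for a direction, following the tier table."""
    # Deep Dive: re-exploration only, and only when exhaustion_score <= 2.
    if (d.times_explored > 0
            and d.exhaustion_score is not None
            and d.exhaustion_score <= 2):
        return "opus"
    # Researcher: all seed directions, plus depth 0-1 at high/medium priority.
    if d.depth == 0 or (d.depth <= 1 and d.priority in ("high", "medium")):
        return "sonnet"
    # Scout: the remainder (depth >= 2, or low priority).
    return "haiku"
```

Returning the tier explicitly for every direction means no agent is left to fall back on a default model.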
Pre-run scope declaration
Before any agents are spawned, the skill shows you a scope declaration and waits for your confirmation.
max_rounds is a soft gate: when it is reached with a non-empty frontier, the skill prompts you to extend. You can always add rounds. Only --auto converts it to a hard stop. The absolute ceiling is max_rounds × 3.
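The interaction between the soft gate, --auto, and the absolute ceiling can be sketched as a single function. This is an illustration under stated assumptions: the function and parameter names are hypothetical, and the ceiling is assumed to be computed from the max_rounds value declared at the start of the run rather than from any later extension.

```python
def round_gate(rounds_done: int, max_rounds: int, initial_max_rounds: int,
               frontier_size: int, auto: bool) -> str:
    """Decide what happens before the next round starts.

    Returns "continue", "prompt_extend", or "stop".
    """
    # Assumption: the ceiling is based on the initially declared max_rounds.
    ceiling = initial_max_rounds * 3
    if frontier_size == 0:
        return "stop"            # nothing left to explore
    if rounds_done >= ceiling:
        return "stop"            # absolute ceiling, cannot be extended
    if rounds_done < max_rounds:
        return "continue"        # still within the current budget
    # Budget reached with a non-empty frontier.
    return "stop" if auto else "prompt_extend"
```

For example, `round_gate(5, 5, 5, 3, auto=False)` asks the user to extend, while the same call with `auto=True` stops.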
Recommendation formula:
Workflow
Phase 0: Seed validation
Before any directions are generated, three checks fire in sequence:
- Safety check — if the seed requests harmful or illegal research, refuse immediately.
- Ambiguity check — if the seed has multiple plausible interpretations, confirm which one to use before proceeding.
- Input validation — if the seed is too thin (a single proper noun without context), ask for more scope.
Phase 1: Seed expansion
Generates 2–4 directions per applicable dimension plus cross-dimensional intersections, up to a maximum of 25 initial directions.
Assess which dimensions from WHO / WHAT / HOW / WHERE / WHEN / WHY / LIMITS are applicable using the multi-context table:
| Dimension | Historical/social | Technical/scientific | Policy |
|---|---|---|---|
| WHO | Key people, institutions | Research groups, standards bodies | Agencies, legislators |
| WHAT | Events, phenomena | Techniques, architectures | Policies, regulations |
| HOW | Mechanisms, causation | Algorithms, protocols | Enforcement, incentives |
| WHERE | Geography, settings | Deployment environments | Jurisdictions |
| WHEN | Chronology, sequence | Maturity level, adoption windows | Legislative calendar |
| WHY | Motivations, drivers | Tradeoffs, design constraints | Political economy |
| LIMITS | Constraints, boundaries | Theoretical bounds, known failures | Enforcement gaps |
- 0 applicable dimensions → error; ask user to clarify
- 1–2 applicable dimensions → warn user; ask to confirm before proceeding
- 3+ applicable dimensions → proceed
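The three thresholds above, together with the cap on initial directions, can be expressed as a short gate. This is a minimal sketch: the function names and the returned strings are hypothetical, and how the skill decides which dimensions apply is not modelled here.

```python
DIMENSIONS = ("WHO", "WHAT", "HOW", "WHERE", "WHEN", "WHY", "LIMITS")
MAX_INITIAL_DIRECTIONS = 25


def dimension_gate(applicable: list[str]) -> str:
    """Map the number of applicable dimensions to an action."""
    unknown = set(applicable) - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    count = len(set(applicable))
    if count == 0:
        return "error"    # ask the user to clarify the seed
    if count <= 2:
        return "warn"     # ask the user to confirm before proceeding
    return "proceed"


def cap_directions(directions: list[str]) -> list[str]:
    """Keep at most MAX_INITIAL_DIRECTIONS, preserving the given order."""
    return directions[:MAX_INITIAL_DIRECTIONS]
```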
max_rounds input.
Phase 2: Initialize state
Creates deep-research-state.json and deep-research-findings/ in the current working directory. Writes a lock file deep-research-{run_id}.lock before spawning any agents.
Phase 3: Research rounds
Each round fires a prospective gate before any agents are spawned.
Per round:
- Pop up to 6 highest-priority directions from the frontier
- Select model tier for each direction
- Spawn agents in parallel with an 8-minute timeout
- Collect all new directions from completed agents before deduplication
- Apply dedup against the stable pre-round frontier snapshot
- Update the coordinator summary
- Run round-level dimension re-assessment (corrects cold-start errors)
- Increment round counter
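The selection and deduplication steps in the list above can be sketched as follows. This is an illustration, not the skill's implementation: the frontier is assumed to be a list of `(priority, direction)` pairs where a lower number means higher priority, and `normalize` is a hypothetical stand-in for the deduplication contract described in STATE.md.

```python
import heapq

BATCH_SIZE = 6  # "up to 6 highest-priority directions" per round


def normalize(text: str) -> str:
    """Hypothetical dedup key: lowercase with collapsed whitespace."""
    return " ".join(text.lower().split())


def pop_batch(frontier: list[tuple[int, str]]) -> list[str]:
    """Remove and return up to BATCH_SIZE directions, best priority first."""
    heapq.heapify(frontier)
    batch = []
    while frontier and len(batch) < BATCH_SIZE:
        _, direction = heapq.heappop(frontier)
        batch.append(direction)
    return batch


def dedup_new_directions(snapshot: set[str], collected: list[str]) -> list[str]:
    """Filter agent-reported directions against the pre-round snapshot.

    All directions are collected first, then compared against a snapshot
    taken before the round started, so the result does not depend on the
    order in which agents finished.
    """
    seen = {normalize(s) for s in snapshot}
    kept = []
    for direction in collected:
        key = normalize(direction)
        if key not in seen:
            seen.add(key)
            kept.append(direction)
    return kept
```

Taking the snapshot before `pop_batch` runs keeps the directions being explored in the current round inside the comparison set, so agents cannot re-add them.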
Agents that exceed the timeout are marked timed_out and are not re-queued.
Phase 4: Fact verification
After the final research round, before synthesis:
- Claim extraction — identify the top N significant factual claims (N = min(20, total)). Risk-stratified sampling prioritises: single-source primary → numerical/statistical → contested → corroboration candidates.
- Citation spot-check — fetch each sampled URL; confirm the attributed claim appears in the source text. For numerical claims: compare exact numbers — semantic similarity is not accepted.
- Corroboration independence check — for claims cited by 3+ agents, verify the sources are from different organisations, dates, and methodologies.
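The claim-extraction step can be illustrated with a small sampler. This is a sketch under stated assumptions: each claim is a dict with a hypothetical `risk` field, and the category names are invented labels for the four priority classes named above.

```python
# Hypothetical labels for the risk classes, in descending sampling priority.
RISK_ORDER = [
    "single_source_primary",
    "numerical",
    "contested",
    "corroboration_candidate",
]
MAX_SAMPLE = 20


def sample_claims(claims: list[dict]) -> list[dict]:
    """Pick N = min(20, total) claims, highest-risk classes first."""
    n = min(MAX_SAMPLE, len(claims))
    rank = {name: i for i, name in enumerate(RISK_ORDER)}
    # sorted() is stable, so claims keep their original order within a class.
    # Unknown risk labels sort after all known classes.
    ordered = sorted(claims, key=lambda c: rank.get(c["risk"], len(RISK_ORDER)))
    return ordered[:n]
```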
Phase 5: Synthesize
Three-pass synthesis:
- Mini-syntheses: each agent writes a mini-synthesis in its findings file.
- Theme extraction: coordinator reads mini-syntheses only (not raw findings). A theme is valid only if it requires findings from 2+ distinct dimensions.
- Final report: writes deep-research-report.md.
If you accept, /deep-qa audits the report for citation accuracy, logical consistency, coverage gaps, and counter-evidence gaps.
Phase 6: Termination check
The run terminates when any of these is true (first condition wins):
- User stops: you chose N at a round gate
- Coverage plateau: no new dimensions for 3 consecutive rounds AND all frontier items have exhaustion ≥ 4
- Budget soft gate: max_rounds reached with non-empty frontier → you are prompted to extend
- Frontier empties: all directions explored (possible since direction reporting is optional)
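Because the first matching condition wins, the check can be written as an ordered chain. The sketch below is illustrative only: the parameter names and returned labels are hypothetical (the real termination vocabulary is defined elsewhere), and the plateau condition is assumed to apply only to a non-empty frontier so that an empty frontier is reported as such.

```python
from typing import Optional


def termination_label(user_stopped: bool,
                      rounds_without_new_dimension: int,
                      frontier_exhaustion: list[int],
                      rounds_done: int,
                      max_rounds: int) -> Optional[str]:
    """Return the first termination condition that holds, or None.

    frontier_exhaustion holds one exhaustion score per frontier item.
    """
    if user_stopped:
        return "user_stopped"
    if (rounds_without_new_dimension >= 3
            and frontier_exhaustion
            and all(score >= 4 for score in frontier_exhaustion)):
        return "coverage_plateau"
    if rounds_done >= max_rounds and frontier_exhaustion:
        return "budget_soft_gate"   # the user is prompted to extend
    if not frontier_exhaustion:
        return "frontier_empty"
    return None                     # keep going
```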
Self-review checklist
Before delivering output, verify all of the following:
- State file is valid JSON after every round
- No direction has status in_progress after a round completes
- Every findings file has: Findings, Source Table, Mini-Synthesis, New Directions (or “terminal node”), and Exhaustion Assessment
- No direction explored more than 2 times
- Prospective gate was shown before each round (or --auto was set)
- Coordinator summary updated each round in structured format, not freeform
- Fact verification ran before final synthesis
- Final report includes Spot-Check Sample Results section with explicit limitations
- Final report uses a termination label from the defined vocabulary
- Two separate confidence scores in the report: Coverage % and Evidence Quality
- Model tier correctly selected for each agent
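Several of these checks are mechanical and could be automated. The sketch below covers three of them (valid JSON, no leftover in_progress status, no direction explored more than twice). The state layout it assumes, a top-level `directions` list with `id`, `status`, and `times_explored` fields, is hypothetical; the actual schema is described in STATE.md.

```python
import json


def check_state(raw: str) -> list[str]:
    """Return a list of problems found in the state file text."""
    try:
        state = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"state file is not valid JSON: {exc}"]
    if not isinstance(state, dict):
        return ["state file must contain a JSON object"]

    problems = []
    # Assumed layout: {"directions": [{"id", "status", "times_explored"}, ...]}
    for d in state.get("directions", []):
        name = d.get("id", "<unknown>")
        if d.get("status") == "in_progress":
            problems.append(f"{name}: still in_progress after round")
        if d.get("times_explored", 0) > 2:
            problems.append(f"{name}: explored more than 2 times")
    return problems
```

An empty list means the three checks passed; the remaining checklist items still need to be reviewed by hand.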
Golden rules
1. Check state before spawning
Never spawn an agent without reading the state file first. Deduplicate every direction before adding it to the frontier.
2. Direction reporting is optional
A terminal node is valid output. Do NOT force agents to invent new directions to fill the slot.
3. Frontier is priority-ordered
Always pop the highest-priority direction first. Agent-discovered (child) directions receive a +2 depth bonus over same-tier siblings at the same depth level.
4. Two explorations maximum
Each direction can be explored at most twice. A third attempt is skipped without re-queuing.
5. Prospective gate fires before spend
Never spawn agents without showing the user a cost estimate first, unless --auto is set.
6. Coordinator context is bounded
Never accumulate raw findings in the coordinator. Use the structured coordinator summary to keep context size predictable.
7. Every finding needs a source
Web search URLs are required for every claim. Training-data-only findings are not accepted.
8. Always specify model tier explicitly
Never let agents default to a tier. Unintentional Opus usage is the primary source of cost spirals.
9. Verify numerics manually
Flag all numerical claims in the spot-check. LLM number verification is unreliable — exact comparison is required.
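Exact comparison can be done with plain string matching on the extracted numbers, without asking a model to judge them. The following is a minimal sketch, not the skill's verification code: the function names are hypothetical, the regular expression handles only unsigned decimal numbers with optional thousands separators and a trailing percent sign, and it deliberately treats "12%" and "12 percent" as different.

```python
import re

# Unsigned numbers such as 1200, 1,200, 12.5, or 12.5%.
_NUMBER = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?%?")


def numbers_in(text: str) -> set[str]:
    """Extract numbers as strings, with thousands separators removed."""
    return {m.group().replace(",", "") for m in _NUMBER.finditer(text)}


def numeric_claim_supported(claim: str, source_text: str) -> bool:
    """True only if every number in the claim appears exactly in the source."""
    claimed = numbers_in(claim)
    return bool(claimed) and claimed <= numbers_in(source_text)
```

A claim of "12.5%" checked against a source that says "12.4%" fails here, whereas a similarity-based comparison would likely accept it.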
Reference files
| File | Contents |
|---|---|
| DFS.md | Dimension discovery, cross-product expansion, exhaustion map, frontier priority ordering, termination logic |
| STATE.md | State file schema, direction schema, deduplication contract |
| SYNTHESIS.md | Fact verification protocol, coordinator summary format, final report structure |
| FORMAT.md | Final report format, coverage report, spot-check results section |