
Working with workflows

Claude Octopus structures work using the Double Diamond methodology—four phases that move from divergent exploration to convergent delivery. This guide covers workflow progression, quality gates, and choosing the right workflow.

The Double Diamond

Adapted from the UK Design Council’s framework, the Double Diamond ensures quality through structured phases:
   DISCOVER      DEFINE      DEVELOP      DELIVER
  (diverge)    (converge)   (diverge)   (converge)
     Probe        Grasp       Tangle        Ink

  Research → Requirements →  Build  →  Validate

Four phases

Discover (Probe)

Purpose: Divergent research and exploration
Activities:
  • Multi-provider research (Codex + Gemini + Claude)
  • Broad ecosystem analysis
  • Technology comparison
  • Best practices research
  • Community insights
Output: Research synthesis document
Command: /octo:discover or /octo:probe
Visual indicator: 🐙 🔍

Define (Grasp)

Purpose: Convergent consensus building
Activities:
  • Synthesize research findings
  • Build consensus on approach
  • Define requirements clearly
  • Identify constraints
  • Establish success criteria
Output: Consensus document with requirements
Command: /octo:define or /octo:grasp
Visual indicator: 🐙 🎯

Develop (Tangle)

Purpose: Divergent implementation
Activities:
  • Multi-provider code generation
  • Implementation with quality gates
  • Testing and validation
  • Security review
  • Performance optimization
Output: Implementation with validation report
Command: /octo:develop or /octo:tangle
Visual indicator: 🐙 🛠️

Deliver (Ink)

Purpose: Convergent final validation
Activities:
  • Quality assurance
  • Final synthesis
  • Documentation
  • Delivery certification
  • User acceptance
Output: Final delivery document
Command: /octo:deliver or /octo:ink
Visual indicator: 🐙 ✅

Running individual phases vs full workflows

Individual phases

Run phases individually for maximum control:
# Run just research
/octo:discover OAuth authentication patterns

# Run just definition
/octo:define requirements for OAuth implementation

# Run just implementation
/octo:develop OAuth authentication system

# Run just validation
/octo:deliver OAuth implementation
When to use:
  • You want to review output before proceeding
  • Requirements may change between phases
  • High-stakes features requiring oversight at each step
  • Learning or experimenting with the methodology
Example workflow:
# 1. Research first
/octo:discover caching strategies for high-traffic APIs
# → Review synthesis, identify Redis as top candidate

# 2. Define requirements
/octo:define Redis caching layer requirements
# → Review consensus, adjust constraints

# 3. Implement
/octo:develop Redis caching layer
# → Review implementation against requirements

# 4. Validate
/octo:deliver Redis caching implementation
# → Final go/no-go decision

Full workflow (Embrace)

Run all 4 phases automatically:
/octo:embrace build user authentication system
What happens:
  1. Discover: Multi-provider research
  2. Define: Consensus building on approach
  3. Develop: Implementation with quality gates
  4. Deliver: Final validation and review
When to use:
  • Clear requirements from the start
  • Trusted, well-understood features
  • Autonomous mode enabled (see below)
  • You want end-to-end workflow without interruptions

Autonomy modes

Configure how much oversight you want during embrace workflows:
Supervised mode: approval required after each phase
  • Maximum control and oversight
  • Review synthesis before proceeding to next phase
  • Best for critical features or learning
Example:
/octo:embrace build payment processing
# Pauses after Discover for approval
# → Review research synthesis
# → Approve to proceed to Define
# Pauses after Define for approval
# → Review consensus
# → Approve to proceed to Develop
# ...

Choosing the right workflow

Claude Octopus provides specialized workflows beyond the Double Diamond phases.

Workflow decision tree

Need broad research on a topic?
Use: /octo:research or /octo:discover
What you get:
  • Multi-AI research (Codex + Gemini + Claude)
  • Comprehensive analysis of options
  • Trade-off evaluation
  • Best practice identification
Example:
/octo:research microservices patterns
Deciding between competing approaches?
Use: /octo:debate
What you get:
  • Structured three-way AI debate
  • Technical perspective (Codex)
  • Ecosystem perspective (Gemini)
  • Moderator and synthesis (Claude)
  • Consensus score
Example:
/octo:debate Redis vs DynamoDB for session storage
Building a full feature end to end?
Use: /octo:embrace
What you get:
  • Full 4-phase workflow
  • Quality gates between phases
  • Multi-AI perspectives throughout
  • Configurable autonomy
Example:
/octo:embrace build payment processing
Reviewing existing code?
Use: /octo:review
What you get:
  • Multi-AI code review
  • Security vulnerability detection
  • 4-dimension scoring (correctness, security, performance, maintainability)
  • Best practices enforcement
Example:
/octo:review src/auth.ts
Developing test-first?
Use: /octo:tdd
What you get:
  • Red-green-refactor discipline
  • Tests written before implementation
  • Incremental feature development
  • Continuous validation
Example:
/octo:tdd create user registration
Auditing for security issues?
Use: /octo:security
What you get:
  • OWASP Top 10 vulnerability scanning
  • Authentication/authorization review
  • Input validation checks
  • Red team analysis
Example:
/octo:security src/api/
Turning a spec into working software?
Use: /octo:factory
What you get:
  • Autonomous spec-to-software pipeline
  • Holdout testing (80/20 split)
  • Satisfaction scoring
  • PASS/WARN/FAIL verdict
Example:
/octo:factory "build a CLI that converts CSV to JSON"
Tracking down a bug?
Use: /octo:debug
What you get:
  • Systematic debugging
  • Evidence gathering
  • Root cause identification
  • Fix with verification
Example:
/octo:debug failing test in auth.spec.ts
Making a small, quick change?
Use: /octo:quick
What you get:
  • Lightweight, single-phase execution
  • No multi-AI overhead
  • Fast results
Example:
/octo:quick add logging to auth.ts

Workflow progression and quality gates

Quality gates ensure sloppy work doesn’t advance to the next phase.

Quality gate thresholds

Discover

Gate: All providers responded successfully
Checks:
  • Codex CLI returned valid synthesis
  • Gemini CLI returned valid synthesis
  • Claude synthesis completed
Failure action:
  • Retry with timeout increase
  • Proceed with available providers
  • User review in semi-autonomous mode

Define

Gate: Consensus achieved (75%+ agreement)
Checks:
  • Requirements clearly defined
  • Constraints identified
  • Success criteria established
  • 75% consensus across providers
Failure action:
  • Re-run define with clarifying questions
  • User review required

Develop

Gate: Security, performance, and best practices validated
Checks:
  • No critical security issues
  • Performance within acceptable range
  • Best practices followed
  • Tests written and passing
Failure action:
  • Remediation with context from validation report
  • Re-run develop phase
  • User review in semi-autonomous mode

Deliver

Gate: Final quality certification passed
Checks:
  • All acceptance criteria met
  • No blocking issues
  • Documentation complete
  • Go/no-go recommendation
Failure action:
  • Provide detailed failure report
  • Suggest remediation steps
  • User decision on next steps
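The gate-then-failure-action pattern described above can be sketched as a simple retry loop. This is a hypothetical sketch of the pattern, not Claude Octopus's implementation; the `Phase` structure and names are invented for illustration.

```python
# Hypothetical sketch of the quality-gate pattern: run a phase, check its
# gate, retry on failure, and stop the workflow when retries are exhausted.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Phase:
    name: str
    run: Callable[[], dict]       # produces the phase's output
    gate: Callable[[dict], bool]  # quality-gate predicate on that output
    max_retries: int = 1

def run_workflow(phases: list[Phase]) -> bool:
    for phase in phases:
        for _ in range(phase.max_retries + 1):
            output = phase.run()
            if phase.gate(output):
                break              # gate passed, advance to the next phase
        else:
            # Retries exhausted: stop and hand control back to the user.
            print(f"{phase.name}: quality gate failed")
            return False
    return True
```

For example, a define phase whose gate requires 75%+ consensus would be `Phase("define", run_define, lambda out: out["consensus"] >= 0.75)`.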

75% consensus threshold

The define (grasp) phase requires 75% consensus across AI providers before advancing to develop.
Example:
Codex: Recommends JWT-based authentication (confidence: 85%)
Gemini: Recommends OAuth 2.0 with PKCE (confidence: 90%)
Claude: Synthesizes to OAuth 2.0 with JWT tokens (confidence: 80%)

Consensus score: 78% ✓
→ Quality gate passed, proceed to deliver
If consensus < 75%:
Codex: Recommends Redis for caching (confidence: 60%)
Gemini: Recommends Memcached for caching (confidence: 65%)
Claude: Unable to synthesize clear recommendation

Consensus score: 62% ✗
→ Quality gate failed, re-run define phase with clarification
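As a simplified stand-in for this gate, a consensus score could be computed from pairwise agreement between provider recommendations, weighted by each provider's confidence. The real scoring compares recommendations semantically during synthesis; this sketch only counts exact matches, and all names are illustrative.

```python
# Illustrative stand-in for the 75% consensus gate; Claude Octopus's
# actual scoring formula is not documented here.
CONSENSUS_THRESHOLD = 0.75

def consensus_score(votes: dict[str, tuple[str, float]]) -> float:
    """votes maps provider name -> (recommendation, confidence in [0, 1])."""
    providers = list(votes)
    pairs, agreement = 0, 0.0
    for i, a in enumerate(providers):
        for b in providers[i + 1:]:
            pairs += 1
            if votes[a][0] == votes[b][0]:
                # An agreeing pair contributes the mean of its confidences.
                agreement += (votes[a][1] + votes[b][1]) / 2
    return agreement / pairs if pairs else 0.0

votes = {"codex": ("oauth2", 0.85), "gemini": ("oauth2", 0.90),
         "claude": ("oauth2", 0.80)}
score = consensus_score(votes)
print(f"{score:.0%}")  # → 85%
print("gate passed" if score >= CONSENSUS_THRESHOLD else "gate failed")
```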

Examples from real use cases

Use case 1: API authentication research

Goal: Research OAuth 2.0 vs JWT authentication for a new API
Workflow:
# 1. Research both options
/octo:discover OAuth 2.0 vs JWT authentication patterns
Output:
  • Codex analysis: Technical implementation details, security considerations
  • Gemini analysis: Ecosystem adoption, library support, community insights
  • Claude synthesis: Comparison table, recommendations based on use case
Outcome: Team chose OAuth 2.0 with JWT access tokens based on multi-AI synthesis

Use case 2: End-to-end feature development

Goal: Build a complete user authentication system from research to delivery
Workflow:
# Run full lifecycle in supervised mode
/octo:embrace build user authentication system
Progression:
  1. Discover (research): OAuth patterns, JWT, session management → Approved
  2. Define (consensus): OAuth 2.0 + JWT + refresh tokens → 82% consensus → Approved
  3. Develop (implementation): Auth endpoints, token generation, validation → Security validated → Approved
  4. Deliver (validation): Code review passed, security scan clean → Go recommendation → Shipped
Quality gates:
  • All 4 gates passed
  • No security issues found
  • Performance within acceptable range (< 200ms token validation)

Use case 3: Architectural decision debate

Goal: Decide between monorepo and microservices architecture
Workflow:
# 3-round adversarial debate
/octo:debate -r 3 -d adversarial monorepo vs microservices
Debate structure:
  • Round 1: Opening arguments (Codex: microservices, Gemini: monorepo, Claude: moderates)
  • Round 2: Rebuttals and counterarguments
  • Round 3: Final synthesis and consensus
Outcome: 68% consensus for monorepo with future migration path to microservices

Use case 4: Security audit and remediation

Goal: Audit the authentication module for OWASP vulnerabilities
Workflow:
# 1. Security scan
/octo:security src/auth/

# 2. Code review for remediation
/octo:review src/auth/ --focus security

# 3. Validate fixes
/octo:deliver src/auth/
Findings:
  • 2 critical issues (JWT secret hardcoded, no rate limiting)
  • 3 medium issues (weak password validation, missing CSRF protection)
  • Remediation applied with adversarial review
  • Final scan: 0 critical issues

Use case 5: Spec-to-software pipeline

Goal: Build a CLI tool from a specification with holdout testing
Workflow:
# Dark Factory mode with custom satisfaction target
/octo:factory --spec ./specs/csv-to-json-cli.md --satisfaction-target 0.90
Pipeline execution:
  1. Parse spec → extracted 12 behaviors
  2. Generate scenarios → 30 test scenarios created
  3. Split holdout → 24 training, 6 blind scenarios
  4. Embrace workflow → full implementation
  5. Holdout tests → 5/6 passed (83% holdout accuracy)
  6. Score satisfaction → 0.87 composite score
  7. Report → WARN verdict (below 0.90 target)
Remediation:
  • Reviewed failed holdout scenario
  • Re-ran factory with refined spec
  • Second run: 0.92 composite score → PASS
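The 80/20 holdout split in step 3 can be sketched as follows. `holdout_split` and its defaults are illustrative names under assumed behavior (shuffle, then hold back a fraction as blind scenarios), not the Dark Factory's actual API.

```python
# Illustrative sketch of the 80/20 holdout split described above; the
# actual Dark Factory pipeline may differ.
import random

def holdout_split(scenarios: list[str], holdout_frac: float = 0.20,
                  seed: int = 0) -> tuple[list[str], list[str]]:
    """Shuffle scenarios and hold back holdout_frac as blind tests."""
    rng = random.Random(seed)      # fixed seed for reproducibility
    shuffled = scenarios[:]
    rng.shuffle(shuffled)
    n_holdout = max(1, round(len(shuffled) * holdout_frac))
    return shuffled[n_holdout:], shuffled[:n_holdout]

scenarios = [f"scenario-{i}" for i in range(30)]
training, holdout = holdout_split(scenarios)
print(len(training), len(holdout))  # → 24 6
```

The holdout scenarios never inform implementation, so passing them (5/6 in the run above) measures how well the build generalizes beyond the training scenarios.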

Next steps

Using commands

Learn command structure, the smart router, and command composition

Configuring providers

Set up Codex, Gemini, and configure provider selection

Configuration

Environment variables, autonomy modes, and custom hooks

Architecture

Deep dive into the Double Diamond methodology
