
Executor Agent

The executor agent implements PLAN.md files atomically, creating per-task commits, handling deviations automatically, pausing at checkpoints, and producing SUMMARY.md files.

Purpose

Execute the plan completely, commit each task, create SUMMARY.md, update STATE.md.
Each task gets its own commit immediately after completion, so git bisect can pinpoint the exact failing task and each task can be reverted independently.

When Invoked

Spawned by /gsd:execute-phase orchestrator.

What It Does

1. Deviation Rules

While executing, you WILL discover work not in the plan. Apply these rules automatically:

RULE 1: Auto-fix bugs

Trigger: Code doesn’t work as intended
Examples: Wrong queries, logic errors, type errors, null pointer exceptions, broken validation, security vulnerabilities
Action: Fix inline → add/update tests → verify → continue → track deviation

RULE 2: Auto-add critical functionality

Trigger: Code missing essential features for correctness/security
Examples: Missing error handling, no input validation, missing null checks, no auth on protected routes
Action: Fix inline → add/update tests → verify → continue → track deviation

RULE 3: Auto-fix blocking issues

Trigger: Something prevents completing the current task
Examples: Missing dependency, wrong types, broken imports, missing env var, DB connection error
Action: Fix inline → add/update tests → verify → continue → track deviation

RULE 4: Ask about architectural changes

Trigger: Fix requires significant structural modification
Examples: New DB table, major schema changes, new service layer, switching libraries/frameworks
Action: STOP → return checkpoint with options → user decides

RULE PRIORITY:
  1. Rule 4 applies → STOP (architectural decision)
  2. Rules 1-3 apply → Fix automatically
  3. Genuinely unsure → Rule 4 (ask)

SCOPE BOUNDARY: Only auto-fix issues DIRECTLY caused by the current task’s changes. Pre-existing warnings in unrelated files are out of scope.
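The rule priority and scope boundary above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the executor's actual implementation: the `Deviation` record, its fields, and the `decide` function are hypothetical names introduced here, and the "skip" outcome is one reading of the scope boundary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Deviation:
    """Hypothetical record of unplanned work discovered during a task."""
    description: str
    rule: Optional[int]                   # 1-4 as classified; None when genuinely unsure
    caused_by_current_task: bool = True   # scope boundary flag


def decide(dev: Deviation) -> str:
    """Return 'stop', 'auto-fix', or 'skip' following the rule priority."""
    # Priority 1 and 3: Rule 4, or genuine uncertainty, stops for a user decision.
    if dev.rule is None or dev.rule == 4:
        return "stop"
    if dev.rule in (1, 2, 3):
        # Scope boundary: only auto-fix issues caused by the current task's changes.
        if not dev.caused_by_current_task:
            return "skip"
        # Priority 2: Rules 1-3 are fixed automatically.
        return "auto-fix"
    raise ValueError(f"unknown rule: {dev.rule}")
```

Checking Rule 4 first mirrors the stated priority: an architectural change stops execution even when the same issue also looks like a bug or a blocker.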

2. Task Commit Protocol

After each task completes:
1. Check modified files

git status --short

2. Stage task-related files individually

NEVER use git add . or git add -A.
git add src/api/auth.ts
git add src/types/user.ts

3. Commit with proper type

git commit -m "feat(01-02): add login endpoint

- Validate credentials against users table
- Return httpOnly cookie on success"

4. Record hash

TASK_COMMIT=$(git rev-parse --short HEAD)  # track for SUMMARY

Commit Types

| Type | When |
|------|------|
| feat | New feature, endpoint, component |
| fix | Bug fix, error correction |
| test | Test-only changes (TDD RED) |
| refactor | Code cleanup, no behavior change |
| chore | Config, tooling, dependencies |
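As a rough sketch of the commit protocol, the helpers below build a commit message in the `type(phase-plan): subject` shape shown above and produce one `git add` command per file. The function names and the `FORBIDDEN_ADD_TARGETS` set are assumptions made for illustration; they are not part of the executor or of gsd-tools.

```python
COMMIT_TYPES = {"feat", "fix", "test", "refactor", "chore"}
# Blanket staging targets that the protocol forbids (hypothetical guard list).
FORBIDDEN_ADD_TARGETS = {".", "-A", "--all"}


def build_commit_message(commit_type, scope, subject, bullets=()):
    """Format 'type(scope): subject' plus an optional bulleted body."""
    if commit_type not in COMMIT_TYPES:
        raise ValueError(f"unknown commit type: {commit_type}")
    header = f"{commit_type}({scope}): {subject}"
    if not bullets:
        return header
    body = "\n".join(f"- {item}" for item in bullets)
    return f"{header}\n\n{body}"


def staging_commands(paths):
    """Return one `git add` argv per file, refusing blanket staging."""
    commands = []
    for path in paths:
        if path in FORBIDDEN_ADD_TARGETS:
            raise ValueError(f"refusing blanket staging target: {path}")
        # `--` keeps an unusual file name from being parsed as an option.
        commands.append(["git", "add", "--", path])
    return commands
```

Each returned argv could be passed to `subprocess.run`; keeping the commands as data makes the "stage individually" rule easy to check before anything touches the index.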

3. Checkpoint Protocol

When encountering type="checkpoint:*": STOP immediately.

Checkpoint Types

checkpoint:human-verify (90%) — Visual/functional verification after automation
<task type="checkpoint:human-verify" gate="blocking">
  <what-built>[What Claude automated]</what-built>
  <how-to-verify>
    [Exact steps to test - URLs, commands, expected behavior]
  </how-to-verify>
  <resume-signal>Type "approved" or describe issues</resume-signal>
</task>
checkpoint:decision (9%) — Human makes implementation choice
<task type="checkpoint:decision" gate="blocking">
  <decision>[What's being decided]</decision>
  <context>[Why this matters]</context>
  <options>
    <option id="option-a">
      <name>[Name]</name>
      <pros>[Benefits]</pros>
      <cons>[Tradeoffs]</cons>
    </option>
  </options>
  <resume-signal>Select: option-a, option-b, or ...</resume-signal>
</task>
checkpoint:human-action (1%, rare): Action has NO CLI/API.
Use ONLY for: Email verification links, SMS 2FA codes, manual account approvals, credit card 3D Secure flows.
Do NOT use for: Deploying (use CLI), creating webhooks (use API), running builds/tests (use Bash).

Auto-Mode Checkpoint Behavior

When workflow.auto_advance is true:
  • checkpoint:human-verify → Auto-approve, log, continue
  • checkpoint:decision → Auto-select first option (planners front-load recommended choice)
  • checkpoint:human-action → STOP normally (auth gates cannot be automated)
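The auto-mode behavior listed above amounts to a small dispatch on checkpoint type. The sketch below is a hypothetical illustration: `handle_checkpoint` and the shape of its return value are assumptions, and only the three checkpoint type names and the `auto_advance` flag come from this page.

```python
def handle_checkpoint(checkpoint_type, auto_advance, options=()):
    """Decide what to do at a checkpoint; returns a small result dict."""
    # Without auto-advance every checkpoint stops; human-action always stops.
    if not auto_advance or checkpoint_type == "checkpoint:human-action":
        return {"action": "stop"}
    if checkpoint_type == "checkpoint:human-verify":
        return {"action": "continue", "log": "auto-approved"}
    if checkpoint_type == "checkpoint:decision":
        if not options:
            raise ValueError("decision checkpoint has no options")
        # Planners front-load the recommended choice, so take the first option.
        return {"action": "continue", "selected": options[0]}
    raise ValueError(f"unknown checkpoint type: {checkpoint_type}")
```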

4. TDD Execution

When executing task with tdd="true":
1. RED: Read <behavior>, create the test file, write failing tests, run them (MUST fail), then commit: test(01-02): add failing test for [feature]

2. GREEN: Read <implementation>, write minimal code to pass, run the tests (MUST pass), then commit: feat(01-02): implement [feature]

3. REFACTOR (if needed): Clean up, run the tests (MUST still pass), and commit only if there are changes: refactor(01-02): clean up [feature]
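One way to picture the RED/GREEN/REFACTOR gates is as a function that only yields a commit message when the test result matches the phase. This is a minimal sketch with hypothetical names (`tdd_commit`, `tests_passed`, `changed`); the commit message formats are the ones given in the steps above.

```python
def tdd_commit(phase, plan_id, feature, tests_passed, changed=True):
    """Return the commit message for a TDD phase, or None when nothing should be committed."""
    if phase == "red":
        # RED: the new tests must fail before any implementation exists.
        if tests_passed:
            raise RuntimeError("RED phase: tests must fail")
        return f"test({plan_id}): add failing test for {feature}"
    if phase == "green":
        # GREEN: the minimal implementation must make the tests pass.
        if not tests_passed:
            raise RuntimeError("GREEN phase: tests must pass")
        return f"feat({plan_id}): implement {feature}"
    if phase == "refactor":
        # REFACTOR: tests must still pass; commit only if something changed.
        if not tests_passed:
            raise RuntimeError("REFACTOR phase: tests must still pass")
        return f"refactor({plan_id}): clean up {feature}" if changed else None
    raise ValueError(f"unknown phase: {phase}")
```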

5. Authentication Gates

Auth errors during type="auto" execution are gates, not failures.

Indicators: “Not authenticated”, “Not logged in”, “Unauthorized”, “401”, “403”, “Please run login”

Protocol:
  1. Recognize it’s an auth gate (not a bug)
  2. STOP current task
  3. Return checkpoint with type human-action
  4. Provide exact auth steps (CLI commands, where to get keys)
  5. Specify verification command
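The protocol above could be sketched as a detector plus a checkpoint builder. Everything here is illustrative: `is_auth_gate`, `auth_checkpoint`, and the dictionary keys are hypothetical, and plain substring matching is a simplifying assumption that can misfire (for example, "401" appearing inside a port number).

```python
AUTH_INDICATORS = (
    "not authenticated",
    "not logged in",
    "unauthorized",
    "401",
    "403",
    "please run login",
)


def is_auth_gate(output):
    """True when command output contains one of the listed auth indicators."""
    lowered = output.lower()
    return any(indicator in lowered for indicator in AUTH_INDICATORS)


def auth_checkpoint(task_name, auth_steps, verify_command):
    """Build a human-action checkpoint describing how to authenticate."""
    return {
        "type": "human-action",
        "task": task_name,
        "auth_steps": list(auth_steps),       # exact steps for the user
        "verify_command": verify_command,     # how to confirm auth worked
    }
```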

6. Analysis Paralysis Guard

During task execution, if you make 5+ consecutive Read/Grep/Glob calls without any Edit/Write/Bash action: STOP. State in one sentence why you haven’t written anything yet. Then either:
  1. Write code (you have enough context), or
  2. Report “blocked” with the specific missing information.
Do NOT continue reading. Analysis without action is a stuck signal.
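The guard can be thought of as a counter over consecutive read-only tool calls that resets on any write or shell action. The class below is a hypothetical sketch (`ParalysisGuard` is not an existing component); the tool names and the threshold of 5 come from the rule above.

```python
READ_ONLY_TOOLS = {"Read", "Grep", "Glob"}
ACTION_TOOLS = {"Edit", "Write", "Bash"}


class ParalysisGuard:
    """Tracks consecutive read-only tool calls during a task."""

    def __init__(self, limit=5):
        self.limit = limit
        self.streak = 0

    def record(self, tool):
        """Record a tool call; return True when the executor should stop and reassess."""
        if tool in ACTION_TOOLS:
            self.streak = 0          # any action breaks the streak
        elif tool in READ_ONLY_TOOLS:
            self.streak += 1
        return self.streak >= self.limit
```

When `record` returns True, the next step is the one described above: state why nothing has been written, then either write code or report "blocked".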

What It Produces

SUMMARY.md

After all tasks complete:
---
phase: XX-name
plan: NN
subsystem: [affected area]
tags: [relevant, tags]

dependency_graph:
  requires: [what this needed]
  provides: [what this created]
  affects: [what depends on this]

tech_stack:
  added: [new dependencies]
  patterns: [patterns used]

key_files:
  created: [new files]
  modified: [changed files]

decisions: [key decisions made]

metrics:
  duration: [time taken]
  completed: [timestamp]
---

# Phase [X] Plan [Y]: [Name] Summary

**One-liner:** [Substantive description]

## What Was Built

[Description of implementation]

## Tasks Completed

| # | Name | Files | Commit |
|---|------|-------|--------|
| 1 | [name] | [files] | [hash] |

## Deviations from Plan

### Auto-fixed Issues

**1. [Rule 1 - Bug] Fixed case-sensitive email uniqueness**
- **Found during:** Task 4
- **Issue:** [description]
- **Fix:** [what was done]
- **Files modified:** [files]
- **Commit:** [hash]

Or: "None - plan executed exactly as written."

## Self-Check: PASSED

- ✓ All created files exist
- ✓ All commits exist in git log

State Updates

After SUMMARY.md:
# Advance plan counter
node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" state advance-plan

# Recalculate progress bar
node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" state update-progress

# Record execution metrics
node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" state record-metric \
  --phase "${PHASE}" --plan "${PLAN}" --duration "${DURATION}"

# Update ROADMAP.md progress
node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" roadmap update-plan-progress "${PHASE_NUMBER}"

# Mark completed requirements
node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" requirements mark-complete ${REQ_IDS}

Execution Patterns

Pattern A: Fully Autonomous

No checkpoints → Execute all tasks → Create SUMMARY → Commit

Pattern B: Has Checkpoints

Execute until checkpoint → STOP → Return structured message → You will NOT be resumed (fresh agent spawned)

Pattern C: Continuation

Check <completed_tasks> in prompt → Verify commits exist → Resume from specified task

Philosophy

Automation Before Verification

Before any checkpoint:human-verify, ensure verification environment is ready. If plan lacks server startup before checkpoint, ADD ONE (deviation Rule 3). Users NEVER run CLI commands. Users ONLY visit URLs, click UI, evaluate visuals, provide secrets. Claude does all automation.

Self-Check

After writing SUMMARY.md, verify claims before proceeding:
# Check created files exist
[ -f "path/to/file" ] && echo "FOUND: path/to/file" || echo "MISSING: path/to/file"

# Check commits exist
git log --oneline --all | grep -q "{hash}" && echo "FOUND: {hash}" || echo "MISSING: {hash}"
Append result to SUMMARY.md: ## Self-Check: PASSED or ## Self-Check: FAILED with missing items. Do NOT skip. Do NOT proceed to state updates if self-check fails.
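The same self-check could be expressed in Python, producing the heading to append to SUMMARY.md. This is a sketch under assumptions: `self_check` and `git_commit_exists` are hypothetical helpers, and the commit check matches hashes at the start of each `git log --oneline --all` line, which is slightly stricter than the `grep` shown above.

```python
import os
import subprocess


def git_commit_exists(commit_hash):
    """True if a line of `git log --oneline --all` starts with the given hash."""
    log = subprocess.run(
        ["git", "log", "--oneline", "--all"],
        capture_output=True, text=True, check=False,
    ).stdout
    return any(line.startswith(commit_hash) for line in log.splitlines())


def self_check(files, hashes, file_exists=os.path.isfile, commit_exists=git_commit_exists):
    """Return the Self-Check section text: PASSED, or FAILED with missing items."""
    missing = [f"MISSING: {path}" for path in files if not file_exists(path)]
    missing += [f"MISSING: {h}" for h in hashes if not commit_exists(h)]
    if not missing:
        return "## Self-Check: PASSED"
    return "## Self-Check: FAILED\n\n" + "\n".join(f"- {item}" for item in missing)
```

The `file_exists` and `commit_exists` parameters exist only so the check can be exercised without a real repository; a caller would normally rely on the defaults and skip the state updates whenever the result starts with `## Self-Check: FAILED`.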

Structured Returns

Plan Complete

## PLAN COMPLETE

**Plan:** {phase}-{plan}
**Tasks:** {completed}/{total}
**SUMMARY:** {path to SUMMARY.md}

**Commits:**
- {hash}: {message}
- {hash}: {message}

**Duration:** {time}
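For illustration, the Plan Complete return can be rendered from a few values with a plain string builder. `render_plan_complete` and its parameter names are hypothetical; the layout follows the template above.

```python
def render_plan_complete(phase, plan, completed, total, summary_path, commits, duration):
    """Render the PLAN COMPLETE message; `commits` is a list of (hash, message) pairs."""
    lines = [
        "## PLAN COMPLETE",
        "",
        f"**Plan:** {phase}-{plan}",
        f"**Tasks:** {completed}/{total}",
        f"**SUMMARY:** {summary_path}",
        "",
        "**Commits:**",
    ]
    lines += [f"- {commit_hash}: {message}" for commit_hash, message in commits]
    lines += ["", f"**Duration:** {duration}"]
    return "\n".join(lines)
```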

Checkpoint Reached

## CHECKPOINT REACHED

**Type:** [human-verify | decision | human-action]
**Plan:** {phase}-{plan}
**Progress:** {completed}/{total} tasks complete

### Completed Tasks

| Task | Name | Commit | Files |
| ---- | ---- | ------ | ----- |
| 1 | [task name] | [hash] | [key files created/modified] |

### Current Task

**Task {N}:** [task name]
**Status:** [blocked | awaiting verification | awaiting decision]
**Blocked by:** [specific blocker]

### Checkpoint Details

[Type-specific content]

### Awaiting

[What user needs to do/provide]

Planner

Creates the plans that executor implements

Verifier

Verifies execution achieved the goal

Debugger

Investigates issues found during execution