Debugging

GSD’s debugging system uses the scientific method with subagent isolation to investigate issues systematically while keeping your main context lean.

Why /gsd:debug Exists

Debugging in chat sessions burns context fast: you read files, form hypotheses, test theories, and accumulate investigation artifacts. By the time you find the root cause, your context is polluted. GSD isolates each investigation in a fresh subagent context and persists its state to a debug file, so the work carries over between sessions.

How It Works

Gather symptoms

The orchestrator asks structured questions about the issue:

Expected behavior
Actual behavior
Error messages
Timeline (when did it start?)
Reproduction steps
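
The five questions above map naturally onto a small record. The sketch below is only an illustration, not GSD's internal representation: the `Symptoms` class and its field names are assumptions, and the rendered layout simply mirrors the Symptoms section of the sample debug file shown under Debug File Structure.

```python
from dataclasses import dataclass, field


@dataclass
class Symptoms:
    """Hypothetical container for the orchestrator's five questions."""
    expected: str
    actual: str
    errors: str = "None reported"
    timeline: str = "Unknown"
    reproduction: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render a '### Symptoms' section in the style of a debug file."""
        lines = [
            "### Symptoms",
            f"- Expected: {self.expected}",
            f"- Actual: {self.actual}",
            f"- Timeline: {self.timeline}",
            f"- Errors: {self.errors}",
        ]
        # Reproduction steps are optional; list them only when provided.
        if self.reproduction:
            lines.append("- Reproduction:")
            lines.extend(
                f"  {i}. {step}" for i, step in enumerate(self.reproduction, 1)
            )
        return "\n".join(lines)


symptoms = Symptoms(
    expected="Login form submits, user redirected to dashboard",
    actual='Form submits, shows "Invalid credentials" even with correct password',
    errors="None in frontend, 401 from /api/auth/login",
    timeline="Started 2024-03-15 around 14:00 UTC",
)
print(symptoms.to_markdown())
```

Answering all five questions up front gives the debugger agent a complete starting point, so it spends less of its context asking for basics.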

Spawn debugger agent

A fresh gsd-debugger agent gets:

Full symptom context
200K clean context window
Instructions to write findings to .planning/debug/{slug}.md
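
How GSD derives `{slug}` from the issue description is not specified here. As a rough sketch under that caveat, a conventional slug rule could look like the following; `slugify` and `debug_file_for` are hypothetical helpers, and the real tool may shorten or otherwise normalize slugs differently.

```python
import re
from pathlib import Path

DEBUG_DIR = Path(".planning/debug")  # directory named in this guide


def slugify(description: str) -> str:
    """Assumed rule: lowercase, drop apostrophes, hyphenate everything else."""
    text = description.lower().replace("'", "").replace("’", "")
    return re.sub(r"[^a-z0-9]+", "-", text).strip("-")


def debug_file_for(description: str) -> Path:
    """Build the path of the debug file for an issue description."""
    return DEBUG_DIR / f"{slugify(description)}.md"


print(debug_file_for("API timeout errors").as_posix())
```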

Investigation

The debugger applies the scientific method:

Form hypothesis
Design test
Execute test
Record evidence
Confirm or refute
Repeat until root cause found
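
The loop above can be sketched in a few lines. This is a simplified model rather than the debugger's actual logic: the `Hypothesis` class, the `investigate` function, and the status strings are assumptions chosen to match the CONFIRMED/REFUTED labels in the sample debug file, and each test is stubbed as a callable that returns a verdict plus evidence.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Hypothesis:
    title: str
    # The test returns (supports_hypothesis, evidence).
    test: Callable[[], tuple[bool, str]]
    evidence: str = ""
    status: str = "PENDING"


def investigate(hypotheses: list[Hypothesis]) -> Optional[Hypothesis]:
    """Test hypotheses in order and stop at the first one confirmed."""
    for h in hypotheses:
        confirmed, h.evidence = h.test()  # execute test, record evidence
        h.status = "CONFIRMED" if confirmed else "REFUTED"
        if confirmed:
            return h
    return None  # nothing confirmed: the investigation is inconclusive


candidates = [
    Hypothesis("Password hashing mismatch",
               lambda: (False, "old passwords still verify")),
    Hypothesis("JWT secret rotated without session migration",
               lambda: (True, "JWT_SECRET changed in last deployment")),
]
root_cause = investigate(candidates)
print(root_cause.title if root_cause else "inconclusive")
```

Recording a status and evidence for refuted hypotheses matters as much as the confirmed one: a continuation agent can skip paths that were already ruled out.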

Checkpoints

When human input is needed, the debugger writes a checkpoint and returns. You respond, and a continuation agent spawns with the debug file as context.

Root cause or fix

Once the root cause is found, the debugger does one of the following:

Offers to spawn a fix agent immediately
Suggests creating a phase to address it properly
Leaves it for manual intervention

Basic Usage

Start new debug session

/gsd:debug "Users can't log in after password reset"

The system will ask follow-up questions to gather symptoms, then spawn the debugger.

Resume active session

/gsd:debug

If active sessions exist, GSD lists them with:

Current hypothesis
Evidence gathered
Next action

You can pick one to resume or start a new investigation.
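
Given the directory layout described under Debug File Structure, finding active sessions amounts to listing the top-level `.md` files and skipping `resolved/`. The sketch below assumes exactly that layout; `active_sessions` is a hypothetical helper, not a GSD function.

```python
import tempfile
from pathlib import Path


def active_sessions(debug_dir: Path) -> list[str]:
    """Return slugs of unresolved sessions (top-level .md files only)."""
    if not debug_dir.is_dir():
        return []
    # glob("*.md") is non-recursive, so files under resolved/ are skipped.
    return sorted(p.stem for p in debug_dir.glob("*.md") if p.is_file())


# Demonstrate against a throwaway copy of the documented layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / ".planning" / "debug"
    (root / "resolved").mkdir(parents=True)
    (root / "users-cant-login.md").write_text("## Issue\n")
    (root / "api-timeout-errors.md").write_text("## Issue\n")
    (root / "resolved" / "memory-leak-dashboard.md").write_text("## Issue\n")
    found = active_sessions(root)

print(found)
```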

Debug File Structure

Each debug session creates a persistent file:

.planning/debug/
  users-cant-login.md          # Active investigation
  api-timeout-errors.md        # Active investigation
  resolved/
    memory-leak-dashboard.md   # Archived after resolution

The debug file contains:

## Issue: Users Can't Log In

### Symptoms
- Expected: Login form submits, user redirected to dashboard
- Actual: Form submits, shows "Invalid credentials" even with correct password
- Timeline: Started 2024-03-15 around 14:00 UTC
- Errors: None in frontend, 401 from /api/auth/login

### Investigation

#### Hypothesis 1: Password hashing mismatch
Test: Check if recent password hash algorithm change affects verification
Evidence: bcrypt rounds increased from 10 to 12 in commit abc123
Status: REFUTED - old passwords still verify correctly

#### Hypothesis 2: JWT secret rotated without user session migration
Test: Check if JWT_SECRET environment variable changed
Evidence: JWT_SECRET changed in deployment yesterday, sessions not invalidated
Status: CONFIRMED

### Root Cause
JWT_SECRET was rotated but existing sessions weren't cleared. Users with old JWTs can't authenticate.

### Fix
Invalidate all sessions and force re-login on next request.
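
Because the debug file is plain markdown, its state is easy to read back programmatically. The following sketch extracts each hypothesis and its status; it assumes the exact heading and `Status:` conventions of the sample above, which the real files may not follow strictly, and `hypothesis_statuses` is a hypothetical helper.

```python
import re


def hypothesis_statuses(debug_markdown: str) -> dict[str, str]:
    """Map each hypothesis title to its status (PENDING if none recorded)."""
    statuses: dict[str, str] = {}
    current = None
    for line in debug_markdown.splitlines():
        heading = re.match(r"#### Hypothesis \d+: (.+)", line)
        if heading:
            current = heading.group(1).strip()
            statuses[current] = "PENDING"
        elif current and line.startswith("Status:"):
            # Keep only the first word: "REFUTED - ..." becomes "REFUTED".
            words = line[len("Status:"):].split()
            if words:
                statuses[current] = words[0]
    return statuses


sample = """\
#### Hypothesis 1: Password hashing mismatch
Test: Check if recent password hash algorithm change affects verification
Status: REFUTED - old passwords unaffected

#### Hypothesis 2: JWT secret rotated without user session migration
Status: CONFIRMED
"""
result = hypothesis_statuses(sample)
print(result)
```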

Checkpoint Types

The debugger may return checkpoints for human input:

Checkpoint Type	When Used	Example
`human-verify`	Need you to test something	"Can you try logging in now?"
`clarification`	Need more context	"What's the exact error message?"
`permission`	Need approval to proceed	"This requires database access. Proceed?"
`decision`	Multiple paths forward	"Fix now or plan properly?"
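
The four checkpoint types form a closed set, which an enum models well. This sketch is illustrative only: the `CheckpointType` enum and `parse_checkpoint_type` are assumptions, and the `Type:` line format is taken from the checkpoint return shown under Handling Debugger Returns.

```python
from enum import Enum


class CheckpointType(Enum):
    HUMAN_VERIFY = "human-verify"
    CLARIFICATION = "clarification"
    PERMISSION = "permission"
    DECISION = "decision"


def parse_checkpoint_type(line: str) -> CheckpointType:
    """Parse a 'Type: human-verify' line; raise ValueError if unrecognized."""
    label, _, value = line.partition(":")
    if label.strip().lower() != "type":
        raise ValueError(f"not a checkpoint type line: {line!r}")
    return CheckpointType(value.strip())  # unknown values raise ValueError


print(parse_checkpoint_type("Type: human-verify"))
```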

Handling Debugger Returns

A debugger run ends in one of three ways: root cause found, checkpoint reached, or investigation inconclusive. An example of each return is shown below.

## ROOT CAUSE FOUND

JWT_SECRET was rotated without invalidating existing sessions.
Users with old tokens can't authenticate.

Evidence: Environment variable changed in production deployment.

Options:
1. Fix now - spawn fix agent
2. Plan fix - create phase for proper migration
3. Manual fix - you handle it

GSD offers next steps based on severity and complexity.

## CHECKPOINT REACHED

Type: human-verify

I've added session invalidation logic to the login endpoint.
Can you try logging in now and confirm it works?

You respond, then GSD spawns a continuation agent that:

Reads the debug file
Sees your response
Continues from where it left off

## INVESTIGATION INCONCLUSIVE

Checked:
- Password hashing (no issues)
- JWT configuration (looks correct)
- Database connectivity (healthy)

Unable to reproduce the issue in development.

Options:
1. Continue investigating with more context
2. Manual investigation recommended
3. Add logging and monitor production

Sometimes issues are environmental or intermittent.
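
Each return starts with a distinctive header, so an orchestrator could route on it. The sketch below assumes the three headers and the numbered `Options:` list appear exactly as in the examples above; `classify_return` and `extract_options` are hypothetical names, not part of GSD.

```python
import re

RETURN_HEADERS = {
    "## ROOT CAUSE FOUND": "root_cause",
    "## CHECKPOINT REACHED": "checkpoint",
    "## INVESTIGATION INCONCLUSIVE": "inconclusive",
}


def classify_return(message: str) -> str:
    """Identify the kind of debugger return from its header line."""
    for line in message.splitlines():
        kind = RETURN_HEADERS.get(line.strip())
        if kind:
            return kind
    return "unknown"


def extract_options(message: str) -> list[str]:
    """Collect the numbered choices listed directly under 'Options:'."""
    options: list[str] = []
    in_options = False
    for line in message.splitlines():
        if line.strip() == "Options:":
            in_options = True
        elif in_options:
            match = re.match(r"\s*\d+\.\s+(.*)", line)
            if not match:
                break  # the numbered list has ended
            options.append(match.group(1))
    return options


message = """\
## ROOT CAUSE FOUND

JWT_SECRET was rotated without invalidating existing sessions.

Options:
1. Fix now - spawn fix agent
2. Plan fix - create phase for proper migration
3. Manual fix - you handle it
"""
print(classify_return(message), extract_options(message))
```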

Continuation Flow

When you respond to a checkpoint:

# Debugger asks: "Can you try logging in now?"
> Yes, it works!

# GSD spawns continuation agent
Task(
  prompt="Continue debugging users-cant-login. Evidence in debug file.",
  checkpoint_response="Yes, it works!",
  subagent_type="gsd-debugger"
)

The continuation agent:

Reads .planning/debug/users-cant-login.md
Sees the checkpoint
Processes your response
Decides whether to finalize/resolve or continue investigating

Modes

The debugger has two modes:

`find_and_fix` (default)

Investigates and fixes if the fix is simple and safe:

Single-file changes
Config adjustments
Clear, low-risk fixes

For complex fixes, it recommends planning a proper phase.

`investigate_only`

Investigates but never touches code. Use when:

You want diagnosis only
The fix requires careful planning
Multiple services are involved
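
The difference between the two modes can be summarized as a small decision rule. The sketch below turns the bullet points above into explicit conditions purely for illustration; the real debugger presumably exercises judgment rather than checking fixed thresholds, and `ProposedFix`, its fields, and `next_action` are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class ProposedFix:
    files_touched: int
    config_only: bool = False
    low_risk: bool = True


def next_action(mode: str, fix: ProposedFix) -> str:
    """Decide what happens after diagnosis under the given mode."""
    if mode == "investigate_only":
        return "report"  # diagnosis only; code is never modified
    if mode != "find_and_fix":
        raise ValueError(f"unknown mode: {mode!r}")
    # Assumed reading of "simple and safe": single file or config, low risk.
    simple = fix.config_only or fix.files_touched <= 1
    return "apply_fix" if simple and fix.low_risk else "recommend_phase"


print(next_action("find_and_fix", ProposedFix(files_touched=1)))
print(next_action("find_and_fix", ProposedFix(files_touched=6)))
print(next_action("investigate_only", ProposedFix(files_touched=1)))
```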

Examples

/gsd:debug "500 errors on /api/users endpoint"

# Symptoms:
# Expected: Returns user list
# Actual: 500 Internal Server Error
# Timeline: Started 30 minutes ago
# Errors: "Cannot read property 'map' of undefined"

# Debugger investigates, finds null check missing
# Offers to fix immediately

Best Practices

Do:

Be specific when describing symptoms
Include error messages verbatim
Note timeline (when did it start?)
Provide reproduction steps if known
Let the debugger finish before trying fixes

Don’t:

Start debugging in the main chat (wastes context)
Assume the root cause (let the debugger investigate)
Fix without understanding (may mask deeper issues)
Ignore checkpoints (the debugger needs feedback)

When to Use /gsd:debug vs /gsd:quick

Integration with Phases

If debugging reveals a complex issue:

# Debug finds root cause
/gsd:debug "Memory leak in background jobs"

# Debugger recommends:
> This requires refactoring job queue architecture.
> Recommend: /gsd:add-phase "Refactor background job system"

# Add to roadmap
/gsd:add-phase "Refactor background job system"

# Plan properly
/gsd:discuss-phase 14
/gsd:plan-phase 14
/gsd:execute-phase 14

The debug file becomes part of the phase research context.

Model Selection

The debugger agent respects your model profile:

Profile	Debugger Model
quality	Opus
balanced	Sonnet
budget	Sonnet

Debugging gets Opus in quality mode because investigation quality matters more than speed.

Archival

When a debug session resolves:

# Debugger marks session resolved
## STATUS: RESOLVED

Root cause: JWT_SECRET rotation without session migration
Fix: Added session invalidation on next request
Commit: abc123f

# File moved to resolved/
mv .planning/debug/users-cant-login.md \
   .planning/debug/resolved/users-cant-login.md

Resolved sessions are kept for reference and postmortems.
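
The `mv` step above is handled by GSD, but the same move is easy to express in a script if you ever need to archive a session by hand. This is a minimal sketch assuming the directory layout shown earlier; `archive_session` is a hypothetical helper, and it refuses to overwrite an existing archive.

```python
import shutil
import tempfile
from pathlib import Path


def archive_session(debug_dir: Path, slug: str) -> Path:
    """Move <slug>.md into resolved/ and return its new path."""
    source = debug_dir / f"{slug}.md"
    if not source.is_file():
        raise FileNotFoundError(source)
    resolved = debug_dir / "resolved"
    resolved.mkdir(exist_ok=True)
    target = resolved / source.name
    if target.exists():
        raise FileExistsError(target)  # never clobber an earlier archive
    shutil.move(str(source), str(target))
    return target


# Demonstrate in a temporary directory rather than a real project.
with tempfile.TemporaryDirectory() as tmp:
    debug_dir = Path(tmp) / ".planning" / "debug"
    debug_dir.mkdir(parents=True)
    (debug_dir / "users-cant-login.md").write_text("## STATUS: RESOLVED\n")
    archived = archive_session(debug_dir, "users-cant-login")
    still_active = (debug_dir / "users-cant-login.md").exists()
    archived_text = archived.read_text()

print(archived.name, still_active)
```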

Context Efficiency

The debug orchestrator stays lean:

Gathers symptoms - Structured questions
Spawns agent - Fresh 200K context
Receives checkpoint - Brief summary only
Spawns continuation - Fresh context again

Your main session never sees the investigation details, only brief checkpoint summaries and the final result.

Debug sessions can spawn multiple continuation agents if needed. Each gets a fresh context, ensuring investigation quality never degrades.
