Overview
The systematic-debugging skill provides a structured, evidence-based approach to debugging that prevents random guessing and ensures a problem is properly understood before it is fixed. It emphasizes reproducibility, isolation, root cause analysis, and verification.
What This Skill Provides
- 4-phase debugging process: Reproduce → Isolate → Understand → Fix & Verify
- Root cause analysis: 5 Whys technique to find true causes, not symptoms
- Evidence-based investigation: Using logs, traces, and systematic testing
- Fix verification: Ensuring the fix works and doesn’t introduce new issues
- Debugging checklists: Before, during, and after investigation
4-Phase Debugging Process
Phase 1: Reproduce
Before fixing, reliably reproduce the issue.
Checklist:
- Document exact reproduction steps
- Identify reproduction rate (Always/Often/Sometimes/Rare)
- Note expected vs actual behavior
- Capture error messages and stack traces
Reproduction Rates:
- Always (100%)
- Often (50-90%)
- Sometimes (10-50%)
- Rare (under 10%)
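The reproduction rate can be measured rather than estimated. The following is a minimal sketch, not a prescribed tool: `classify_rate` and `measure_reproduction` are hypothetical helper names, the scenario is assumed to signal the bug by raising an exception, and rates between 90% and 100% (which the list above leaves unassigned) are filed under "Often" as an assumption.

```python
from typing import Callable


def classify_rate(failures: int, attempts: int) -> str:
    """Map an observed failure count onto the rate labels listed above."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    rate = failures / attempts
    if rate == 1.0:
        return "Always"
    if rate >= 0.5:
        # Assumption: anything from 50% up to (but not including) 100% is "Often".
        return "Often"
    if rate >= 0.1:
        return "Sometimes"
    if rate > 0:
        return "Rare"
    return "Not reproduced"


def measure_reproduction(scenario: Callable[[], None], attempts: int = 20) -> str:
    """Run a reproduction scenario repeatedly and count how often it raises."""
    failures = 0
    for _ in range(attempts):
        try:
            scenario()
        except Exception:
            failures += 1
    return classify_rate(failures, attempts)
```

A small number of attempts gives only a rough label; for bugs that look rare, a larger `attempts` value is usually needed before the label means much.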
Phase 2: Isolate
Narrow down the source.
Isolation Questions:
- When did this start happening?
- What changed recently?
- Does it happen in all environments?
- Can we reproduce with minimal code?
- What’s the smallest change that triggers it?
Isolation Techniques:
- Binary search (comment out half the code)
- Remove dependencies one by one
- Test in fresh environment
- Compare working vs broken versions
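Binary search applies to history as well as to code: given an ordered list of changes where the oldest is good and the newest is bad, halving the range finds the first bad change in a logarithmic number of checks. The sketch below assumes the bug, once introduced, stays present in every later change; `first_bad` and `is_bad` are hypothetical names. `git bisect` performs the same kind of search over commits.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad(changes: Sequence[T], is_bad: Callable[[T], bool]) -> T:
    """Return the earliest change for which is_bad() is true.

    Assumes changes are ordered oldest to newest, the last one is bad, and
    the pattern is monotonic: good, good, ..., bad, bad.
    """
    if not changes:
        raise ValueError("need at least one change")
    low, high = 0, len(changes) - 1  # invariant: changes[high] is bad
    while low < high:
        mid = (low + high) // 2
        if is_bad(changes[mid]):
            high = mid       # bug already present: culprit is mid or earlier
        else:
            low = mid + 1    # still good: culprit comes later
    return changes[low]
```

If the bug is intermittent, `is_bad` should run the reproduction several times before answering, otherwise a single lucky pass can send the search in the wrong direction.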
Phase 3: Understand
Find the root cause, not just symptoms.
The 5 Whys Technique:
- Why: [First observation]
- Why: [Deeper reason]
- Why: [Still deeper]
- Why: [Getting closer]
- Why: [Root cause]
Example:
- Why is the page slow? → API call takes 3 seconds
- Why does the API call take 3 seconds? → Database query is slow
- Why is the query slow? → Missing index on user_id
- Why is the index missing? → Migration didn’t run
- Why didn’t the migration run? → Deployment script skipped migrations
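Recording the chain as data makes it easy to paste into a bug report and to see where the reasoning currently ends. This is only an illustrative sketch; the `Why` class and both helper functions are hypothetical names, and the chain reuses the slow-page example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Why:
    question: str
    answer: str


def root_cause(chain: List[Why]) -> str:
    """The answer to the last 'why' is the current best root cause."""
    if not chain:
        raise ValueError("the chain is empty")
    return chain[-1].answer


def format_chain(chain: List[Why]) -> str:
    """Render the chain as numbered lines for a bug report."""
    return "\n".join(
        f"{i}. {w.question} -> {w.answer}" for i, w in enumerate(chain, 1)
    )


slow_page = [
    Why("Why is the page slow?", "API call takes 3 seconds"),
    Why("Why does the API call take 3 seconds?", "Database query is slow"),
    Why("Why is the query slow?", "Missing index on user_id"),
    Why("Why is the index missing?", "Migration didn't run"),
    Why("Why didn't the migration run?", "Deployment script skipped migrations"),
]
```

Five is a convention, not a limit: the chain stops when the last answer is something that can be fixed and would have prevented every step above it.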
Phase 4: Fix & Verify
Fix and verify it’s truly fixed.
Fix Verification Checklist:
- Bug no longer reproduces
- Related functionality still works
- No new issues introduced
- Test added to prevent regression
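A regression test can cover the first, second, and fourth checklist items at once. The sketch below uses the standard-library `unittest` module; `average` is a hypothetical function assumed to have crashed on empty input before the fix, and returning `0.0` for that case is an assumed design decision, not a rule.

```python
import unittest


def average(values):
    """Hypothetical function under repair: it used to divide by zero on []."""
    if not values:
        return 0.0  # the fix: handle the empty case explicitly
    return sum(values) / len(values)


class AverageRegressionTest(unittest.TestCase):
    def test_empty_input_no_longer_crashes(self):
        # Bug no longer reproduces.
        self.assertEqual(average([]), 0.0)

    def test_existing_behaviour_is_unchanged(self):
        # Related functionality still works.
        self.assertEqual(average([2, 4, 6]), 4.0)
```

Saved in a test module, this can be run with `python -m unittest`. Ideally the first test is written before the fix and seen to fail, which confirms that it really exercises the bug.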
Debugging Checklists
Before Starting
- Can reproduce consistently
- Have minimal reproduction case
- Understand expected behavior
During Investigation
- Check recent changes (git log)
- Check logs for errors
- Add logging if needed
- Use debugger/breakpoints
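When existing logs do not show enough, temporary debug logging around the suspect code records inputs and outputs without changing behavior. A minimal sketch with the standard `logging` module follows; the logger name `checkout` and the function `apply_discount` are hypothetical stand-ins for the code under investigation.

```python
import logging

logger = logging.getLogger("checkout")  # hypothetical module name


def apply_discount(total: float, percent: float) -> float:
    """Hypothetical function under investigation."""
    logger.debug("apply_discount called: total=%r percent=%r", total, percent)
    discounted = total * (1 - percent / 100)
    logger.debug("apply_discount result: %r", discounted)
    return discounted


def enable_debug_logging() -> None:
    """Turn on timestamped debug output for the duration of the investigation."""
    logging.basicConfig(
        level=logging.DEBUG,
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
    )
```

Logging the arguments with `%r` shows their exact representation, which helps spot values such as `"10"` arriving where `10` was expected.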
After Fix
- Root cause documented
- Fix verified
- Regression test added
- Similar code checked
Common Debugging Commands
Use Cases
- Debugging complex production issues
- Investigating intermittent bugs
- Solving performance problems
- Analyzing crash dumps
- Root cause analysis for system failures
- Debugging race conditions
- Tracking down memory leaks
Anti-Patterns to Avoid
❌ Random changes - “Maybe if I change this…”
❌ Ignoring evidence - “That can’t be the cause”
❌ Assuming - “It must be X” without proof
❌ Not reproducing first - Fixing blindly
❌ Stopping at symptoms - Not finding root cause
Root Cause vs Symptom
Symptom: Login button doesn’t work
Immediate cause: API returns 401
Root cause: JWT token expired, refresh logic not implemented
Symptom: Page loads slowly
Immediate cause: API call takes 3 seconds
Root cause: Missing database index
Symptom: App crashes on startup
Immediate cause: Null pointer exception
Root cause: Config file not loaded, environment variable missing
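One way to shorten the distance between symptom and root cause is to fail at the point where the root cause occurs. The sketch below illustrates this for the startup-crash case, under the assumption that the missing setting is read from an environment variable; `DATABASE_URL`, `MissingConfigError`, and `load_database_url` are hypothetical names.

```python
import os
from typing import Mapping, Optional


class MissingConfigError(RuntimeError):
    """Raised at startup when a required setting is absent."""


def load_database_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Read a required setting, failing with a message that names the root cause."""
    env = os.environ if env is None else env
    value = env.get("DATABASE_URL")  # hypothetical variable name
    if not value:
        # Without this check, the missing value would surface later as a
        # null-reference style error far away from the real problem.
        raise MissingConfigError("DATABASE_URL is not set; check the deployment environment")
    return value
```

The error message then states the root cause directly instead of leaving it to be reconstructed from a stack trace.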
Evidence-Based Investigation
Gather evidence:
- Error messages and stack traces
- Application logs
- Network requests (browser DevTools)
- Database query logs
- System metrics (CPU, memory)
- Git history
Form a hypothesis:
- Based on evidence, not guesses
- Testable prediction
- Single variable change
Test the hypothesis:
- Change one thing at a time
- Document results
- Keep or discard hypothesis
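The rule of changing one thing at a time can be checked mechanically when each experiment is described as a set of settings. This is a sketch under that assumption; `changed_keys`, `single_changed_variable`, and the example settings are hypothetical.

```python
_MISSING = object()  # distinguishes "key absent" from "value is None"


def changed_keys(baseline: dict, experiment: dict) -> set:
    """Return the names of settings that differ between two runs."""
    keys = set(baseline) | set(experiment)
    return {
        k for k in keys
        if baseline.get(k, _MISSING) != experiment.get(k, _MISSING)
    }


def single_changed_variable(baseline: dict, experiment: dict) -> str:
    """Return the one setting that changed, or refuse the experiment."""
    diff = changed_keys(baseline, experiment)
    if len(diff) != 1:
        raise ValueError(
            f"expected exactly one changed variable, found {sorted(diff)}"
        )
    return next(iter(diff))
```

For example, comparing `{"cache": True, "pool_size": 10}` with `{"cache": False, "pool_size": 10}` returns `"cache"`, while changing both settings at once raises `ValueError`, because the result could not be attributed to either one.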
Related Skills
- Plan Writing - Structured debugging plans
- Architecture - Understanding system design for debugging
- Performance Profiling - Performance debugging
- Vulnerability Scanner - Security issue debugging
Which Agents Use This Skill
- debugger - Primary agent for systematic debugging
- Other agents reference this skill when encountering complex bugs during their work
Debugging Patterns by Issue Type
Production Crash
- Check error monitoring (Sentry, etc.)
- Get stack trace and error message
- Check deployment logs
- Compare with last working version
- Reproduce in staging
Performance Issue
- Measure baseline performance
- Profile the application
- Identify bottlenecks
- Isolate slow component
- Optimize and measure again
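Measuring before and after with the same harness keeps the comparison honest. The sketch below takes the median of several runs using `time.perf_counter`; `measure`, `slow_lookup`, and `fast_lookup` are hypothetical names, and the list-versus-set lookup merely stands in for a real bottleneck and its optimization.

```python
import statistics
import time
from typing import Callable

ITEMS = list(range(2000))
ITEM_SET = set(ITEMS)
QUERIES = range(0, 4000, 7)


def measure(func: Callable[[], object], repeats: int = 5) -> float:
    """Median wall-clock seconds over several runs of func()."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def slow_lookup() -> int:
    """Baseline: membership test against a list scans it linearly."""
    return sum(1 for q in QUERIES if q in ITEMS)


def fast_lookup() -> int:
    """Candidate fix: the same count using a set."""
    return sum(1 for q in QUERIES if q in ITEM_SET)
```

Comparing `measure(slow_lookup)` with `measure(fast_lookup)` gives the before and after figures, and checking that both functions return the same count confirms the optimization did not change behavior. To find where time goes in the first place, the standard-library `cProfile` module is one option.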
Intermittent Bug
- Increase logging
- Look for timing/race conditions
- Check for external dependencies
- Test under load
- Add telemetry
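Timing bugs often appear only under contention, so a small stress harness that calls the suspect code from several threads can raise the reproduction rate. The sketch below is an assumption-laden illustration: `hammer`, `UnsafeCounter`, and `SafeCounter` are hypothetical names, and the unsafe version may or may not lose updates on any given run.

```python
import threading
from typing import Callable


class UnsafeCounter:
    """Read-modify-write without a lock: updates may be lost under contention."""

    def __init__(self) -> None:
        self.value = 0

    def increment(self) -> None:
        current = self.value
        self.value = current + 1


class SafeCounter:
    """Same operation, guarded by a lock so each update is atomic."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        with self._lock:
            self.value += 1


def hammer(action: Callable[[], None], threads: int = 8,
           calls_per_thread: int = 10_000) -> None:
    """Call action() from several threads at once to provoke timing bugs."""
    def worker() -> None:
        for _ in range(calls_per_thread):
            action()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
```

After `hammer(counter.increment)`, the expected total is `threads * calls_per_thread`. A shortfall with `UnsafeCounter` is evidence of a race; a correct total on one run is not proof of its absence.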
Integration Issue
- Test components in isolation
- Check API contracts
- Verify data formats
- Test with mock data
- Check authentication/authorization
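Checking a payload against the expected contract turns "verify data formats" into a repeatable step. The following is a minimal sketch, not a replacement for a schema library; `EXPECTED_USER_FIELDS` and `contract_violations` are hypothetical, and the contract here covers only field presence and top-level types.

```python
from typing import Dict, List

# Hypothetical contract: field name -> expected Python type.
EXPECTED_USER_FIELDS: Dict[str, type] = {"id": int, "email": str, "active": bool}


def contract_violations(payload: dict, contract: Dict[str, type]) -> List[str]:
    """List every way the payload departs from the contract (empty means OK)."""
    problems = []
    for field, expected_type in contract.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            actual = type(payload[field]).__name__
            problems.append(
                f"{field}: expected {expected_type.__name__}, got {actual}"
            )
    return problems
```

Running the same check against mock data and against a captured real response shows whether the mismatch lies in the producer or the consumer. Note that `isinstance` treats `bool` as an `int`, so a stricter check may be needed where that distinction matters.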
Verification Techniques
- Unit tests: Test the specific fix
- Integration tests: Test related functionality
- Regression tests: Prevent bug from returning
- Manual testing: Verify in real environment
- Load testing: Ensure fix works under load
