Builtin Skills

Warden includes several builtin skills in the .agents/skills/ directory. These serve as examples and can be used directly in your projects.

find-warden-bugs

Purpose: Detect bugs at Warden’s architectural seams, based on 40+ historical fix commits.
Location: .agents/skills/find-warden-bugs/SKILL.md

What It Detects

This skill targets recurring bug patterns in Warden’s architecture:

Check 1: SDK Response Shape Assumptions

Severity: High

Claude SDK responses have specific shapes that have caused repeated issues. The skill detects:

Accessing response.content[0] without checking array length or block type
Accessing msg.usage.input_tokens without null check on usage
Type predicates that silently filter unknown content types
Accessing cache_read_input_tokens without handling null
Parsing SDKResultMessage without checking is_error or subtype

Example vulnerability:

// Unsafe - content could be empty
const text = response.content[0].text;

// Safe - confirms the first block exists and is a text block
if (response.content[0]?.type === 'text') {
  const text = response.content[0].text;
}
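The red flags in this check can be handled together in a pair of defensive helpers. The following is a minimal sketch, not Warden's code: the `ContentBlock`, `Usage`, and `Message` types and the `extractText` and `totalInputTokens` helpers are hypothetical, simplified stand-ins for the SDK's response shape.

```typescript
// Hypothetical, simplified shapes; the real SDK types are richer.
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string };

interface Usage {
  input_tokens: number;
  cache_read_input_tokens?: number | null;
}

interface Message {
  content: ContentBlock[];
  usage?: Usage | null;
}

// Join all text blocks and report the block types that were skipped,
// so unknown content is surfaced instead of silently filtered out.
function extractText(msg: Message): { text: string; skipped: string[] } {
  const parts: string[] = [];
  const skipped: string[] = [];
  for (const block of msg.content) {
    if (block.type === 'text') parts.push(block.text);
    else skipped.push(block.type);
  }
  return { text: parts.join(''), skipped };
}

// Treat missing usage and null cache counters as zero rather than throwing.
function totalInputTokens(msg: Message): number {
  const usage = msg.usage;
  if (!usage) return 0;
  return usage.input_tokens + (usage.cache_read_input_tokens ?? 0);
}
```

Returning the skipped block types alongside the text keeps the caller aware of content it did not handle.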

Check 2: Dual Code Path Desync

Severity: High

Warden has two paths that build SkillReport objects:

runSkill() in src/sdk/analyze.ts (SDK/action)
runSkillTask() in src/cli/output/tasks.ts (CLI)

The skill detects when changes to one path aren’t mirrored in the other:

Adding fields to SkillReport in only one path
Different post-processing logic
Inconsistent error handling

Example issue:

// src/sdk/analyze.ts adds new field
return { ...baseReport, newField: value };

// But src/cli/output/tasks.ts doesn't
return { ...baseReport }; // Missing newField!
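One way to avoid this kind of drift is to route both paths through a single builder. The sketch below assumes a trimmed-down `SkillReport` shape; `RawResult`, `buildSkillReport`, `reportFromSdkPath`, and `reportFromCliPath` are hypothetical names used for illustration and do not reflect Warden's actual functions.

```typescript
// Hypothetical, trimmed-down report shape for illustration only.
interface SkillReport {
  skill: string;
  findings: string[];
  durationMs: number;
}

interface RawResult {
  skill: string;
  findings: string[];
  startedAt: number;
  finishedAt: number;
}

// Single place where a report is assembled. A field added here reaches
// every caller, so the two paths cannot fall out of sync.
function buildSkillReport(raw: RawResult): SkillReport {
  return {
    skill: raw.skill,
    findings: [...raw.findings],
    durationMs: raw.finishedAt - raw.startedAt,
  };
}

// Stand-ins for the SDK/action path and the CLI path.
function reportFromSdkPath(raw: RawResult): SkillReport {
  return buildSkillReport(raw);
}

function reportFromCliPath(raw: RawResult): SkillReport {
  return buildSkillReport(raw);
}
```

Path-specific post-processing can still happen after the shared builder returns, as long as the report fields themselves are assembled in one place.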

Check 3: Config Threading & Default Semantics

Severity: High

Config flows through a 3-level merge chain. The skill detects:

Breaking merge precedence (trigger > skill > defaults)
Using || when ?? is needed (0/false/"" are valid)
New config fields not threaded through resolveSkillConfigs()
emptyToUndefined() not applied to GitHub Actions inputs

Example bug:

// Wrong - treats 0 as falsy
const limit = config.limit || 100;

// Right - treats 0 as valid
const limit = config.limit ?? 100;
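The same operator carries the whole merge chain. Below is a minimal sketch of a three-level merge with trigger > skill > defaults precedence; the `SkillConfig` fields, `resolveConfig`, and `blankInputToUndefined` are hypothetical and are not Warden's `resolveSkillConfigs()` or `emptyToUndefined()`.

```typescript
// Hypothetical config shape; every field is optional at each level.
interface SkillConfig {
  limit?: number;
  failOnFindings?: boolean;
  model?: string;
}

// Resolve one field at a time with ??, so 0, false, and "" set at a
// higher-precedence level are kept instead of falling through.
function resolveConfig(
  trigger: SkillConfig,
  skill: SkillConfig,
  defaults: Required<SkillConfig>,
): Required<SkillConfig> {
  return {
    limit: trigger.limit ?? skill.limit ?? defaults.limit,
    failOnFindings:
      trigger.failOnFindings ?? skill.failOnFindings ?? defaults.failOnFindings,
    model: trigger.model ?? skill.model ?? defaults.model,
  };
}

// GitHub Actions reports unset inputs as empty strings. Mapping them to
// undefined lets ?? fall through to the next level. This is an assumed
// behavior for illustration, not Warden's implementation.
function blankInputToUndefined(value: string): string | undefined {
  return value === '' ? undefined : value;
}
```

A trigger that sets `limit` to 0 keeps that value here, whereas the `||` version would replace it with the skill-level or default value.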

Check 4: Concurrent Task & Ink Rendering

Severity: High

Skills run concurrently via runPool() while Ink renders a live UI. The skill detects:

Mutating shared state from callbacks without synchronization
Sort comparators accessing external mutable state
Writing to process.stderr while Ink is rendering
Not checking shouldAbort() after semaphore acquisition

Example race:

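// Unsafe - sort() reorders the shared array in place while callbacks may still mutate it
// (the severity comparator is illustrative)
const sorted = findings.sort((a, b) => a.severity.localeCompare(b.severity));

// Safe - snapshot first, then sort the copy
const sorted = [...findings].sort((a, b) => a.severity.localeCompare(b.severity));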
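The abort check after semaphore acquisition can be sketched as follows. This is an illustration under stated assumptions: the `Semaphore` class and `runGuarded` helper are hypothetical, and only the `shouldAbort` name is borrowed from the list above; how runPool() actually implements this is not shown here.

```typescript
// Minimal counting semaphore; a hypothetical stand-in for whatever
// the real task pool uses internally.
class Semaphore {
  private permits: number;
  private waiters: Array<() => void> = [];

  constructor(permits: number) {
    this.permits = permits;
  }

  async acquire(): Promise<void> {
    if (this.permits > 0) {
      this.permits--;
      return;
    }
    // No permit available: wait until release() hands one over.
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) next(); // pass the permit directly to the next waiter
    else this.permits++;
  }
}

async function runGuarded<T>(
  sem: Semaphore,
  shouldAbort: () => boolean,
  task: () => Promise<T>,
): Promise<T | undefined> {
  await sem.acquire();
  try {
    // The abort flag may have flipped while this task was queued,
    // so it is checked again before any work starts.
    if (shouldAbort()) return undefined;
    return await task();
  } finally {
    sem.release();
  }
}
```

Releasing in `finally` returns the permit on the abort path as well, so an aborted task does not starve the tasks queued behind it.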

Check 5: Output Rendering Consistency

Severity: Medium

Warden renders in multiple formats (terminal, JSON, JSONL, GitHub checks). The skill detects:

Display filtering applied before JSON serialization
--json flag short-circuiting before all findings collected
Reading log files that weren’t verified to exist
GitHub annotations built from filtered findings

Example issue:

// Wrong - filters before JSON output
const displayed = findings.filter(f => f.severity === 'high');
if (opts.json) return JSON.stringify(displayed);

// Right - serialize every finding, filter only for display
if (opts.json) return JSON.stringify(findings);
const displayed = findings.filter(f => f.severity === 'high');

Other Checks

The skill includes additional checks for:

Check 6: Scope & filtering logic (hunk line validation)
Check 7: Early-exit path completeness (cleanup, output writes)
Check 8: State tracking accuracy (counting operations correctly)
Check 9: Error context & control flow (preserving error types)

Usage

warden.toml

[[skills]]
name = "find-warden-bugs"
paths = ["src/**/*.ts"]
ignorePaths = ["src/**/*.test.ts"]

[[skills.triggers]]
type = "pull_request"
actions = ["opened", "synchronize"]

Lessons for Your Skills

Check-based structure: Each check targets a specific historical pattern
Zone classification: Only run relevant checks based on file paths
Safe patterns section: Reduces false positives
Historical context: “Historical commits: 8+” shows this is a real problem
Severity tied to impact: High = normal usage breaks, Medium = edge cases

architecture-review

Purpose: Staff-level codebase health review. Finds structural issues that compound over time.
Location: .agents/skills/architecture-review/SKILL.md

What It Analyzes

1. Module Complexity

Finds files that have grown too large or do too much:

Files >500 lines (investigate >800)
Modules with >3 distinct responsibilities
High fan-out (importing from 10+ modules)

Proposes splits with specific new file names and responsibilities.

2. Silent Failure Patterns

Code that fails without indication:

Catch blocks returning defaults without logging
Functions returning [] or null where caller can’t distinguish error from empty
Missing error callbacks on async operations
Silent fallbacks hiding upstream problems
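A common remedy for the "error or empty?" ambiguity is a result type that makes failure explicit. The sketch below is illustrative only; `Result` and `parseFindings` are hypothetical names, not part of Warden.

```typescript
// A discriminated union lets callers tell "no results" apart from "failed".
type Result<T> =
  | { ok: true; value: T }
  | { ok: false; error: Error };

// Hypothetical parser: returns an error result instead of a silent [].
function parseFindings(json: string): Result<string[]> {
  try {
    const parsed: unknown = JSON.parse(json);
    if (!Array.isArray(parsed) || !parsed.every((x) => typeof x === 'string')) {
      return { ok: false, error: new Error('expected a JSON array of strings') };
    }
    return { ok: true, value: parsed };
  } catch (err) {
    // Preserve the original error rather than swallowing it.
    const error = err instanceof Error ? err : new Error(String(err));
    return { ok: false, error };
  }
}
```

With this shape, an empty array means the input really was empty, and the caller has to handle the `ok: false` branch before it can read a value.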

3. Type Safety Gaps

Places where TypeScript safety is bypassed:

as SomeType without runtime validation
Regex match assertions without checking capture groups
Optional chaining (?.) hiding null sources
Generic index access: obj[key]
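Two of these gaps, unchecked `as` casts and unchecked regex captures, have simple runtime-checked alternatives. This is a minimal sketch assuming TypeScript 4.9 or later (for `in` narrowing); the `Finding` shape, `isFinding`, and `parseLocation` are hypothetical.

```typescript
interface Finding {
  file: string;
  line: number;
}

// Runtime type guard: validates the shape instead of asserting it with `as`.
function isFinding(value: unknown): value is Finding {
  if (typeof value !== 'object' || value === null) return false;
  return (
    'file' in value &&
    typeof value.file === 'string' &&
    'line' in value &&
    typeof value.line === 'number'
  );
}

// Parse "path:line", checking both the match and its capture groups
// before using them.
function parseLocation(input: string): Finding | undefined {
  const match = /^(.+):(\d+)$/.exec(input);
  if (!match) return undefined;
  const [, file, lineText] = match;
  if (file === undefined || lineText === undefined) return undefined;
  return { file, line: Number(lineText) };
}
```

Both helpers return a value the caller must check, which keeps the null source visible instead of hiding it behind optional chaining.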

4. Test Coverage Analysis

Maps what is tested against what is critical:

Untested critical paths (core logic, orchestration, error handling)
Edge case gaps (empty inputs, null values, boundaries)
Integration gaps (cross-module flows with only unit tests)
Regression coverage (bug fixes without tests)

5. LLM-Friendliness

How well the code supports AI-assisted development:

JSDoc coverage on exports
Naming clarity (understandable without reading implementation)
Actionable error messages
Configuration footguns

Usage

# Run on entire codebase
warden run --skill architecture-review --schedule

# Or via config

warden.toml

[[skills]]
name = "architecture-review"

[[skills.triggers]]
type = "schedule"
cron = "0 0 * * 0"  # Weekly on Sunday

Output Format

Generates a structured report:

### Executive Summary
- 3-5 bullet points of most impactful findings

### Priority 1: [Category] (High Impact)
**Problem**: What's wrong and why it matters
**Evidence**: Specific files, line numbers, patterns
**Recommendation**: Concrete fix with structure

### What's Working Well
Architectural strengths to preserve

Lessons for Your Skills

Macro over micro: Focus on structural issues, not style preferences
Pre-report checklist: Validates work before reporting
Risk prioritization: Hot paths > edge cases > utilities
Positive feedback: “What’s Working Well” preserves good patterns

testing-guidelines

Purpose: Guide for writing tests. Used when adding functionality or fixing bugs.
Location: .agents/skills/testing-guidelines/SKILL.md

Core Principles

Mock External Services, Use Real Fixtures

Always mock third-party network services. Always use fixtures based on real-world data (sanitized).

Prefer Integration Tests Over Unit Tests

Focus on end-to-end tests validating inputs and outputs, not implementation details.

Always Add Regression Tests for Bugs

When a bug is found, add a test that would have caught it. The test should fail before the fix and pass after it.

Cover Every User Entry Point

Add at least one basic test for each CLI command, API endpoint, and exported function.

Usage

Reference this skill when writing tests:

# As a developer
warden run --skill testing-guidelines --help

# In agent instructions
"When writing tests, follow /testing-guidelines skill"

Lessons for Your Skills

Principle-based: Clear numbered principles, not exhaustive checklists
Concrete examples: Shows fixture format, test structure
Checklist: Pre-submission validation

agent-prompt

Purpose: Reference guide for writing effective agent prompts and skills.
Location: .agents/skills/agent-prompt/SKILL.md

Structure

This skill acts as a router to detailed reference files:

.agents/skills/agent-prompt/
├── SKILL.md                    # Main router
└── references/
    ├── core-principles.md      # Foundational rules
    ├── skill-structure.md      # SKILL.md format
    ├── system-prompts.md       # Architecture guide
    ├── output-formats.md       # Structured JSON
    ├── agentic-patterns.md     # Tool-using agents
    ├── anti-patterns.md        # Common mistakes
    ├── model-guidance.md       # Claude 4.x optimization
    └── context-design.md       # Research notes

Usage

# Get help writing a skill
warden run --skill agent-prompt "How do I structure a security skill?"

# Or use in agent tools
"Read the agent-prompt skill for guidance on prompt engineering"

Lessons for Your Skills

Reference architecture: Main skill routes to specialized docs
Table-based routing: “Read X when doing Y” guides context selection
Bundled resources: References live alongside the skill

Using Builtin Skills

Reference in Your Config

warden.toml

# Use builtin skills directly by name
[[skills]]
name = "find-warden-bugs"

[[skills.triggers]]
type = "pull_request"

Study Their Patterns

The builtin skills demonstrate:

Specificity: Each targets concrete, provable patterns
Structure: Check-based or principle-based organization
Examples: Red flags, safe patterns, not-a-bug sections
Calibration: Confidence thresholds and severity guidance
Historical grounding: Reference past bugs to sharpen detection

Copy and Adapt

Use builtin skills as templates:

# Copy a builtin skill to customize
cp -r .agents/skills/find-warden-bugs .agents/skills/find-myapp-bugs

# Edit to target your codebase patterns
vim .agents/skills/find-myapp-bugs/SKILL.md


Next Steps

Creating Skills

Remote Skills
