Severity & Confidence Thresholds

Severity and confidence thresholds control which findings are shown, when builds fail, and how GitHub PRs are reviewed.

Severity Levels

Findings are classified into three severity levels:

high

Critical issues that should block merging. Examples:

Security vulnerabilities
Data corruption risks
Authentication bypasses
Memory leaks
Logic errors causing incorrect behavior

[defaults]
failOn = "high"  # Block merge on high severity

medium

Issues worth reviewing before merge. Examples:

Code quality problems
Potential bugs
Performance issues
Maintainability concerns
Missing error handling

[defaults]
reportOn = "medium"  # Show medium and high

low

Minor improvements and suggestions. Examples:

Style inconsistencies
Documentation suggestions
Minor optimizations
Refactoring opportunities

[defaults]
reportOn = "low"  # Show all findings

Legacy compatibility: Old findings may use critical (mapped to high) or info (mapped to low). These are automatically normalized.
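The normalization described above can be pictured as a simple lookup. This is a minimal sketch, not Warden's actual implementation: the table name and the `normalize_severity` function are hypothetical, and only the two mappings stated above are assumed.

```python
# Hypothetical alias table; only the two legacy mappings named above are assumed.
LEGACY_SEVERITY_ALIASES = {"critical": "high", "info": "low"}


def normalize_severity(severity: str) -> str:
    """Map a legacy severity name to its current equivalent; pass others through."""
    return LEGACY_SEVERITY_ALIASES.get(severity, severity)


print([normalize_severity(s) for s in ["critical", "high", "medium", "info"]])
# prints ['high', 'high', 'medium', 'low']
```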

Confidence Levels

Confidence indicates how certain the skill is about a finding:

high - Definite issue, very likely correct
medium - Probable issue, may need verification
low - Possible issue, human review recommended

[defaults]
minConfidence = "medium"  # Filter out low confidence findings

Findings without a confidence field are always included (backwards compatibility).
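As a rough illustration of the confidence rule, the check below keeps a finding when its confidence is at or above the threshold, and always keeps findings that carry no confidence value. The function name and the numeric ranking are assumptions made for this sketch; the handling of "off" follows the Filtering Logic section.

```python
from typing import Optional

# Assumed ordering of the three confidence levels, lowest to highest.
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def passes_confidence(confidence: Optional[str], min_confidence: str) -> bool:
    """Return True when a finding survives the minConfidence filter."""
    # Findings without a confidence field are always included.
    if confidence is None or min_confidence == "off":
        return True
    return CONFIDENCE_RANK[confidence] >= CONFIDENCE_RANK[min_confidence]


print(passes_confidence("low", "medium"))   # prints False
print(passes_confidence(None, "high"))      # prints True
```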

Threshold Configuration

failOn

enum

Exit with code 1 when any finding is at or above this severity threshold.
Values: "off", "high", "medium", "low"
Default: Not set (never fails)

Effect:

CLI exits with code 1
GitHub Actions check fails (if failCheck = true)
GitHub review uses REQUEST_CHANGES (if requestChanges = true)

[defaults]
failOn = "high"  # Fail on high severity findings

reportOn

enum

Only show findings at or above this severity level.
Values: "off", "high", "medium", "low"
Default: Shows all findings

[defaults]
reportOn = "medium"  # Hide low severity findings

reportOn is a display filter and does not affect failOn logic. If a low-severity finding triggers failOn = "low", the build still fails even when reportOn = "high" hides that finding from the output.
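The independence of the two settings can be sketched as follows. This is an illustration only: `evaluate` and `meets` are hypothetical names, findings are reduced to bare severity strings, and confidence filtering is left out.

```python
# Assumed ordering of severity levels, lowest to highest.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


def meets(severity: str, threshold: str) -> bool:
    """True when a severity is at or above a threshold; "off" matches nothing."""
    if threshold == "off":
        return False
    return SEVERITY_RANK[severity] >= SEVERITY_RANK[threshold]


def evaluate(severities, fail_on: str, report_on: str):
    """Return (displayed findings, exit code), computed independently."""
    displayed = [s for s in severities if meets(s, report_on)]
    # The fail decision looks at all findings, not only the displayed ones.
    exit_code = 1 if any(meets(s, fail_on) for s in severities) else 0
    return displayed, exit_code


displayed, exit_code = evaluate(["low"], fail_on="low", report_on="high")
print(displayed, exit_code)
# prints [] 1
```

Here nothing is shown, yet the run still exits with code 1, which is the situation the note above warns about.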

minConfidence

enum

Filter out findings below this confidence level.
Values: "off", "high", "medium", "low"
Default: "medium"

[defaults]
minConfidence = "high"  # Only show high confidence findings

Threshold Precedence

Thresholds can be set at three levels:

Trigger level (highest priority)
Skill level
Defaults level (lowest priority)

[defaults]
failOn = "medium"        # Priority 3

[[skills]]
name = "my-skill"
failOn = "high"          # Priority 2 (overrides defaults)

[[skills.triggers]]
type = "pull_request"
actions = ["opened"]
failOn = "low"           # Priority 1 (overrides skill and defaults)
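One way to think about this precedence is "first value that is set wins", checked from trigger to skill to defaults. The helper below is a hypothetical sketch of that rule, not Warden's resolver; `None` stands for a level that does not set the threshold.

```python
from typing import Optional


def resolve_threshold(
    trigger: Optional[str],
    skill: Optional[str],
    defaults: Optional[str],
) -> Optional[str]:
    """Return the highest-priority threshold that is set, or None if unset everywhere."""
    for value in (trigger, skill, defaults):
        if value is not None:
            return value
    return None


# Mirrors the configuration above: the trigger-level value wins.
print(resolve_threshold("low", "high", "medium"))   # prints low
# Without a trigger-level value, the skill-level value applies.
print(resolve_threshold(None, "high", "medium"))    # prints high
```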

Common Configurations

Strict CI (fail on high severity)

Block PRs with high severity issues:

[defaults]
failOn = "high"
reportOn = "medium"
minConfidence = "high"
requestChanges = true
failCheck = true

[[skills]]
name = "security-scanner"
paths = ["src/**/*.ts"]

[[skills.triggers]]
type = "pull_request"
actions = ["opened", "synchronize"]

Lenient CI (informational only)

Show findings but never fail builds:

[defaults]
failOn = "off"           # Never fail
reportOn = "low"         # Show everything
minConfidence = "low"    # Show all confidence levels

[[skills]]
name = "code-quality"
paths = ["src/**/*.ts"]

[[skills.triggers]]
type = "pull_request"
actions = ["opened", "synchronize"]

Progressive Strictness

Different thresholds for different skills:

[defaults]
failOn = "high"
reportOn = "medium"

# Critical security checks
[[skills]]
name = "security-scanner"
paths = ["src/auth/**", "src/payments/**"]
failOn = "high"          # Fail on high
minConfidence = "high"   # High confidence only

[[skills.triggers]]
type = "pull_request"
actions = ["opened", "synchronize"]

# General code quality
[[skills]]
name = "code-quality"
paths = ["src/**/*.ts"]
ignorepaths = ["src/auth/**", "src/payments/**"]
failOn = "medium"        # Also fail on medium severity
minConfidence = "medium"

[[skills.triggers]]
type = "pull_request"
actions = ["opened", "synchronize"]

Different Thresholds by Trigger

Strict on PR, lenient locally:

[[skills]]
name = "my-skill"
paths = ["src/**/*.ts"]

# Strict PR checks
[[skills.triggers]]
type = "pull_request"
actions = ["opened", "synchronize"]
failOn = "high"
minConfidence = "high"
requestChanges = true

# Lenient local checks
[[skills.triggers]]
type = "local"
failOn = "off"
reportOn = "low"
minConfidence = "low"

GitHub Integration

Threshold settings affect GitHub PR behavior:

requestChanges

boolean

Use the REQUEST_CHANGES review event when findings meet or exceed the failOn threshold.
Default: false

[defaults]
failOn = "high"
requestChanges = true  # Block PR with REQUEST_CHANGES

Effect:

GitHub shows "Changes requested" status
If branch protection requires approving reviews, the PR cannot be merged until the changes are addressed
Requires re-review to clear

failCheck

boolean

Fail the GitHub Actions check run when findings meet or exceed the failOn threshold.
Default: false

[defaults]
failOn = "high"
failCheck = true  # Fail CI check

Effect:

GitHub Actions check run shows red ❌
Can block PR merge if required checks are enabled

Behavior Matrix

| failOn | requestChanges | failCheck | Findings | Result |
| --- | --- | --- | --- | --- |
| `"high"` | `true` | `true` | High severity | Request changes + fail check |
| `"high"` | `true` | `false` | High severity | Request changes, check passes |
| `"high"` | `false` | `true` | High severity | Comment review, fail check |
| `"high"` | `false` | `false` | High severity | Comment review, check passes |
| `"off"` | any | any | Any | Never fails |
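The matrix can be read as two independent switches that only take effect once failOn is triggered. The function below is a hedged sketch of that reading: `github_outcome` is a hypothetical name, the outcome is modeled as a (review event, check conclusion) pair, and the behavior when no finding reaches the threshold (comment review, passing check) is an assumption.

```python
# Assumed ordering of severity levels, lowest to highest.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


def github_outcome(fail_on, request_changes, fail_check, severities):
    """Return (review_event, check_conclusion) for one run."""
    # failOn = "off" never triggers, whatever the findings are.
    triggered = fail_on != "off" and any(
        SEVERITY_RANK[s] >= SEVERITY_RANK[fail_on] for s in severities
    )
    review_event = "REQUEST_CHANGES" if triggered and request_changes else "COMMENT"
    check_conclusion = "failure" if triggered and fail_check else "success"
    return review_event, check_conclusion


# Reproduce the four failOn = "high" rows of the matrix.
for request_changes in (True, False):
    for fail_check in (True, False):
        print(request_changes, fail_check,
              github_outcome("high", request_changes, fail_check, ["high"]))
```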

Filtering Logic

Findings are filtered in this order:

1. Severity filter (reportOn)
   - Include findings ≥ reportOn threshold
   - reportOn = "off" excludes all findings
2. Confidence filter (minConfidence)
   - Include findings ≥ minConfidence threshold
   - minConfidence = "off" includes all confidence levels
   - Findings without confidence are always included
3. Limit (maxFindings)
   - Take first N findings after filtering

[defaults]
reportOn = "medium"      # Step 1: Filter to medium+ severity
minConfidence = "high"   # Step 2: Filter to high confidence
maxFindings = 50         # Step 3: Take first 50
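The three steps can be sketched as a small pipeline. This is an illustrative model under stated assumptions: findings are plain dictionaries with `severity` and an optional `confidence` key, and `filter_findings` and the `id` field are hypothetical names used only for this example.

```python
# Assumed ordering, shared by severity and confidence levels.
LEVEL_RANK = {"low": 0, "medium": 1, "high": 2}


def filter_findings(findings, report_on, min_confidence, max_findings=None):
    """Apply the severity filter, then the confidence filter, then the limit."""
    # Step 1: severity filter; "off" excludes everything.
    if report_on == "off":
        kept = []
    else:
        kept = [f for f in findings
                if LEVEL_RANK[f["severity"]] >= LEVEL_RANK[report_on]]
    # Step 2: confidence filter; "off" keeps every confidence level,
    # and findings without a confidence value are always kept.
    if min_confidence != "off":
        kept = [f for f in kept
                if f.get("confidence") is None
                or LEVEL_RANK[f["confidence"]] >= LEVEL_RANK[min_confidence]]
    # Step 3: take the first N findings.
    if max_findings is not None:
        kept = kept[:max_findings]
    return kept


findings = [
    {"id": "a", "severity": "high", "confidence": "high"},
    {"id": "b", "severity": "low", "confidence": "high"},
    {"id": "c", "severity": "medium", "confidence": "low"},
    {"id": "d", "severity": "medium"},  # no confidence field
    {"id": "e", "severity": "high", "confidence": "medium"},
]
result = filter_findings(findings, "medium", "high", 50)
print([f["id"] for f in result])
# prints ['a', 'd']
```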

CLI Override

Command-line flags override configuration:

# Override reportOn threshold
warden --report-on high

# Override failOn threshold
warden --fail-on medium

# Override maxFindings limit
warden --max-findings 100

# Show all findings regardless of config
warden --report-on low --fail-on off
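Conceptually, a flag given on the command line replaces the matching configuration value, while settings without a flag keep their configured value. The merge below is a hypothetical sketch of that idea, with `None` standing for a flag that was not passed; it does not describe how Warden parses its arguments.

```python
def apply_cli_overrides(config: dict, cli_flags: dict) -> dict:
    """Return the effective settings: CLI values win where they were provided."""
    overrides = {key: value for key, value in cli_flags.items() if value is not None}
    return {**config, **overrides}


config = {"reportOn": "medium", "failOn": "high", "maxFindings": 50}
# Equivalent of passing --report-on low --fail-on off and no --max-findings.
cli_flags = {"reportOn": "low", "failOn": "off", "maxFindings": None}

effective = apply_cli_overrides(config, cli_flags)
print(effective)
# prints {'reportOn': 'low', 'failOn': 'off', 'maxFindings': 50}
```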

Examples by Use Case

Security-First

Fail fast on high-severity, high-confidence security issues:

[[skills]]
name = "security-scanner"
paths = ["src/**/*.ts"]
failOn = "high"
minConfidence = "high"
requestChanges = true
failCheck = true

[[skills.triggers]]
type = "pull_request"
actions = ["opened", "synchronize"]

Code Quality Focus

Report quality issues without blocking:

[[skills]]
name = "code-quality"
paths = ["src/**/*.ts"]
failOn = "off"           # Never block
reportOn = "low"         # Show everything
minConfidence = "medium" # Filter low confidence

[[skills.triggers]]
type = "pull_request"
actions = ["opened", "synchronize"]

Progressive Enforcement

Start lenient, tighten over time:

# Phase 1: Informational
[defaults]
failOn = "off"
reportOn = "low"

# Phase 2: Fail on high severity (after team adjusts)
# [defaults]
# failOn = "high"
# reportOn = "medium"

# Phase 3: Fail on medium severity (after code improves)
# [defaults]
# failOn = "medium"
# reportOn = "medium"

High-Confidence Only

Minimize false positives:

[defaults]
minConfidence = "high"   # Only show high confidence
reportOn = "medium"      # Medium+ severity
failOn = "high"          # Fail on high severity

[[skills]]
name = "my-skill"
paths = ["src/**/*.ts"]

[[skills.triggers]]
type = "pull_request"
actions = ["opened", "synchronize"]

Troubleshooting

No findings reported

Check these settings:

reportOn threshold too high:

reportOn = "low"  # Show all findings

minConfidence too high:

minConfidence = "low"  # Show all confidence levels

Use CLI flags to override:

warden --report-on low --min-confidence low

Build failing unexpectedly

Check failOn threshold:

# See which findings triggered failure
warden -v

# Temporarily disable failing
warden --fail-on off

Adjust threshold:

[defaults]
failOn = "high"  # Less strict

Too many low-severity findings

Filter display:

[defaults]
reportOn = "medium"      # Hide low severity
minConfidence = "medium" # Hide low confidence
maxFindings = 20         # Limit output

Best Practices

Start lenient, tighten gradually

# Week 1-2: Informational
failOn = "off"

# Week 3-4: Fail on high
failOn = "high"

# Week 5+: Fail on medium
failOn = "medium"

Different thresholds for different code

# Critical paths: strict
[[skills]]
name = "security"
paths = ["src/auth/**"]
failOn = "high"
minConfidence = "high"

# General code: also fails on medium severity, accepts medium confidence
[[skills]]
name = "quality"
paths = ["src/**"]
failOn = "medium"
minConfidence = "medium"

Use confidence to reduce noise

[defaults]
minConfidence = "medium"  # Filter speculative findings

Limit findings in early adoption

[defaults]
maxFindings = 20  # Prevent overwhelming output

Next Steps

- Skill Configuration: Configure individual skills
- Triggers: Control when skills run
- GitHub Actions: Set up CI/CD integration
- CLI Reference: Command-line options
