Adversarial review is a forced reasoning technique that eliminates superficial “looks good” reviews by requiring the reviewer to find issues.

What is Adversarial Review?

A review technique where the reviewer must find issues. No “looks good” allowed. The reviewer adopts a cynical, skeptical stance - assume problems exist and find them. This isn’t about being negative for its own sake. It’s about forcing genuine analysis instead of a cursory glance that rubber-stamps whatever was submitted.
The core rule: You must find issues. Zero findings triggers a halt - re-analyze or explain why nothing is wrong.
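The halt rule can be sketched as a simple guard. The `Finding` shape below is a hypothetical illustration, not part of any BMad API:

```typescript
// Hypothetical Finding shape; field names are illustrative.
type Severity = 'HIGH' | 'MEDIUM' | 'LOW';

interface Finding {
  severity: Severity;
  location: string; // e.g. "login.ts:47"
  issue: string;
}

// Enforce the core rule: an empty findings list halts the review
// instead of counting as approval.
function enforceFindingsRule(findings: Finding[]): Finding[] {
  if (findings.length === 0) {
    throw new Error(
      'Zero findings: re-analyze the artifact or explain why nothing is wrong.'
    );
  }
  return findings;
}
```

The point is that "no findings" is treated as a failed review state, never as a green light.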

Why Standard Reviews Fail

Normal reviews suffer from predictable cognitive biases:

Confirmation Bias

You skim the work, nothing immediately jumps out, so you approve it. Your brain is looking for reasons to confirm “this is fine” rather than actively seeking problems.

Surface-Level Analysis

Without a forcing function, reviewers take the path of least resistance - checking syntax, formatting, obvious errors. Deeper issues like missing edge cases, security vulnerabilities, or architectural problems go unnoticed.

Authority Bias

If the author is senior or respected, reviewers unconsciously defer to their judgment. “They probably thought of this” becomes a reason to skip critical thinking.

Time Pressure

Reviews compete with other work. The easiest way to clear your queue is quick approval. Thorough analysis takes time most people don’t allocate.

How Adversarial Review Works

The “find problems” mandate breaks these patterns:

Forces Thoroughness

You can’t approve until you’ve looked hard enough to find issues. This naturally extends review time and deepens analysis.

Shifts the Question

Instead of “Is anything obviously broken?” you ask “What’s wrong with this?” and “What’s missing?” These questions surface different insights.

Catches Absence

Normal reviews focus on what’s present. Adversarial review asks “What should be here but isn’t?” - error handling, edge cases, validation, documentation, tests.

Improves Signal Quality

Findings are specific and actionable, not vague concerns. “This might have issues” becomes “Line 47: No rate limiting on failed login attempts.”

Information Asymmetry Advantage

Best results come from reviewing the artifact without access to original reasoning. You evaluate what’s actually there, not what the author intended.

Where It’s Used

Adversarial review appears throughout BMad workflows:
  • Code review - Find bugs, security issues, performance problems
  • Implementation readiness checks - Validate specs before building
  • Spec validation - Find gaps, contradictions, ambiguities
  • Architecture review - Surface conflicts, missing decisions
  • Test coverage - Identify untested scenarios
  • Documentation review - Find unclear explanations, missing info
Sometimes it’s a required step, sometimes optional (like advanced elicitation or party mode). The pattern adapts to whatever artifact needs scrutiny.

Example: Before and After

Standard Review

“The authentication implementation looks reasonable. Token-based auth with session management. Approved.”
Problems: Misses security vulnerabilities, doesn’t check edge cases, provides no specific feedback.

Adversarial Review

Issues Found:
  1. HIGH - login.ts:47 - No rate limiting on failed login attempts (enables brute force attacks)
  2. HIGH - auth.ts:123 - Session token stored in localStorage (vulnerable to XSS attacks, should use httpOnly cookies)
  3. HIGH - password.ts:89 - Password validation happens client-side only (can be bypassed)
  4. MEDIUM - login.ts:52 - No audit logging for failed login attempts (can’t detect attack patterns)
  5. MEDIUM - session.ts:34 - Session timeout not implemented (sessions never expire)
  6. MEDIUM - Missing: No account lockout after repeated failures
  7. LOW - auth.ts:145 - Magic number 3600 should be named constant SESSION_TIMEOUT_SECONDS
  8. LOW - login.ts:23 - Error messages reveal whether username exists (enables user enumeration)
Result: The first review would have shipped with multiple security vulnerabilities. The second caught eight issues, three of them high severity.

Human Filtering Required

Because the AI is instructed to find problems, it will find problems - even when they don’t actually exist.
Expect false positives: nitpicks dressed as issues, misunderstandings of intent, or outright hallucinated concerns. Don’t blindly accept all findings.

Types of False Positives

Misunderstood Context

“HIGH - No error handling for database connection”
Actual: Error handling exists in the connection pool layer; the reviewer didn’t see it.

Nitpicking

“MEDIUM - Variable name usr should be spelled out as user”
Actual: Minor style preference, not a real issue.

Hallucinated Problems

“HIGH - Function doesn’t validate email format”
Actual: Email validation happens at the schema level; the function correctly assumes input is pre-validated.

Over-Engineering

“MEDIUM - Should add caching layer for performance”
Actual: Premature optimization; current performance is fine.

Your Role

You decide what’s real. Review each finding:
  • Dismiss - False positive or nitpick
  • Fix - Real issue that matters
  • Note - Valid point but not worth addressing now
  • Investigate - Uncertain, need to verify
The value isn’t in accepting every finding. The value is in being forced to think through each one, which surfaces real issues you would have missed.
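One way to make this triage concrete is to bucket findings by decision. The types and helper below are an illustrative sketch, not part of any workflow API:

```typescript
type Decision = 'dismiss' | 'fix' | 'note' | 'investigate';

interface Finding {
  severity: 'HIGH' | 'MEDIUM' | 'LOW';
  issue: string;
}

// Group findings by the human's triage decision so real fixes and
// follow-ups can be tracked separately from dismissed noise.
function triage(
  findings: Finding[],
  decide: (f: Finding) => Decision
): Record<Decision, Finding[]> {
  const buckets: Record<Decision, Finding[]> = {
    dismiss: [], fix: [], note: [], investigate: [],
  };
  for (const f of findings) {
    buckets[decide(f)].push(f);
  }
  return buckets;
}
```

The `decide` callback is the human judgment call; the structure just keeps the four outcomes from blurring together.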

Iteration and Diminishing Returns

After addressing findings, consider running adversarial review again:

First Pass

Catches obvious issues, missing pieces, common problems. Highest ROI.

Second Pass

Catches subtler issues that the first review missed or that were introduced by fixes. Still valuable.

Third Pass

Might catch a few more things, but increasingly dominated by false positives and nitpicks.

Fourth+ Pass

Diminishing returns. You’re mostly generating noise at this point.
Sweet spot: Two passes for critical code/specs, one pass for normal work. Stop when findings become mostly nitpicks.
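A stopping rule along these lines might look like the following sketch. The 50% nitpick threshold is an assumption chosen for illustration, not a prescribed value:

```typescript
interface Finding {
  severity: 'HIGH' | 'MEDIUM' | 'LOW';
}

// Decide whether another adversarial pass is worth running.
// Caps: two passes for critical work, one for normal work;
// stop early once findings are mostly LOW-severity nitpicks.
function shouldRunAnotherPass(
  passesCompleted: number,
  lastPassFindings: Finding[],
  critical: boolean
): boolean {
  const maxPasses = critical ? 2 : 1;
  if (passesCompleted >= maxPasses) return false;
  if (lastPassFindings.length === 0) return false;
  const lowCount = lastPassFindings.filter((f) => f.severity === 'LOW').length;
  return lowCount / lastPassFindings.length < 0.5; // assumed nitpick threshold
}
```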

Best Practices

For Maximum Effectiveness

  • Use information asymmetry - Don’t give the reviewer access to original reasoning, design docs, or discussions. Evaluate only what’s in the artifact.
  • Review with fresh eyes - Wait a day before adversarial review of your own work. Mental distance helps you spot issues.
  • Focus on high-impact areas - Apply adversarial review to security, data handling, public APIs, critical business logic. Less critical for internal utilities.
  • Combine with other techniques - Use adversarial review, then apply advanced elicitation (pre-mortem analysis) to findings.
  • Document patterns - Track common issues to improve future work and update templates/checklists.

Common Mistakes to Avoid

  • Accepting everything - You’ll fix non-issues and waste time. Filter ruthlessly.
  • Dismissing everything - Defensiveness prevents learning. Consider each finding honestly.
  • Infinite iteration - Know when to stop. Diminishing returns kick in fast.
  • Wrong severity levels - Reviewer might mark nitpicks as HIGH. Recalibrate based on actual impact.
  • No action tracking - Document real issues and track fixes. Don’t let valid findings get lost.

Integration with Workflows

In implement

After implementation, before marking complete:
Workflow: Implementation complete. Run adversarial code review?
You: Yes

Workflow: [Reviews code with cynical lens]

Workflow: Found 6 issues:
1. HIGH - Missing null check on user input
2. MEDIUM - No error logging
...

Address these before completing?

In prd-co-write

Validate requirements before finalizing:
Workflow: PRD draft complete. Run adversarial review?
You: Yes

Workflow: Found 8 gaps:
1. HIGH - User authentication not specified
2. HIGH - No error handling requirements
3. MEDIUM - Mobile responsiveness unclear
...

In plan-build

Validate architecture decisions:
Workflow: Architecture complete. Run adversarial review?
You: Yes

Workflow: Found 5 concerns:
1. HIGH - No decision on state management
2. MEDIUM - API versioning strategy missing
...

Measuring Success

How do you know adversarial review is working? Good indicators:
  • Finding 3-8 real issues per review (right range, not too few or too many)
  • Mix of severity levels (not all nitpicks, not all critical)
  • Issues you genuinely didn’t notice before
  • Improved quality in subsequent work (learning from patterns)
  • Reduced production bugs (catching issues earlier)
Warning signs:
  • Consistently finding 0-1 issues (not being thorough enough)
  • Finding 20+ issues (reviewer is nitpicking or hallucinating)
  • All findings are LOW severity (missing real problems)
  • Same issues appearing repeatedly (not learning from feedback)
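The count- and severity-based signs above can be checked mechanically. The function below mirrors those ranges; it is only a sketch:

```typescript
type Severity = 'HIGH' | 'MEDIUM' | 'LOW';

// Flag warning signs from a review's findings list, using the ranges
// described above (0-1 too few, 20+ too many, all-LOW suspect).
function reviewWarnings(findings: { severity: Severity }[]): string[] {
  const warnings: string[] = [];
  if (findings.length <= 1) {
    warnings.push('0-1 findings: review may not be thorough enough.');
  }
  if (findings.length >= 20) {
    warnings.push('20+ findings: reviewer may be nitpicking or hallucinating.');
  }
  if (findings.length > 0 && findings.every((f) => f.severity === 'LOW')) {
    warnings.push('All findings LOW: real problems may be missed.');
  }
  return warnings;
}
```

The remaining signs (repeated issues, learning over time) need history across reviews and can’t be judged from a single findings list.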

Advanced Techniques

Role-Based Review

Review from different perspectives:
  • Security reviewer - Find vulnerabilities and attack vectors
  • Performance reviewer - Identify bottlenecks and inefficiencies
  • Maintainability reviewer - Spot complexity and technical debt
  • User experience reviewer - Find usability and accessibility issues
Each role surfaces different classes of problems.
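One lightweight way to run role-based review is to template the reviewer prompt per role. The mapping and prompt wording below are illustrative assumptions, not official workflow prompts:

```typescript
// Illustrative role-to-focus mapping; wording is an assumption.
const reviewRoles: Record<string, string> = {
  security: 'Find vulnerabilities and attack vectors.',
  performance: 'Identify bottlenecks and inefficiencies.',
  maintainability: 'Spot complexity and technical debt.',
  ux: 'Find usability and accessibility issues.',
};

// Build one adversarial prompt per role so each pass over the
// artifact surfaces a different class of problems.
function buildRolePrompt(role: string, artifact: string): string {
  const focus = reviewRoles[role] ?? 'Find issues of any kind.';
  return [
    `You are a ${role} reviewer. ${focus}`,
    'You must find issues; "looks good" is not an acceptable answer.',
    '',
    artifact,
  ].join('\n');
}
```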

Comparative Review

Review against alternatives:
  • Best practices - Does this follow industry standards?
  • Similar implementations - How does this compare to existing code?
  • Competitor solutions - What are we missing that others have?

Constraint-Based Review

Review assuming specific constraints:
  • What if traffic grows 10x overnight?
  • What if this needs to support 100 languages?
  • What if the database is unavailable?
  • What if malicious users attack this endpoint?
These hypotheticals surface missing resilience and scalability considerations.
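The same templating idea works for constraints: prepend each hypothetical to a dedicated review pass. The constraint list and wording here are illustrative:

```typescript
// Illustrative constraint prompts for resilience/scale review.
const constraints: string[] = [
  'Assume traffic grows 10x overnight.',
  'Assume this must support 100 languages.',
  'Assume the database is unavailable.',
  'Assume malicious users attack this endpoint.',
];

// Produce one adversarial-review prompt per constraint so each
// hypothetical gets a dedicated pass over the artifact.
function constraintPrompts(artifact: string): string[] {
  return constraints.map(
    (c) => `${c} Under that assumption, find what breaks in:\n${artifact}`
  );
}
```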

Real-World Example

Here’s an adversarial review of an API endpoint implementation:
// Original Implementation
async function createUser(req: Request, res: Response) {
  const { email, password, name } = req.body;
  const user = await db.users.create({ email, password, name });
  res.json({ userId: user.id });
}
Adversarial Review Findings:
  1. HIGH - No input validation on email, password, or name
  2. HIGH - Password stored in plaintext (should be hashed)
  3. HIGH - No authentication check (anyone can call this)
  4. HIGH - No duplicate email check (allows multiple accounts)
  5. MEDIUM - No error handling for database failures
  6. MEDIUM - Returns 200 even on failure
  7. MEDIUM - No audit logging of user creation
  8. LOW - Response includes only userId (might want user object)
After Fixes:
import { Request, Response } from 'express';
import bcrypt from 'bcrypt';

async function createUser(req: Request, res: Response) {
  try {
    // Input validation
    const { email, password, name } = validateUserInput(req.body);
    
    // Check for duplicate
    const existing = await db.users.findByEmail(email);
    if (existing) {
      return res.status(409).json({ error: 'Email already exists' });
    }
    
    // Hash password
    const hashedPassword = await bcrypt.hash(password, 10);
    
    // Create user
    const user = await db.users.create({
      email,
      password: hashedPassword,
      name
    });
    
    // Audit log
    await auditLog.record('user_created', { userId: user.id });
    
    res.status(201).json({ 
      userId: user.id,
      email: user.email,
      name: user.name 
    });
  } catch (error) {
    // Note: email is scoped to the try block and unavailable here
    logger.error('User creation failed', { error });
    res.status(500).json({ error: 'Failed to create user' });
  }
}
Adversarial review transformed a dangerous implementation into production-ready code.
Remember: Assume problems exist. Look for what’s missing, not just what’s wrong. Filter findings through your judgment, but take each seriously enough to think through.
