What are Accessibility Snapshots?

Accessibility snapshots are structured representations of web page content based on the browser’s accessibility tree. Instead of sending screenshots to LLMs, Playwright MCP captures the semantic structure of the page in a text format that includes:
  • Element roles (button, link, heading, etc.)
  • Accessible names (the labels exposed to assistive technology)
  • Element states (checked, disabled, expanded)
  • Text content
  • Meaningful content only (purely decorative elements are omitted)

Why Snapshots Beat Screenshots

Playwright MCP takes an accessibility-first approach to page representation:

Screenshots

Visual pixel data
  ✗ Large file sizes (100KB - 5MB)
  ✗ Requires vision models
  ✗ Imprecise coordinate guessing
  ✗ Slow to process
  ✗ Affected by layout changes

Accessibility Snapshots

Structured text data
  ✓ Small size (5KB - 50KB)
  ✓ Works with standard LLMs
  ✓ Precise element references
  ✓ Fast to process
  ✓ Layout-independent

Token Efficiency

Accessibility snapshots are dramatically more efficient:
Screenshot: ~10,000-50,000 tokens (with vision model)
Snapshot: ~500-5,000 tokens (text-only)

That is roughly a 10x-100x reduction in context usage, depending on the page.
This token efficiency is why Playwright MCP is described as “Fast and lightweight”: it uses structured data instead of pixel-based input.
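
The reduction factor follows from the ranges quoted above. As a quick sketch of the arithmetic (the helper name `reductionRange` is made up for illustration, and the token counts are this page's approximate figures, not measurements):

```javascript
// Approximate token ranges quoted above (not measured values).
const screenshotTokens = { min: 10000, max: 50000 };
const snapshotTokens = { min: 500, max: 5000 };

// Hypothetical helper: worst- and best-case ratio between two ranges.
function reductionRange(large, small) {
  return {
    worst: large.min / small.max, // smallest screenshot vs. largest snapshot
    best: large.max / small.min,  // largest screenshot vs. smallest snapshot
  };
}

const range = reductionRange(screenshotTokens, snapshotTokens);
console.log(range); // { worst: 2, best: 100 }
```

Comparing like with like (low end against low end, high end against high end) gives 20x and 10x, so the savings for a given page are likely to sit between the two extremes.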

Snapshot Structure

Here’s an example of what an accessibility snapshot looks like:
# Page Snapshot

URL: https://example.com/login
Title: Login - Example Site

## Content

- heading "Login" [level=1]
- form "Sign in" [ref=1]
  - textbox "Email" [ref=2]
  - textbox "Password" [type=password] [ref=3]
  - checkbox "Remember me" [ref=4]
  - button "Sign In" [ref=5]
  - link "Forgot password?" [ref=6]
- text: "Don't have an account?"
- link "Create account" [ref=7]
- text: "© 2024 Example Inc."

Key Components

Element Roles
The accessibility role describes what an element is:
  • button - Clickable button
  • link - Hyperlink
  • textbox - Text input field
  • checkbox, radio - Selection controls
  • heading - Heading (with level attribute)
  • list, listitem - Lists
  • menu, menuitem - Menus
Accessible Names
The quoted text after the role is the element’s accessible name (what screen readers announce):
- button "Submit Form"    ← Accessible name
- link "Learn More"       ← Accessible name
- textbox "Search"        ← Accessible name (from label/placeholder)
Attributes
Additional attributes appear in square brackets:
- checkbox "Terms" [checked] [ref=8]
- button "Submit" [disabled] [ref=9]
- heading "Welcome" [level=2] [ref=10]
- textbox "Password" [type=password] [ref=11]
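
To make the line format concrete, here is a minimal sketch that splits one snapshot line into its role, accessible name, and attributes. The function `parseSnapshotLine` is hypothetical, and the grammar is inferred only from the examples on this page, not from a published specification, so treat it as an illustration of the structure rather than a parser to depend on.

```javascript
// Illustrative only: splits one snapshot line into role, name, and attributes.
// The line grammar is inferred from the examples above, not from a spec.
function parseSnapshotLine(line) {
  const match = /^\s*-\s+([a-z]+)(?:\s+"([^"]*)")?((?:\s+\[[^\]]+\])*)\s*$/.exec(line);
  if (!match) return null; // e.g. `- text: "..."` lines do not fit this pattern

  const attributes = {};
  const attrPattern = /\[([^\]=]+)(?:=([^\]]*))?\]/g;
  let attr;
  while ((attr = attrPattern.exec(match[3])) !== null) {
    // Flags such as [checked] carry no value and are stored as true.
    attributes[attr[1]] = attr[2] === undefined ? true : attr[2];
  }
  return {
    role: match[1],
    name: match[2] === undefined ? null : match[2],
    attributes,
  };
}

const parsed = parseSnapshotLine('- checkbox "Terms" [checked] [ref=8]');
console.log(JSON.stringify(parsed));
// {"role":"checkbox","name":"Terms","attributes":{"checked":true,"ref":"8"}}
```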
Hierarchical Structure
Indentation shows how elements nest in the accessibility tree:
- navigation [ref=1]
  - link "Home" [ref=2]
  - link "Products" [ref=3]
    - menu [ref=4]
      - menuitem "Software" [ref=5]
      - menuitem "Hardware" [ref=6]
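
The nesting can be recovered from the indentation alone. The sketch below rebuilds a parent/child tree, assuming two spaces per level as in the example above; `buildTree` and the node shape are illustrative names, not part of Playwright MCP.

```javascript
// Illustrative sketch: rebuild the parent/child tree from two-space indentation.
// Assumes each line starts with "- " after its indentation, as in the examples.
function buildTree(snapshotText) {
  const root = { label: "(root)", children: [] };
  const stack = [{ depth: -1, node: root }];

  for (const rawLine of snapshotText.split("\n")) {
    if (rawLine.trim() === "") continue;
    const indent = rawLine.length - rawLine.trimStart().length;
    const depth = indent / 2;
    const node = { label: rawLine.trim().replace(/^- /, ""), children: [] };

    // Pop back up until the top of the stack is this line's parent.
    while (stack[stack.length - 1].depth >= depth) stack.pop();
    stack[stack.length - 1].node.children.push(node);
    stack.push({ depth, node });
  }
  return root;
}

const tree = buildTree([
  "- navigation [ref=1]",
  '  - link "Home" [ref=2]',
  '  - link "Products" [ref=3]',
  "    - menu [ref=4]",
].join("\n"));

console.log(tree.children[0].children.length); // 2
```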

Element References (ref parameter)

The [ref=N] attribute is the most important part of a snapshot: it gives each element a reference that tool calls use to target it.

How References Work

Each interactive element in the snapshot gets a unique reference number:
- button "Add to Cart" [ref=12]
- link "Product Details" [ref=13]
- textbox "Quantity" [ref=14]
These references are used in tool calls to target specific elements:
// Click the "Add to Cart" button
browser_click({
  element: "Add to Cart button",
  ref: "12"
})

// Type into the quantity field
browser_type({
  element: "Quantity textbox",
  ref: "14",
  text: "3"
})
The element parameter is a human-readable description for permission prompts. The ref parameter is the exact element reference from the snapshot.
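
As a rough sketch of how the two parameters fit together, the helpers below look up a ref by role and accessible name in snapshot text and assemble the arguments for a click. `findRef` and `clickArgs` are hypothetical names introduced here; only the `element` and `ref` parameters come from the examples above.

```javascript
// Illustrative sketch: look up a ref by role and accessible name, then build
// the arguments for a tool call. findRef and clickArgs are hypothetical helpers.
function findRef(snapshotText, role, name) {
  const wanted = `- ${role} "${name}"`;
  for (const line of snapshotText.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith(wanted)) continue;
    const ref = /\[ref=([^\]]+)\]/.exec(trimmed);
    if (ref) return ref[1];
  }
  return null;
}

function clickArgs(snapshotText, role, name) {
  const ref = findRef(snapshotText, role, name);
  if (ref === null) throw new Error(`No ${role} named "${name}" in snapshot`);
  // element is the human-readable description; ref is the exact reference.
  return { element: `${name} ${role}`, ref };
}

const snapshot = [
  '- button "Add to Cart" [ref=12]',
  '- link "Product Details" [ref=13]',
  '- textbox "Quantity" [ref=14]',
].join("\n");

console.log(JSON.stringify(clickArgs(snapshot, "button", "Add to Cart")));
// {"element":"Add to Cart button","ref":"12"}
```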

Why References are Better Than Selectors

Traditional selector approach:
// Fragile - breaks if class names or structure changes
click('#add-to-cart-btn')
click('button.btn-primary.btn-large')
click('div.container > button:nth-child(2)')
Playwright MCP reference approach:
// Stable - based on accessibility tree
browser_click({ element: "Add to Cart button", ref: "12" })
References are:
  • Stable within a snapshot - Based on semantic structure, not implementation details
  • Resilient - Unaffected by CSS class or layout changes
  • Unambiguous - Exact element targeting
  • Fast - Direct element lookup

Reference Lifetime

References are valid only for the snapshot that produced them. After page navigation or major DOM changes, take a new snapshot; earlier references may no longer be valid.
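
One way to guard against stale references is to track only the refs seen in the most recent snapshot. The sketch below assumes that approach; `RefRegistry` and its methods are hypothetical and not part of Playwright MCP.

```javascript
// Illustrative sketch: keep refs only from the latest snapshot, so a stale
// ref is rejected instead of silently targeting the wrong element.
// RefRegistry is a hypothetical helper, not part of Playwright MCP.
class RefRegistry {
  constructor() {
    this.refs = new Set();
  }

  // Replace all known refs with those found in the new snapshot text.
  update(snapshotText) {
    this.refs = new Set();
    const pattern = /\[ref=([^\]]+)\]/g;
    let match;
    while ((match = pattern.exec(snapshotText)) !== null) {
      this.refs.add(match[1]);
    }
  }

  // Return the ref if it is current; throw if it came from an older snapshot.
  assertFresh(ref) {
    if (!this.refs.has(ref)) {
      throw new Error(`Ref ${ref} is not in the latest snapshot; take a new one`);
    }
    return ref;
  }
}

const registry = new RefRegistry();
registry.update('- button "Sign In" [ref=3]');
registry.assertFresh("3"); // fine: ref 3 is in the current snapshot

registry.update('- button "Logout" [ref=6]'); // new snapshot after navigation
// registry.assertFresh("3") would now throw
```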

Snapshot Modes

Playwright MCP supports three snapshot modes:

Incremental Mode (Default)

Only returns changes since the last snapshot:
# Incremental Snapshot

Changed elements:
- button "Submit" [disabled] [ref=5]  ← State changed
+ alert "Form submitted successfully!" [ref=8]  ← New element
- progress "Uploading" [value=100] [ref=7]  ← Updated
Benefits:
  • Minimal token usage
  • Fast processing
  • Shows what changed
Use when:
  • Performing multi-step workflows
  • Making incremental changes
  • Token efficiency matters
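
To give a feel for what "changes since the last snapshot" means, here is a toy diff that matches lines by ref and reports the ones that are new or different. This is only an illustration of the idea; it is not how Playwright MCP computes its incremental output, and `indexByRef` and `changedLines` are made-up names.

```javascript
// Illustrative sketch: report lines that are new or changed between two
// snapshots, matched by ref. Not how Playwright MCP computes its diff.
function indexByRef(snapshotText) {
  const byRef = {};
  for (const line of snapshotText.split("\n")) {
    const ref = /\[ref=([^\]]+)\]/.exec(line);
    if (ref) byRef[ref[1]] = line.trim();
  }
  return byRef;
}

function changedLines(previousText, currentText) {
  const before = indexByRef(previousText);
  const after = indexByRef(currentText);
  const changes = [];
  for (const ref of Object.keys(after)) {
    if (!(ref in before)) {
      changes.push({ kind: "added", line: after[ref] });
    } else if (before[ref] !== after[ref]) {
      changes.push({ kind: "changed", line: after[ref] });
    }
  }
  return changes;
}

const previous = '- button "Submit" [ref=5]';
const current = [
  '- button "Submit" [disabled] [ref=5]',
  '- alert "Form submitted successfully!" [ref=8]',
].join("\n");

console.log(changedLines(previous, current).map((c) => c.kind).join(","));
// changed,added
```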

Full Mode

Returns the complete accessibility tree every time:
# Full Snapshot

URL: https://example.com/page
Title: Current Page

## Content

[Complete accessibility tree...]
Benefits:
  • Complete context
  • No dependency on previous state
  • Easier to reason about
Use when:
  • Starting new task sequences
  • Debugging issues
  • Context is more important than tokens
Enable full mode:
npx @playwright/mcp@latest --snapshot-mode=full

None Mode

Disables automatic snapshots:
npx @playwright/mcp@latest --snapshot-mode=none
Use when:
  • You only need specific data (not full page structure)
  • Building custom tools
  • Optimizing for specific workflows
You can still request snapshots explicitly using browser_snapshot().
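
If the server is launched from an MCP client configuration rather than a shell, the same flag goes into the argument list. The sketch below builds such an entry; the `mcpServers` shape follows the common MCP client convention, the value `incremental` for the default mode is assumed from the mode's name, and `playwrightServerConfig` is a hypothetical helper. Check your client's documentation for the exact file and keys.

```javascript
// Illustrative sketch: build an MCP client config entry for a snapshot mode.
// The mode names mirror the three modes described above; "incremental" as a
// flag value is an assumption based on the default mode's name.
function playwrightServerConfig(snapshotMode) {
  const allowed = ["incremental", "full", "none"];
  if (allowed.indexOf(snapshotMode) === -1) {
    throw new Error(`Unknown snapshot mode: ${snapshotMode}`);
  }
  return {
    mcpServers: {
      playwright: {
        command: "npx",
        args: ["@playwright/mcp@latest", `--snapshot-mode=${snapshotMode}`],
      },
    },
  };
}

// Print the entry as JSON, ready to paste into a client config file.
console.log(JSON.stringify(playwrightServerConfig("full"), null, 2));
```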

Example Snapshot Workflow

Here’s how an LLM uses snapshots to fill out a login form:
Step 1: Get initial snapshot
browser_navigate({ url: "https://example.com/login" })

// Returns snapshot:
// - textbox "Email" [ref=1]
// - textbox "Password" [type=password] [ref=2]
// - button "Sign In" [ref=3]
Step 2: Fill email field
browser_type({
  element: "Email textbox",
  ref: "1",
  text: "user@example.com"
})

// Returns incremental snapshot showing filled state:
// - textbox "Email" [value="user@example.com"] [ref=1]
Step 3: Fill password field
browser_type({
  element: "Password textbox",
  ref: "2",
  text: "secretpassword",
  submit: false
})
Step 4: Click sign in
browser_click({
  element: "Sign In button",
  ref: "3"
})

// Returns new snapshot after navigation:
// - heading "Welcome, User!" [level=1]
// - link "Dashboard" [ref=5]
// - button "Logout" [ref=6]
The LLM uses accessible names (“Email”, “Password”, “Sign In”) to understand what elements do, but uses refs (1, 2, 3) to interact with them precisely.
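
The four steps can be strung together as one sequence. In the sketch below, `callTool` is a stand-in that only records calls (a real agent would send them through its MCP client), `login` is a hypothetical helper, and the email address is a placeholder.

```javascript
// Illustrative sketch of the four steps above. callTool is a stand-in that
// only records calls; a real agent would send them through its MCP client.
const calls = [];
function callTool(name, args) {
  calls.push({ name, args });
}

// Hypothetical helper: fill both fields, then click the submit button.
function login(refs, email, password) {
  callTool("browser_type", { element: "Email textbox", ref: refs.email, text: email });
  callTool("browser_type", {
    element: "Password textbox",
    ref: refs.password,
    text: password,
    submit: false,
  });
  callTool("browser_click", { element: "Sign In button", ref: refs.signIn });
}

callTool("browser_navigate", { url: "https://example.com/login" });
// Refs as read from the snapshot returned in Step 1.
login({ email: "1", password: "2", signIn: "3" }, "user@example.com", "secretpassword");

console.log(calls.map((c) => c.name).join(" -> "));
// browser_navigate -> browser_type -> browser_type -> browser_click
```

Keeping the refs in a single object makes it obvious which snapshot they came from and easy to replace them all at once when a new snapshot arrives.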

Snapshot Filtering

Accessibility snapshots automatically filter out:
  • Decorative elements - Images without alt text, spacers, dividers
  • Hidden elements - display: none, visibility: hidden
  • Aria-hidden elements - Explicitly hidden from screen readers
  • Presentation roles - Elements marked as presentational
  • Redundant text - Duplicate content, whitespace
This keeps snapshots focused on interactive, meaningful content.
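
The rules above amount to a predicate over nodes. Here is a simplified sketch applied to a flat list; the node shape (`role`, `name`, `hidden`, `ariaHidden`) and the function `isIncluded` are hypothetical, since the real filtering happens when the browser's accessibility tree is captured and not over objects like these.

```javascript
// Illustrative sketch of the filtering rules above on simplified node objects.
// The node shape is hypothetical; real filtering happens during tree capture.
function isIncluded(node) {
  if (node.hidden || node.ariaHidden) return false;  // hidden or aria-hidden
  if (node.role === "presentation" || node.role === "none") return false; // presentational
  if (node.role === "img" && !node.name) return false; // image without alt text
  return true;
}

const nodes = [
  { role: "button", name: "Submit" },
  { role: "img", name: "" },                              // decorative image
  { role: "link", name: "Help", hidden: true },           // display: none
  { role: "presentation", name: "" },                     // layout-only element
  { role: "heading", name: "Welcome", ariaHidden: true }, // aria-hidden="true"
];

const kept = nodes.filter(isIncluded).map((n) => `${n.role} "${n.name}"`);
console.log(kept.join(", ")); // button "Submit"
```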

Combining with Screenshots

While snapshots are preferred, screenshots can be useful for:
  • Visual verification
  • Debugging layout issues
  • Capturing images and graphics
  • Complex visual elements
// Take snapshot for interaction
browser_snapshot()

// Take screenshot for visual verification
browser_take_screenshot({
  filename: "verification.png"
})
You cannot perform actions based on screenshots. Always use browser_snapshot() to get element references for interaction.

Advanced: Snapshot Optimization

Test ID Attributes

Use test IDs for more stable element targeting. First, mark the element in your HTML:
<button data-testid="submit-form">Submit</button>
Then tell Playwright MCP which attribute holds the test ID:
npx @playwright/mcp@latest --test-id-attribute=data-testid
Elements with test IDs get priority in snapshots.

Snapshot to File

Save snapshots to files instead of returning inline:
browser_snapshot({
  filename: "page-state.md"
})
This reduces context usage when you don’t need the snapshot in the response.

Console and Network Logs

Capture additional context:
// Get console messages
browser_console_messages({
  level: "error",
  filename: "console-errors.txt"
})

// Get network requests
browser_network_requests({
  includeStatic: false,
  filename: "api-requests.txt"
})

Best Practices

Do

  • Use snapshots for all interactions
  • Store refs from latest snapshot
  • Use incremental mode for efficiency
  • Request new snapshot after navigation
  • Use descriptive element names

Don't

  • Don’t try to interact based on screenshots
  • Don’t reuse refs from old snapshots
  • Don’t parse snapshots manually
  • Don’t request snapshots unnecessarily
  • Don’t ignore accessibility attributes

Next Steps

Tool Reference

Explore all available browser automation tools

Browser Automation

Learn about browser automation concepts