Model Context Protocol Overview

What is MCP?

The Model Context Protocol (MCP) is a standardized protocol that lets Large Language Models (LLMs) interact with external systems through a structured client-server architecture. It gives AI agents a consistent way to access tools, resources, and capabilities beyond their training data. Playwright MCP is a server implementation that exposes browser automation through MCP, so LLMs can interact with web pages using structured data rather than visual interpretation.

Client-Server Architecture

Playwright MCP follows the standard MCP architecture pattern:

MCP Client

The AI agent or IDE that sends requests to the server. Examples include Claude Desktop, VS Code, Cursor, and other MCP-compatible tools.

MCP Server

The Playwright MCP server that receives requests, executes browser automation, and returns structured responses.

Communication Flow
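
As a rough illustration of one step in this flow: MCP messages are JSON-RPC 2.0, and a client invokes a server tool with a tools/call request. The sketch below builds such a request. The helper name buildToolCall is hypothetical, and the browser_navigate tool name with its url argument is used for illustration only; check the Tools Reference for the exact tool schemas.

```typescript
// JSON-RPC 2.0 request shape that MCP uses to invoke a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: builds the request a client would send to the server.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Example: ask the server to open a page (tool name assumed for illustration).
const navigateRequest = buildToolCall(1, "browser_navigate", {
  url: "https://example.com",
});
```

The server executes the browser action and replies with a JSON-RPC response carrying the same id, which is how the client matches results to requests.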

Transport Protocols

Playwright MCP supports multiple transport mechanisms:

STDIO Transport (Default)

The most common mode, where the server communicates with the client through standard input/output streams:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}

SSE Transport (HTTP Mode)

For remote deployments or when running the server separately, start the server on a port and point the client at its URL:

npx @playwright/mcp@latest --port 8931

{
  "mcpServers": {
    "playwright": {
      "url": "http://localhost:8931/mcp"
    }
  }
}

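SSE (Server-Sent Events) transport is useful when the client cannot launch a headed browser itself, for example on a system without a display or when the IDE worker process can't access the display. In those cases, run the server separately in an environment that has display access and connect to it by URL.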
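
The two client entries above differ only in how the client reaches the server: one spawns a process, the other connects to a URL. A minimal TypeScript sketch of that distinction follows; the type and function names are hypothetical and not part of any MCP client's API.

```typescript
// A client entry either launches the server as a child process (STDIO)...
interface StdioServerEntry {
  command: string;
  args: string[];
}

// ...or connects to a server that is already running (HTTP).
interface HttpServerEntry {
  url: string;
}

type ServerEntry = StdioServerEntry | HttpServerEntry;

// Hypothetical helper: report which transport an entry implies.
function transportOf(entry: ServerEntry): "stdio" | "http" {
  return "url" in entry ? "http" : "stdio";
}

const local: ServerEntry = { command: "npx", args: ["@playwright/mcp@latest"] };
const remote: ServerEntry = { url: "http://localhost:8931/mcp" };
```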

Tool Capabilities System

Playwright MCP uses a capability-based system to control which tools are available. This allows you to enable only the features you need:

Core Capabilities

Core (Default)

Page navigation and snapshots
Element interactions (click, type, hover)
Form handling
Dialog handling
Keyboard and mouse actions

PDF

PDF generation from web pages
Requires --caps=pdf flag

Vision

Coordinate-based interactions
XY position clicking and dragging
Requires --caps=vision flag

DevTools

Browser developer tools access
Advanced debugging capabilities
Requires --caps=devtools flag

Enabling Capabilities

Capabilities can be enabled via command-line arguments:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "@playwright/mcp@latest",
        "--caps=vision,pdf"
      ]
    }
  }
}

Or through configuration:

{
  "capabilities": ["core", "pdf", "vision"]
}

By default, only core capabilities are enabled. This keeps the tool list minimal and reduces context window usage.
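
To make the command-line form concrete, here is a small sketch that assembles the args array for a chosen set of opt-in capabilities. The helper name buildArgs is hypothetical; the capability names and the --caps flag are the ones listed above.

```typescript
// Opt-in capability names from the list above; core is enabled by default.
type OptionalCapability = "pdf" | "vision" | "devtools";

// Hypothetical helper: build the npx argument list for a set of capabilities.
function buildArgs(caps: OptionalCapability[]): string[] {
  const args = ["@playwright/mcp@latest"];
  // Deduplicate while keeping first-seen order.
  const unique = caps.filter((c, i) => caps.indexOf(c) === i);
  // Only add the flag when something beyond core is requested.
  if (unique.length > 0) {
    args.push(`--caps=${unique.join(",")}`);
  }
  return args;
}

const withVisionAndPdf = buildArgs(["vision", "pdf"]);
const coreOnly = buildArgs([]);
```

Leaving the flag out entirely for the core-only case keeps the tool list at its default size.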

Configuration Options

Playwright MCP can be configured through:

Command-line arguments - Quick configuration via --option=value
Environment variables - System-level settings via PLAYWRIGHT_MCP_*
Configuration file - Comprehensive JSON configuration via --config
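
The relationship between a command-line flag and its PLAYWRIGHT_MCP_* variable can be sketched as below. This assumes the variable name is the flag name in upper case with dashes replaced by underscores; that convention is an assumption here, so confirm each variable name against the server's own option list. The helper name envNameForFlag is hypothetical.

```typescript
// Hypothetical helper: derive the environment variable name for a CLI flag,
// ASSUMING the convention PLAYWRIGHT_MCP_<FLAG_IN_UPPER_SNAKE_CASE>.
function envNameForFlag(flag: string): string {
  const bare = flag.replace(/^--/, ""); // drop the leading dashes
  return "PLAYWRIGHT_MCP_" + bare.replace(/-/g, "_").toUpperCase();
}

const capsVar = envNameForFlag("--caps");
const portVar = envNameForFlag("--port");
```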

Configuration File

For complex setups, use a JSON configuration file. Its shape, written as a TypeScript type in which every field is optional:

{
  // Browser settings
  browser?: {
    browserName?: 'chromium' | 'firefox' | 'webkit';
    isolated?: boolean;
    userDataDir?: string;
    launchOptions?: playwright.LaunchOptions;
    contextOptions?: playwright.BrowserContextOptions;
    cdpEndpoint?: string;
    initPage?: string[];
    initScript?: string[];
  };

  // Server settings
  server?: {
    port?: number;
    host?: string;
    allowedHosts?: string[];
  };

  // Enabled tool capabilities
  capabilities?: ToolCapability[];

  // Timeouts in milliseconds
  timeouts?: {
    action?: number;      // Default: 5000
    navigation?: number;  // Default: 60000
  };

  snapshot?: {
    mode?: 'incremental' | 'full' | 'none';
  };
}

Load your configuration:

npx @playwright/mcp@latest --config path/to/config.json
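
For reference, a concrete configuration that fits the shape above might be produced like this. It is a sketch with example values only: the McpConfig interface is a hypothetical, partial copy of the schema so the snippet stands alone, and the chosen port and timeouts simply mirror values that appear elsewhere on this page.

```typescript
// Partial copy of the configuration shape, repeated so the sketch is self-contained.
interface McpConfig {
  browser?: { browserName?: "chromium" | "firefox" | "webkit"; isolated?: boolean };
  server?: { port?: number; host?: string };
  capabilities?: string[];
  timeouts?: { action?: number; navigation?: number };
}

// Example values only; adjust them to your environment.
const config: McpConfig = {
  browser: { browserName: "chromium", isolated: true },
  server: { port: 8931, host: "localhost" },
  capabilities: ["core", "pdf"],
  timeouts: { action: 5000, navigation: 60000 },
};

// The text you would save as config.json and pass to --config.
const configJson = JSON.stringify(config, null, 2);
```

Typing the object first means a misspelled field or an unsupported browserName is caught by the compiler before the file is written.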

Playwright MCP vs Playwright CLI

Playwright offers two approaches for AI-driven automation:

Playwright CLI + SKILLS

Best for: Coding agents working with large codebases

More token-efficient
Concise, purpose-built commands
Avoids loading large tool schemas
Better for high-throughput agents

Playwright MCP

Best for: Exploratory automation and persistent workflows

Rich page introspection
Persistent browser state
Structured accessibility data
Long-running autonomous workflows

Choose CLI + SKILLS for coding agents that need to balance browser automation with code editing. Choose MCP for agents focused primarily on web interaction and exploration.

Key Features

Accessibility-First Approach

Playwright MCP uses the browser’s accessibility tree instead of screenshots:

Faster - No image processing required
Smaller - Structured text data vs large image files
Deterministic - Precise element references vs coordinate guessing
LLM-friendly - No vision models needed
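
To show why element references are more precise than coordinates, the sketch below looks up an element by role and name in a text snapshot and returns its reference. The snapshot line format used here is an assumption made for illustration, and findRef is a hypothetical helper; see Accessibility Snapshots for the actual format.

```typescript
// ASSUMED snapshot excerpt: one element per line with a role, a name, and a ref.
const snapshot = [
  '- heading "Welcome" [ref=e1]',
  '- textbox "Email" [ref=e2]',
  '- button "Sign in" [ref=e3]',
].join("\n");

// Hypothetical helper: find the ref of the element with a given role and name.
function findRef(text: string, role: string, name: string): string | undefined {
  for (const line of text.split("\n")) {
    const match = /^\s*- (\w+) "([^"]*)" \[ref=(\w+)\]/.exec(line);
    if (match && match[1] === role && match[2] === name) {
      return match[3];
    }
  }
  return undefined; // no matching element in this snapshot
}

const signInRef = findRef(snapshot, "button", "Sign in");
```

A later interaction can then target that reference directly instead of guessing pixel positions from an image.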

Session Management

The server maintains browser state across multiple tool calls:

Cookies and local storage persist
Login sessions remain active
Tab state preserved between operations
Optional session recording and trace saving

Security Boundaries

Playwright MCP is not a security boundary. Follow MCP Security Best Practices for production deployments.

The server does offer features that help limit exposure:

Origin allowlists and blocklists
File access restrictions
Service worker blocking
Secrets management
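
The idea behind origin allowlists and blocklists can be sketched as a simple policy check. This is not the server's implementation: the function names are hypothetical, and the sketch assumes that the blocklist takes precedence and that an empty allowlist permits any origin that is not blocked.

```typescript
// Extract scheme://host[:port] from a URL string, or undefined if it has none.
function originOf(url: string): string | undefined {
  const match = /^([a-z][a-z0-9+.-]*:\/\/[^\/?#]+)/i.exec(url);
  return match ? match[1].toLowerCase() : undefined;
}

// Hypothetical policy check; ASSUMES the blocklist wins over the allowlist
// and that an empty allowlist means "allow anything not blocked".
function isRequestAllowed(url: string, allowed: string[], blocked: string[]): boolean {
  const origin = originOf(url);
  if (origin === undefined) return false; // not an absolute URL
  if (blocked.indexOf(origin) !== -1) return false;
  return allowed.length === 0 || allowed.indexOf(origin) !== -1;
}

const allowedOrigins = ["https://example.com"];
const blockedOrigins = ["https://ads.example.net"];
const loginAllowed = isRequestAllowed("https://example.com/login", allowedOrigins, blockedOrigins);
const adsAllowed = isRequestAllowed("https://ads.example.net/x.js", [], blockedOrigins);
```

Because a check like this only filters requests, it complements rather than replaces isolation at the operating system or network level.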

Next Steps

Browser Automation

Learn how Playwright manages browser contexts and sessions

Accessibility Snapshots

Understand the structured snapshot format
