Coordinate-based tools require the vision capability. Start the server with --caps=vision to enable these tools.
Coordinate-based tools provide low-level mouse control using x/y coordinates. These tools are useful for interacting with elements that don’t have proper accessibility attributes or for pixel-perfect interactions like drawing applications.

Enabling Vision Capability

To use coordinate-based tools, start the MCP server with the vision capability:
npx @playwright/mcp --caps=vision
You can combine it with other capabilities:
npx @playwright/mcp --caps=vision,pdf,testing

Mouse Click Operations

browser_mouse_click_xy

Click left mouse button at a given position.
x (number, required): X coordinate on the page
y (number, required): Y coordinate on the page
Read-only: No
// Example: Click at position (100, 200)
{
  "tool": "browser_mouse_click_xy",
  "arguments": {
    "x": 100,
    "y": 200
  }
}
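Click coordinates usually come from a screenshot or a measured bounding box. The following is a minimal Python sketch, assuming a hypothetical box given as left, top, width, and height in viewport pixels. `click_center` is an illustrative helper, not part of Playwright MCP, and the payload simply mirrors the JSON shape of the example above.

```python
def click_center(left: float, top: float, width: float, height: float) -> dict:
    """Build a browser_mouse_click_xy call aimed at the center of a box.

    The box (left, top, width, height) is a hypothetical input in viewport
    pixels, for example measured from a screenshot.
    """
    if width <= 0 or height <= 0:
        raise ValueError("box must have positive width and height")
    return {
        "tool": "browser_mouse_click_xy",
        "arguments": {"x": left + width / 2, "y": top + height / 2},
    }


# A 100x100 box whose top-left corner is at (50, 150).
call = click_center(left=50, top=150, width=100, height=100)
print(call)
```

Aiming at the center rather than a corner leaves some tolerance if the element shifts by a few pixels.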

browser_mouse_down

Press mouse button down at current position.
button (string, optional): Button to press. One of "left", "right", or "middle"; defaults to "left".
Read-only: No
// Example: Press right mouse button
{
  "tool": "browser_mouse_down",
  "arguments": {
    "button": "right"
  }
}

browser_mouse_up

Release mouse button at current position.
button (string, optional): Button to release. One of "left", "right", or "middle"; defaults to "left".
Read-only: No
// Example: Release left mouse button
{
  "tool": "browser_mouse_up",
  "arguments": {
    "button": "left"
  }
}

Mouse Movement

browser_mouse_move_xy

Move mouse to a given position.
x (number, required): X coordinate to move to
y (number, required): Y coordinate to move to
Read-only: No
// Example: Move mouse to position (300, 400)
{
  "tool": "browser_mouse_move_xy",
  "arguments": {
    "x": 300,
    "y": 400
  }
}
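Because browser_mouse_click_xy is described as clicking the left button, a click with another button can be composed from browser_mouse_move_xy, browser_mouse_down, and browser_mouse_up. The sketch below is one way to build that sequence in Python. `click_with_button` is a hypothetical helper, and whether the page treats the result as a context-menu click depends on the page itself.

```python
VALID_BUTTONS = ("left", "right", "middle")


def click_with_button(x: float, y: float, button: str = "left") -> list[dict]:
    """Return the move, press, release sequence for one click at (x, y)."""
    if button not in VALID_BUTTONS:
        raise ValueError(f"button must be one of {VALID_BUTTONS}")
    return [
        # Position the pointer first; down/up act at the current position.
        {"tool": "browser_mouse_move_xy", "arguments": {"x": x, "y": y}},
        {"tool": "browser_mouse_down", "arguments": {"button": button}},
        {"tool": "browser_mouse_up", "arguments": {"button": button}},
    ]


right_click = click_with_button(300, 400, button="right")
for call in right_click:
    print(call)
```

Pairing every press with a release of the same button avoids leaving the button held down for later calls.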

browser_mouse_drag_xy

Drag with the left mouse button held down from one position to another.
startX (number, required): Start X coordinate
startY (number, required): Start Y coordinate
endX (number, required): End X coordinate
endY (number, required): End Y coordinate
Read-only: No
// Example: Drag from (100, 100) to (300, 300)
{
  "tool": "browser_mouse_drag_xy",
  "arguments": {
    "startX": 100,
    "startY": 100,
    "endX": 300,
    "endY": 300
  }
}

Scrolling

browser_mouse_wheel

Scroll using the mouse wheel.
deltaX (number, required): Horizontal scroll amount (positive = right, negative = left)
deltaY (number, required): Vertical scroll amount (positive = down, negative = up)
Read-only: No
// Example: Scroll down by 100 pixels
{
  "tool": "browser_mouse_wheel",
  "arguments": {
    "deltaX": 0,
    "deltaY": 100
  }
}
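A long scroll can be split into several smaller browser_mouse_wheel calls, which may suit pages that load content as they scroll. This is a minimal Python sketch under that assumption. `wheel_steps` and its default step of 100 pixels are illustrative choices, not part of Playwright MCP.

```python
def wheel_steps(total_delta_y: int, step: int = 100) -> list[dict]:
    """Split a vertical scroll into wheel calls of at most `step` pixels each."""
    if step <= 0:
        raise ValueError("step must be positive")
    direction = 1 if total_delta_y >= 0 else -1  # positive = down, negative = up
    remaining = abs(total_delta_y)
    calls = []
    while remaining > 0:
        amount = min(step, remaining)
        calls.append({
            "tool": "browser_mouse_wheel",
            "arguments": {"deltaX": 0, "deltaY": direction * amount},
        })
        remaining -= amount
    return calls


calls = wheel_steps(250)
print([c["arguments"]["deltaY"] for c in calls])
```

The per-call deltas always sum to the requested total, so the final scroll position is the same as with a single large call.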

Use Cases

Drawing Applications

Coordinate-based tools are ideal for interacting with canvas-based drawing applications:
// Draw a line on a canvas
{ "tool": "browser_mouse_move_xy", "arguments": { "x": 50, "y": 50 } }
{ "tool": "browser_mouse_down", "arguments": { "button": "left" } }
{ "tool": "browser_mouse_move_xy", "arguments": { "x": 200, "y": 200 } }
{ "tool": "browser_mouse_up", "arguments": { "button": "left" } }
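The move, press, move, release pattern above extends naturally to paths with several vertices. The sketch below generates the call sequence for a polyline in Python. `draw_polyline` is a hypothetical helper, and the canvas is assumed to draw while the button is held and the pointer moves.

```python
def draw_polyline(points: list[tuple[float, float]], button: str = "left") -> list[dict]:
    """Return tool calls that draw a connected path through `points`."""
    if len(points) < 2:
        raise ValueError("need at least two points to draw a line")

    def move(x: float, y: float) -> dict:
        return {"tool": "browser_mouse_move_xy", "arguments": {"x": x, "y": y}}

    (x0, y0), rest = points[0], points[1:]
    # Move to the first vertex, then press the button to start the stroke.
    calls = [move(x0, y0), {"tool": "browser_mouse_down", "arguments": {"button": button}}]
    # Visit each remaining vertex while the button stays pressed.
    calls += [move(x, y) for x, y in rest]
    # Release to end the stroke.
    calls.append({"tool": "browser_mouse_up", "arguments": {"button": button}})
    return calls


# A closed triangle: the last point repeats the first.
triangle = draw_polyline([(50, 50), (200, 50), (125, 180), (50, 50)])
for call in triangle:
    print(call)
```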

Interactive Maps

Pan map interfaces that respond to mouse drags:
// Pan the map
{
  "tool": "browser_mouse_drag_xy",
  "arguments": {
    "startX": 400,
    "startY": 300,
    "endX": 200,
    "endY": 100
  }
}
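A pan can be expressed as an offset from the viewport center rather than as hard-coded endpoints. The following Python sketch assumes a known viewport size. `pan_drag` is a hypothetical helper, and the drag moves the map content by (dx, dy), so negative values reveal content to the right and below.

```python
def pan_drag(viewport_width: int, viewport_height: int, dx: int, dy: int) -> dict:
    """Build a drag that starts at the viewport center and moves by (dx, dy)."""
    start_x, start_y = viewport_width // 2, viewport_height // 2
    end_x, end_y = start_x + dx, start_y + dy
    # Keep the drag inside the viewport so the release lands on the page.
    if not (0 <= end_x < viewport_width and 0 <= end_y < viewport_height):
        raise ValueError("drag would end outside the viewport")
    return {
        "tool": "browser_mouse_drag_xy",
        "arguments": {"startX": start_x, "startY": start_y, "endX": end_x, "endY": end_y},
    }


# For an 800x600 viewport this reproduces the drag from (400, 300) to (200, 100).
call = pan_drag(800, 600, dx=-200, dy=-200)
print(call)
```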

Games and Simulations

Control game interfaces or simulations that require precise coordinate input:
// Click a specific location in a game
{
  "tool": "browser_mouse_click_xy",
  "arguments": {
    "x": 640,
    "y": 360
  }
}

Elements Without Accessibility Attributes

When elements lack proper accessibility attributes, coordinate-based clicks can serve as a fallback:
// Click an unlabeled button at known coordinates
{
  "tool": "browser_mouse_click_xy",
  "arguments": {
    "x": 150,
    "y": 75
  }
}
Prefer semantic element-based tools (like browser_click) when possible. Use coordinate-based tools only when necessary, as they are more fragile and can break with layout changes.

Best Practices

Coordinate-based interactions are fragile and can break with layout changes. Always prefer semantic element selection using browser_click, browser_type, and other element-based tools when available.
Use browser_take_screenshot to verify element positions before using coordinate-based tools. This helps ensure you’re clicking the right location.
Coordinates are relative to the current viewport, not the full page. Use browser_resize to keep the browser window size consistent with the coordinates you expect.
Scrolling changes which page content sits at a given coordinate. If the target is off-screen, use browser_mouse_wheel to scroll it into view, then re-check its position before clicking.
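The relationship between page position, scroll offset, and viewport coordinates can be sketched in a few lines of Python. This assumes the scroll offsets and viewport size are already known, for example read from the page. `page_to_viewport` is a hypothetical helper, not a Playwright MCP tool.

```python
from typing import Optional


def page_to_viewport(
    page_x: float,
    page_y: float,
    scroll_x: float,
    scroll_y: float,
    viewport_width: float,
    viewport_height: float,
) -> Optional[tuple[float, float]]:
    """Convert page coordinates to viewport coordinates.

    Returns (x, y) if the point is currently inside the viewport,
    otherwise None, meaning the page must be scrolled first.
    """
    x, y = page_x - scroll_x, page_y - scroll_y
    if 0 <= x < viewport_width and 0 <= y < viewport_height:
        return (x, y)
    return None


# A point 1500 px down the page is off-screen until the page is scrolled.
print(page_to_viewport(640, 1500, 0, 0, 1280, 720))
print(page_to_viewport(640, 1500, 0, 1200, 1280, 720))
```

A result of None signals that a coordinate-based click would miss the target until the page is scrolled.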

Combining with Other Tools

Coordinate-based tools work well with other Playwright MCP tools:
// 1. Resize browser for consistent coordinates
{ "tool": "browser_resize", "arguments": { "width": 1280, "height": 720 } }

// 2. Take a screenshot to verify positions
{ "tool": "browser_take_screenshot", "arguments": { "filename": "before-click.png" } }

// 3. Perform coordinate-based click
{ "tool": "browser_mouse_click_xy", "arguments": { "x": 640, "y": 360 } }

// 4. Take another screenshot to verify result
{ "tool": "browser_take_screenshot", "arguments": { "filename": "after-click.png" } }
