
Overview

Clicks an element using the AX-first (accessibility-first) activation strategy. The command exhausts pure accessibility API methods before falling back to mouse events, which maximizes compatibility across applications.

Syntax

agent-desktop click <ref>

Parameters

ref
string
required
Element reference from snapshot (@e1, @e2, etc.)

Response

{
  "version": "1.0",
  "ok": true,
  "command": "click",
  "data": {
    "action": "click",
    "ref_id": "@e3"
  }
}

Response Fields

action
string
The action performed (click)
ref_id
string
The element reference that was clicked
post_state
object
Element state after the action
  • role (string): Element role
  • states (string[]): Current element states
  • value (string): Current element value
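
A script that consumes this response might parse it as follows. This is a minimal sketch: ClickResult and parse_click_response are hypothetical names, and post_state is treated as optional because the sample response above does not include it. The shape of a failure payload is not documented here, so the sketch simply raises with the whole envelope.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClickResult:
    action: str
    ref_id: str
    # Assumption: post_state may be absent, as in the sample response.
    post_state: Optional[dict] = None


def parse_click_response(raw: str) -> ClickResult:
    """Parse the JSON envelope printed by `agent-desktop click`."""
    envelope = json.loads(raw)
    if not envelope.get("ok"):
        # The failure payload's shape is not documented, so surface it whole.
        raise RuntimeError(f"click failed: {envelope}")
    data = envelope["data"]
    return ClickResult(
        action=data["action"],
        ref_id=data["ref_id"],
        post_state=data.get("post_state"),
    )


sample = (
    '{"version": "1.0", "ok": true, "command": "click", '
    '"data": {"action": "click", "ref_id": "@e3"}}'
)
result = parse_click_response(sample)
print(result.ref_id)  # prints "@e3"
```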

AX-First Strategy

The click command follows a 15-step activation chain. Its main stages, in order:
  1. Try kAXPressAction on the element
  2. Try focus + return key
  3. Try clicking parent elements
  4. Fall back to mouse click at element center
This ensures buttons, links, menu items, and other interactive elements respond correctly regardless of their implementation.
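
The ordering of the stages listed above can be illustrated with a generic "first strategy that succeeds wins" loop. This is only a sketch of the pattern, not the tool's implementation: activate and the strategy names are hypothetical, and the lambdas are stand-ins that make no real accessibility calls.

```python
from typing import Callable, Optional, Sequence, Tuple

# Each strategy is a (name, attempt) pair; attempt returns True on success.
Strategy = Tuple[str, Callable[[], bool]]


def activate(strategies: Sequence[Strategy]) -> Optional[str]:
    """Try each strategy in order; return the name of the first that succeeds."""
    for name, attempt in strategies:
        try:
            if attempt():
                return name
        except Exception:
            # A strategy that raises is treated as a miss; try the next one.
            continue
    return None


# Stand-ins for the stages listed above (hypothetical outcomes).
chain = [
    ("ax_press", lambda: False),      # kAXPressAction not supported
    ("focus_return", lambda: False),  # focus + return key had no effect
    ("click_parent", lambda: True),   # a parent element accepted the press
    ("mouse_center", lambda: True),   # not reached in this run
]
print(activate(chain))  # prints "click_parent"
```

Keeping the mouse click last means the pointer is only moved when no accessibility route worked.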

Usage Examples

Click a Button

agent-desktop snapshot --app "System Settings" -i
agent-desktop click @e5

Click and Chain Actions

agent-desktop click @e2
agent-desktop wait 500
agent-desktop snapshot -i

Batch Click

agent-desktop batch '[
  {"command": "click", "args": {"ref_id": "@e3"}},
  {"command": "wait", "args": {"ms": 200}},
  {"command": "click", "args": {"ref_id": "@e7"}}
]'
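
When the sequence of clicks is computed by a script, the batch payload can be generated rather than written by hand. The sketch below assumes only the JSON shape shown in the example above; build_batch is a hypothetical helper, and it treats integers as pauses in milliseconds and strings as element refs.

```python
import json
import shlex
from typing import Iterable, List, Union

Step = Union[str, int]  # a ref such as "@e3", or a pause in milliseconds


def build_batch(steps: Iterable[Step]) -> str:
    """Turn refs and pauses into the JSON array passed to `batch`."""
    commands: List[dict] = []
    for step in steps:
        if isinstance(step, int):
            commands.append({"command": "wait", "args": {"ms": step}})
        else:
            commands.append({"command": "click", "args": {"ref_id": step}})
    return json.dumps(commands)


payload = build_batch(["@e3", 200, "@e7"])
# shlex.quote makes the payload safe to paste into a POSIX shell.
print("agent-desktop batch " + shlex.quote(payload))
```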

Error Cases

Error Code        | Cause                               | Recovery
------------------|-------------------------------------|----------------------------------------
ELEMENT_NOT_FOUND | Ref doesn’t exist in current refmap | Run snapshot to refresh
STALE_REF         | Element no longer matches saved ref | Run snapshot and use new ref
ACTION_FAILED     | Element doesn’t support press action | Try set-value or coordinate-based click
PERM_DENIED       | Accessibility permission not granted | Grant permission in System Settings
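
An automation script could turn these error codes into a recovery decision with a small lookup. This is a sketch under assumptions: plan_recovery is a hypothetical helper, and it takes the error code as a plain string because this page does not show where the code appears in a failed response.

```python
from typing import Tuple

# Recovery hints copied from the error cases listed above.
RECOVERY = {
    "ELEMENT_NOT_FOUND": "Run snapshot to refresh",
    "STALE_REF": "Run snapshot and use new ref",
    "ACTION_FAILED": "Try set-value or coordinate-based click",
    "PERM_DENIED": "Grant permission in System Settings",
}

# Codes whose recovery starts with a fresh snapshot.
NEEDS_SNAPSHOT = {"ELEMENT_NOT_FOUND", "STALE_REF"}


def plan_recovery(code: str) -> Tuple[bool, str]:
    """Return (take_new_snapshot, hint) for an error code."""
    hint = RECOVERY.get(code, f"Unrecognized error code: {code}")
    return code in NEEDS_SNAPSHOT, hint


resnapshot, hint = plan_recovery("STALE_REF")
print(resnapshot, hint)  # prints "True Run snapshot and use new ref"
```

Because refs are reassigned by each snapshot, a retry after ELEMENT_NOT_FOUND or STALE_REF should look up the element's new ref rather than reuse the old one.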

Notes

  • Click is idempotent for most buttons, but a repeated click may still trigger side effects
  • For form submissions, prefer clicking the submit button over pressing return
  • Use double-click for file selection, triple-click for text selection
  • Refs are snapshot-scoped; always refresh after UI changes
