agent-native provides a comprehensive set of commands for automating macOS applications through the Accessibility API. Commands are organized into logical categories for different automation tasks.

Command categories

Discovery commands

Find and explore application UI elements:
  • apps - List all running GUI applications
  • find - Search for elements matching specific criteria
  • inspect - View detailed element attributes and actions
  • tree - Display the accessibility hierarchy
  • snapshot - Create an interactive element reference map

Interaction commands

Perform actions on UI elements:
  • click - Click buttons and interactive elements
  • fill - Clear and fill text fields
  • type - Type text into elements
  • check / uncheck - Toggle checkboxes
  • select - Choose options from dropdowns
  • focus - Set keyboard focus
  • hover - Move cursor to element
  • action - Execute arbitrary accessibility actions

State commands

Read element properties and values:
  • get text - Extract text content
  • get value - Read input values
  • get attr - Query specific attributes
  • get title - Get window title
  • is enabled - Check if element is enabled
  • is focused - Check if element has focus

Wait commands

Pause execution until conditions are met:
  • wait - Wait for element to appear with timeout

Keyboard commands

Send keyboard input:
  • key - Send keystrokes and shortcuts
  • paste - Paste clipboard or file content

Screenshot commands

Capture visual output:
  • screenshot - Capture app window images

Common patterns

Using element references

Many commands support @ref syntax for targeting elements from a snapshot:
# Create snapshot with refs
agent-native snapshot Safari

# Use refs in commands
agent-native click @n42
agent-native fill @n15 "user@example.com"
agent-native get text @n8
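
The ref-based commands above can be chained into a small helper. The following is a minimal sketch, not part of agent-native itself: `fill_and_click` and the `AGENT_NATIVE` override variable are hypothetical names, and the sketch assumes the CLI exits with a non-zero status when a command fails.

```shell
# Hypothetical override: lets the binary be swapped out (e.g. for a dry run).
AGENT_NATIVE="${AGENT_NATIVE:-agent-native}"

# fill_and_click FIELD_REF VALUE BUTTON_REF
# Fills one field, then clicks one button. The click is skipped if the
# fill fails (assumes agent-native exits non-zero on failure).
fill_and_click() {
  "$AGENT_NATIVE" fill "$1" "$2" && "$AGENT_NATIVE" click "$3"
}

# Typical use, after running `agent-native snapshot Safari` and reading
# the refs from its output:
#   fill_and_click @n15 "user@example.com" @n42
```

Passing the refs as arguments keeps the helper independent of any particular snapshot, since ref values such as @n15 only make sense relative to the snapshot they came from.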

Filter-based element selection

Commands accept filters to locate elements:
# By role
agent-native click Safari --role Button --title "Submit"

# By label
agent-native fill Safari --label "Email" "user@example.com"

# By identifier
agent-native click Safari --identifier "login-button"

# Multiple filters
agent-native find Safari --role TextField --label "Password"
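
Because find and click take the same filters, one way to guard a click is to run find with identical arguments first. This is a sketch under assumptions: `find_then_click` and `AGENT_NATIVE` are hypothetical names, and the sketch assumes find exits with a non-zero status when nothing matches, which the text above does not confirm.

```shell
# Hypothetical override: lets the binary be swapped out (e.g. for a dry run).
AGENT_NATIVE="${AGENT_NATIVE:-agent-native}"

# find_then_click APP [FILTER...]
# Runs `find` with the given app and filters, and only clicks with the
# same arguments if `find` succeeds (assumed: non-zero exit on no match).
find_then_click() {
  "$AGENT_NATIVE" find "$@" && "$AGENT_NATIVE" click "$@"
}

# Example:
#   find_then_click Safari --role Button --title "Submit"
```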

JSON output

Most commands support the --json flag for structured output:
agent-native apps --json
agent-native click @n5 --json
agent-native get value @n10 --json
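
Structured output is convenient when a script reads several elements in a row. The sketch below loops over refs and prints one JSON result per ref. `values_as_json` and `AGENT_NATIVE` are hypothetical names, and nothing here assumes a particular JSON schema.

```shell
# Hypothetical override: lets the binary be swapped out (e.g. for a dry run).
AGENT_NATIVE="${AGENT_NATIVE:-agent-native}"

# values_as_json REF...
# Prints the --json output of `get value` for each ref, in order.
# Stops at the first failing command (assumes non-zero exit on failure).
values_as_json() {
  for ref in "$@"; do
    "$AGENT_NATIVE" get value "$ref" --json || return 1
  done
}

# Example:
#   values_as_json @n10 @n15 > field-values.log
```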

Command syntax

Target specification

Commands use two target types:
  • @ref (string) - Element reference from a snapshot (e.g., @n1, @n42)
  • app (string) - Application name or bundle identifier (e.g., Safari, com.apple.Safari)

Common options

These options appear across multiple commands:
  • --role (string) - Filter by accessibility role (e.g., Button, TextField, CheckBox)
  • --title (string) - Filter by element title (substring match)
  • --label (string) - Filter by accessibility label (substring match)
  • --identifier (string) - Filter by accessibility identifier (substring match)
  • --index (integer, default: 0) - Which matching element to use (0-indexed)
  • --json (boolean) - Output results as JSON
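
When several elements match the same filters, --index selects one of them. The following sketch combines --role, --title, --index, and --json in a single click. `click_nth` and `AGENT_NATIVE` are hypothetical names, and the sketch assumes click accepts all four options together, which the list above suggests but does not state outright.

```shell
# Hypothetical override: lets the binary be swapped out (e.g. for a dry run).
AGENT_NATIVE="${AGENT_NATIVE:-agent-native}"

# click_nth APP ROLE TITLE [INDEX]
# Clicks the INDEX-th element (0-indexed, default 0) matching ROLE and
# TITLE in APP, and requests JSON output.
click_nth() {
  "$AGENT_NATIVE" click "$1" --role "$2" --title "$3" --index "${4:-0}" --json
}

# Example: click the second button whose title contains "Delete".
#   click_nth Safari Button "Delete" 1
```

Since --title is a substring match, a short title can match more elements than expected; running find with the same filters first shows what a given index refers to.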

Next steps

  • Discovery commands - Learn how to find and explore UI elements
  • Interaction commands - Control apps by clicking and typing
  • State commands - Read element properties and values
  • Keyboard commands - Send keystrokes and shortcuts
