Commands overview

agent-native provides commands for automating macOS applications through the Accessibility API. The commands are grouped into the categories below by automation task.

Command categories

Discovery commands

Find and explore application UI elements:

apps - List all running GUI applications
find - Search for elements matching specific criteria
inspect - View detailed element attributes and actions
tree - Display the accessibility hierarchy
snapshot - Create an interactive element reference map

Interaction commands

Perform actions on UI elements:

click - Click buttons and interactive elements
fill - Clear and fill text fields
type - Type text into elements
check / uncheck - Toggle checkboxes
select - Choose options from dropdowns
focus - Set keyboard focus
hover - Move cursor to element
action - Execute arbitrary accessibility actions

State commands

Read element properties and values:

get text - Extract text content
get value - Read input values
get attr - Query specific attributes
get title - Get window title
is enabled - Check if element is enabled
is focused - Check if element has focus

Wait commands

Pause execution until conditions are met:

wait - Wait for element to appear with timeout

Keyboard commands

Send keyboard input:

key - Send keystrokes and shortcuts
paste - Paste clipboard or file content

Screenshot commands

Capture visual output:

screenshot - Capture app window images

Common patterns

Using element references

Many commands accept the @ref syntax to target elements from a snapshot:

# Create snapshot with refs
agent-native snapshot Safari

# Use refs in commands
agent-native click @n42
agent-native fill @n15 "user@example.com"
agent-native get text @n8
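
The ref workflow above can be wrapped in a small shell function. This is a minimal sketch, not part of agent-native: the function name fill_and_click and the AGENT_NATIVE variable are hypothetical, the refs are assumed to come from a snapshot you have already taken, and the early return assumes agent-native exits non-zero on failure.

```shell
# AGENT_NATIVE is a hypothetical override for the binary name (handy for dry runs).
AGENT_NATIVE="${AGENT_NATIVE:-agent-native}"

# fill_and_click FIELD_REF TEXT BUTTON_REF
# Fills one field, then clicks one element, stopping if the fill fails.
# Both refs are assumed to come from a prior `agent-native snapshot <app>`.
fill_and_click() {
  "$AGENT_NATIVE" fill "$1" "$2" || return 1
  "$AGENT_NATIVE" click "$3"
}
```

For example, after running agent-native snapshot Safari, calling fill_and_click @n15 "some text" @n42 would issue the fill and click commands in order. Setting AGENT_NATIVE=echo prints the commands instead of running them.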

Filter-based element selection

Commands accept filters to locate elements:

# By role
agent-native click Safari --role Button --title "Submit"

# By label
agent-native fill Safari --label "Email" "user@example.com"

# By identifier
agent-native click Safari --identifier "login-button"

# Multiple filters
agent-native find Safari --role TextField --label "Password"
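
Filters can be fixed inside a helper so that callers only supply the parts that vary. The sketch below is illustrative: click_button and AGENT_NATIVE are hypothetical names, and it simply reuses the --role and --title combination shown above.

```shell
# AGENT_NATIVE is a hypothetical override for the binary name (handy for dry runs).
AGENT_NATIVE="${AGENT_NATIVE:-agent-native}"

# click_button APP TITLE
# Combines the --role and --title filters so that only elements with the
# Button role whose title contains TITLE are candidates for the click.
click_button() {
  "$AGENT_NATIVE" click "$1" --role Button --title "$2"
}
```

Calling click_button Safari "Submit" is then equivalent to the "By role" example above.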

JSON output

Most commands support the --json flag for structured output:

agent-native apps --format json
agent-native click @n5 --json
agent-native get value @n10 --json
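
Structured output is mainly useful when piped into another tool. The sketch below assumes nothing about the JSON schema, which this page does not describe. It only pretty-prints and validates whatever get value emits. The names value_json and AGENT_NATIVE are hypothetical, and python3 is assumed to be installed.

```shell
# AGENT_NATIVE is a hypothetical override for the binary name (handy for testing).
AGENT_NATIVE="${AGENT_NATIVE:-agent-native}"

# value_json REF
# Runs `get value REF --json` and pretty-prints the result.
# python3 -m json.tool exits non-zero if its input is not valid JSON,
# and the function returns that exit status.
value_json() {
  "$AGENT_NATIVE" get value "$1" --json | python3 -m json.tool
}
```

A script can then call value_json @n10 and treat a non-zero exit status as a sign that the output was not parseable JSON.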

Command syntax

Target specification

Commands use two target types:

@ref (string) - Element reference from a snapshot (e.g., @n1, @n42)
app (string) - Application name or bundle identifier (e.g., Safari, com.apple.Safari)

Common options

These options appear across multiple commands:

--role (string) - Filter by accessibility role (e.g., Button, TextField, CheckBox)
--title (string) - Filter by element title (substring match)
--label (string) - Filter by accessibility label (substring match)
--identifier (string) - Filter by accessibility identifier (substring match)
--index (integer, default: 0) - Which matching element to use (0-indexed)
--json (boolean) - Output results as JSON
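
When several elements match the same filters, --index picks one of them. Because the index starts at 0, --index 1 selects the second match. The sketch below is a hypothetical helper (click_nth and AGENT_NATIVE are not part of agent-native) and assumes that click accepts --index alongside the filter options.

```shell
# AGENT_NATIVE is a hypothetical override for the binary name (handy for dry runs).
AGENT_NATIVE="${AGENT_NATIVE:-agent-native}"

# click_nth APP ROLE TITLE [INDEX]
# Clicks the INDEX-th element (0-indexed) matching ROLE and TITLE.
# INDEX falls back to 0, mirroring the documented default.
click_nth() {
  "$AGENT_NATIVE" click "$1" --role "$2" --title "$3" --index "${4:-0}"
}
```

For example, click_nth Safari Button "Submit" 1 would target the second button whose title contains "Submit".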

Next steps

Discovery commands - Learn how to find and explore UI elements
Interaction commands - Control apps by clicking and typing
State commands - Read element properties and values
Keyboard commands - Send keystrokes and shortcuts
