This guide walks you through the complete agent-native workflow using System Settings as a practical example.

Prerequisites

Before starting, ensure you’ve:
  1. Installed agent-native (installation guide)
  2. Granted Accessibility permissions to your terminal
  3. Verified installation with agent-native --version
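The last prerequisite can be scripted as a preflight check. The following is a minimal sketch, not part of agent-native itself: `require_cmd` is a hypothetical helper name, and it only confirms that a command is on your PATH. It does not check Accessibility permissions.

```shell
# Hypothetical helper: fail early if a required command is missing.
require_cmd() {
  # $1: command name to look up on PATH
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "error: '$1' not found on PATH" >&2
    return 1
  fi
}

# Usage (run this in your own script):
#   require_cmd agent-native && agent-native --version
```

Putting this at the top of an automation script turns a missing install into a clear error instead of a failure halfway through a workflow.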

Your first automation

We’ll automate a search in System Settings to demonstrate the core workflow.

Step 1: List running apps

First, check what apps are currently running:
agent-native apps
Finder
Safari
Terminal
System Settings
Mail
This lists all running GUI applications, one per line.
For structured output (useful for agents):
agent-native apps --json
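A script can use the plain-text list to decide whether an app needs to be opened. This is a sketch that assumes the output is one app name per line, as in the sample above; `app_is_listed` is a hypothetical helper name, not an agent-native command.

```shell
# Hypothetical helper: succeed if the exact app name appears on stdin.
app_is_listed() {
  # $1: exact app name
  # -F: match literally, -x: match the whole line, -q: no output
  grep -Fxq -- "$1"
}

# Usage (run this in your own script):
#   agent-native apps | app_is_listed "System Settings" \
#     || agent-native open "System Settings"
```

Matching the whole line avoids false positives such as "System" matching "System Settings".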

Step 2: Open System Settings

Launch System Settings if it’s not already running:
agent-native open "System Settings"
The open command is idempotent: if the app is already running, it simply brings it to the front.

Step 3: Take a snapshot

Capture the interactive elements with persistent references:
agent-native snapshot "System Settings" -i
The -i flag filters to interactive elements only (buttons, text fields, toggles, etc.).
Snapshot: System Settings (pid 1234) -- 47 elements
---------------------------------------------
AXTextField (Search) = "" [AXFocus, AXCancel] [ref=n1]
AXButton "Apple Account" [AXPress] [ref=n2]
AXButton "Wi-Fi" [AXPress] [ref=n3]
AXButton "Bluetooth" [AXPress] [ref=n4]
AXButton "Network" [AXPress] [ref=n5]
  AXButton "General" [AXPress] [ref=n6]
  AXButton "Appearance" [AXPress] [ref=n7]
...
Each element gets a reference like @n1, @n2 that you can use to interact with it.
Use -c (compact) to remove empty structural elements: agent-native snapshot "System Settings" -i -c
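To avoid reading refs off the screen by hand, a script can pull them out of the text snapshot. The sketch below assumes the line format shown in the sample output (a quoted title followed by `[ref=nN]`); that format is taken from this page only and may differ between versions. `ref_for_title` is a hypothetical helper name.

```shell
# Hypothetical helper: print the ref (for example @n3) of the first
# snapshot line that contains the given quoted title.
ref_for_title() {
  # $1: exact title; snapshot text is read from stdin.
  # Note: awk -v interprets backslash escapes in the title.
  awk -v title="$1" '
    index($0, "\"" title "\"") {
      if (match($0, /\[ref=[^]]+\]/)) {
        # Skip the leading "[ref=" and drop the trailing "]".
        print "@" substr($0, RSTART + 5, RLENGTH - 6)
        exit
      }
    }'
}

# Usage (run this in your own script):
#   ref=$(agent-native snapshot "System Settings" -i | ref_for_title "Wi-Fi")
```

If nothing matches, the function prints nothing, so check that the result is non-empty before passing it to another command.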

Step 4: Interact with elements

Now use the references from the snapshot to interact with the app. In the output above, the search field is @n1. Fill it with text:
agent-native fill @n1 "Wi-Fi"
The fill command clears the field first, then types the new text.

Step 5: Re-snapshot after UI changes

When you click a button or navigate to a new pane, the UI changes and your old references may no longer be valid. Take a new snapshot to get fresh references:
agent-native snapshot "System Settings" -i
References like @n5 are stored temporarily. After any significant UI change, re-run the snapshot command to get updated references for the new UI state.
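One way to make re-snapshotting a habit is to pair every action with a fresh snapshot. The wrapper below is a sketch: `act_then_snapshot` and the `AGENT_NATIVE` variable are hypothetical names introduced here (the variable only exists so the binary can be swapped out, for example in tests).

```shell
# Binary to call; override AGENT_NATIVE to point at a stub or another path.
AGENT_NATIVE="${AGENT_NATIVE:-agent-native}"

# Hypothetical helper: run one action, then print a fresh snapshot.
act_then_snapshot() {
  # usage: act_then_snapshot <app> <action> [action args...]
  app=$1
  shift
  # Stop here if the action itself failed.
  "$AGENT_NATIVE" "$@" || return 1
  "$AGENT_NATIVE" snapshot "$app" -i
}

# Usage (run this in your own script):
#   act_then_snapshot "System Settings" click @n3
```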

Complete workflow example

Here’s a complete script that opens System Settings, searches for “Wi-Fi”, and clicks the result:
# 1. Open System Settings
agent-native open "System Settings"

# 2. Get interactive elements
agent-native snapshot "System Settings" -i

# From the snapshot, we identified:
# - @n1 is the search field
# - @n3 is the Wi-Fi button

# 3. Fill the search field
agent-native fill @n1 "Wi-Fi"

# 4. Wait a moment for search results
sleep 0.5

# 5. Take a new snapshot (UI has changed)
agent-native snapshot "System Settings" -i

# 6. Click the Wi-Fi button using its ref from the new snapshot
#    (@n3 is assumed here; confirm it in the output of step 5)
agent-native click @n3
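The fixed `sleep 0.5` in step 4 can be too short on a slow machine and wasteful on a fast one. A polling loop is one alternative. This is a generic shell sketch, not an agent-native feature: `retry_until` is a hypothetical helper, and `results_ready` in the usage comment stands for whatever check you write yourself.

```shell
# Hypothetical helper: rerun a command until it succeeds or attempts run out.
retry_until() {
  # usage: retry_until <attempts> <delay-seconds> <command> [args...]
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Succeed as soon as the command does.
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (run this in your own script, with your own check function):
#   retry_until 10 0.2 results_ready || echo "search results never appeared" >&2
```

Fractional delays such as 0.2 rely on the same `sleep` behavior the script above already uses.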

Working with JSON output

For AI agents, use --json on any command to get structured output:
agent-native snapshot "System Settings" -i --json
JSON snapshot output structure:
[
  {
    "ref": "n1",
    "role": "AXTextField",
    "title": null,
    "label": "Search",
    "value": "",
    "enabled": true,
    "actions": ["AXFocus", "AXCancel"],
    "depth": 0
  },
  {
    "ref": "n2",
    "role": "AXButton",
    "title": "Apple Account",
    "label": null,
    "value": null,
    "enabled": true,
    "actions": ["AXPress"],
    "depth": 1
  }
]
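Structured output is easier to query than the text format. The sketch below assumes `jq` is installed and that the JSON has the shape shown above (an array of objects with `ref`, `title`, and `label` fields). `ref_by_name` is a hypothetical helper name.

```shell
# Hypothetical helper: print the ref of every element whose title or
# label equals the given name. Requires jq.
ref_by_name() {
  # $1: exact title or label; JSON snapshot is read from stdin.
  jq -r --arg name "$1" \
    '.[] | select(.title == $name or .label == $name) | "@" + .ref'
}

# Usage (run this in your own script):
#   ref=$(agent-native snapshot "System Settings" -i --json | ref_by_name "Search")
```

Checking both fields matters because, in the sample above, the search field is identified by its label while the button is identified by its title.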

Advanced options

Limit snapshot depth

Control how deep the tree traversal goes:
agent-native snapshot "System Settings" -i -d 3
The -d flag limits tree depth (default is 8).

Filter-based interaction (without snapshot)

You can interact with elements without taking a snapshot first:
# Click by title
agent-native click "System Settings" --title "Wi-Fi"

# Fill by label
agent-native fill "Safari" --label "Address" "https://github.com"

# Check by role and title
agent-native check "System Settings" --title "Wi-Fi" --role AXCheckBox
Filter-based interaction is less reliable than snapshot refs because it searches the tree each time. Use snapshot refs for multi-step workflows.

Wait for elements

Wait for an element to appear before proceeding:
agent-native wait "System Settings" --title "Apply" --timeout 5
Useful for waiting on dialogs or async UI updates.
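The wait command combines naturally with a filter-based click. The sketch below rests on an assumption this page does not confirm: that `wait` exits with a non-zero status when the timeout expires. Verify that against the command reference before relying on it. `click_when_ready` and the `AGENT_NATIVE` variable are hypothetical names.

```shell
# Binary to call; override AGENT_NATIVE to point at a stub or another path.
AGENT_NATIVE="${AGENT_NATIVE:-agent-native}"

# Hypothetical helper: wait for an element, then click it.
click_when_ready() {
  # $1: app name, $2: element title, $3: timeout in seconds (default 5)
  app=$1
  title=$2
  timeout=${3:-5}
  # ASSUMPTION: `wait` returns non-zero if the element never appears.
  if "$AGENT_NATIVE" wait "$app" --title "$title" --timeout "$timeout"; then
    "$AGENT_NATIVE" click "$app" --title "$title"
  else
    echo "timed out waiting for \"$title\" in $app" >&2
    return 1
  fi
}

# Usage (run this in your own script):
#   click_when_ready "System Settings" "Apply" 5
```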

Using with AI agents

Install the skill

For AI coding assistants like OpenCode or Cursor:
npx skills add ericclemmons/agent-native

Add to agent instructions

Add this to your AGENTS.md, CLAUDE.md, or similar:
## macOS App Automation

Use `agent-native` for controlling native macOS apps. Run `agent-native --help` for all commands.

Core workflow:
1. `agent-native open <app>` - Launch the app
2. `agent-native snapshot <app> -i` - Get interactive elements with refs
3. `agent-native click @n1` / `fill @n2 "text"` - Interact using refs
4. Re-snapshot after page/pane changes

Common patterns

Fill out a form:
agent-native snapshot "MyApp" -i
agent-native fill @n2 "John Doe"
agent-native fill @n3 "user@example.com"
agent-native check @n4  # Check a checkbox
agent-native click @n5  # Submit button

Select from a popup menu:
agent-native snapshot "MyApp" -i
agent-native select @n3 "Option A"
The select command opens the popup and clicks the matching menu item.

Inspect an element:
agent-native snapshot "MyApp" -i
agent-native inspect @n3
The inspect command shows all attributes and available actions for an element.

Next steps

Command reference

Complete reference for all agent-native commands

Examples

Real-world automation examples and patterns
Join the community and share your automations! Star the project on GitHub.
