Agent Desktop

agent-desktop is a native desktop automation CLI for AI agents, built with Rust. It gives structured access to any application through OS accessibility trees, with no screenshots, pixel matching, or browser required.

Key Features

Native Rust CLI

Fast single binary with 50+ commands for desktop automation and no runtime dependencies

AI-Optimized Workflow

Snapshot & ref workflow with deterministic element references (@e1, @e2) for reliable automation

AX-First Interactions

Exhausts accessibility API strategies before falling back to mouse events

Structured JSON Output

Machine-readable responses with error codes and recovery hints

Works Everywhere

Control any macOS app through native accessibility trees (Windows/Linux planned)

Zero Dependencies

Self-contained binary under 15MB with no external requirements

Quick Example

# Get interactive elements with refs
agent-desktop snapshot --app Finder -i

# Click a button by ref
agent-desktop click @e3

# Type into a text field
agent-desktop type @e5 "quarterly report"

# Press keyboard shortcut
agent-desktop press cmd+s

# Re-observe after UI changes
agent-desktop snapshot -i

Core Workflow

The snapshot + ref pattern suits LLMs: a snapshot returns elements with deterministic refs (@e1, @e2), and later commands target those refs directly without re-querying the accessibility tree.

Agent loop: snapshot → decide → act → snapshot → decide → act → ...
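
The loop can be sketched as a small shell script. This is a minimal illustration, not the tool's own workflow implementation: it uses only the commands shown in the Quick Example, and the names AGENT_DESKTOP, DRY_RUN, run, observe, and act are hypothetical helpers introduced for this sketch. The "decide" step, where an agent reads the snapshot and picks the next action, is hard-coded here.

```shell
#!/usr/bin/env bash
# Sketch of the snapshot -> decide -> act loop.
# Uses only the commands shown in the Quick Example.
# AGENT_DESKTOP, DRY_RUN, run, observe, and act are hypothetical names for
# this sketch; they are not part of the agent-desktop CLI.

AGENT_DESKTOP="${AGENT_DESKTOP:-agent-desktop}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Observe: capture the interactive elements of an app, with refs.
observe() {
  run "$AGENT_DESKTOP" snapshot --app "$1" -i
}

# Act: run one command against a ref chosen from the latest snapshot.
act() {
  run "$AGENT_DESKTOP" "$@"
}

# One pass through the loop. An agent would read each snapshot and decide
# the next action; here the choices (and the refs) are placeholders.
observe Finder
act click @e3
observe Finder
act type @e5 "quarterly report"
observe Finder
```

By default the script only prints the commands it would run. Setting DRY_RUN=0 executes them against a real app, in which case the refs should come from the most recent snapshot rather than being fixed in advance.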

Command Categories

Observation

Capture accessibility trees, screenshots, and search elements

Interaction

Click, type, select, toggle, and manipulate UI elements

Keyboard

Send key combos and control keyboard state

Mouse

Control cursor position, clicks, and drag operations

App & Window

Launch apps, manage windows, and control focus

Clipboard

Read and write clipboard contents

Wait

Block until conditions are met or time elapses

System

Check permissions, version, and system status

Batch

Execute multiple commands in sequence

Next Steps

Installation

Get agent-desktop installed on your system

Quickstart

Run your first automation in minutes

API Reference

Explore all 50+ commands in detail

Core Concepts

Learn the snapshot-ref workflow and JSON output

Platform Support

Feature | macOS | Windows | Linux
Accessibility tree | Supported | Planned | Planned
Click / type / keyboard | Supported | Planned | Planned
Mouse input | Supported | Planned | Planned
Screenshot | Supported | Planned | Planned
Clipboard | Supported | Planned | Planned
App & window management | Supported | Planned | Planned

Currently supports macOS 13.0+ with Rust 1.78+. Windows and Linux support is planned for future releases.