Overview
agent-desktop is built as a Rust workspace with a strict separation between platform-agnostic core logic and platform-specific adapters. This architecture enables cross-platform support while maintaining a clean, testable codebase.
The architecture follows dependency inversion: core defines interfaces, platforms implement them. Core never imports platform crates.
Workspace Structure
Core Crate (agent-desktop-core)
The core crate contains:
- PlatformAdapter trait: 12-method interface that all platforms implement
- Shared types: AccessibilityNode, Action, WindowInfo, RefEntry, error types
- Command handlers: Each command has an execute() function in commands/
- Ref system: Deterministic element reference allocation (@e1, @e2, …)
Key Principles
Zero Platform Imports
Core never imports macos, windows, or linux crates. CI enforces this with cargo tree -p agent-desktop-core.
Trait-Based
All platform operations go through the PlatformAdapter trait. Default implementations return not_supported().
One Command Per File
Each CLI command lives in its own file under commands/. Max 400 LOC per file.
Structured Errors
Every error includes a code, message, suggestion, and optional platform detail.
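As a rough illustration of the structured-error principle, the sketch below models an error that carries a code, message, suggestion, and optional platform detail. The names AdapterError, ErrorCode, and not_supported() appear on this page; the field names, the NotSupported variant, the with_platform_detail helper, and all message strings are assumptions made for this example, not the project's actual definitions.

```rust
use std::fmt;

/// Machine-readable error codes. The first three mirror codes listed on this
/// page; `NotSupported` is an assumed variant for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorCode {
    PermDenied,
    ElementNotFound,
    StaleRef,
    NotSupported,
}

impl ErrorCode {
    /// String form of the code (the `NOT_SUPPORTED` spelling is assumed).
    pub fn as_str(self) -> &'static str {
        match self {
            ErrorCode::PermDenied => "PERM_DENIED",
            ErrorCode::ElementNotFound => "ELEMENT_NOT_FOUND",
            ErrorCode::StaleRef => "STALE_REF",
            ErrorCode::NotSupported => "NOT_SUPPORTED",
        }
    }
}

/// Hypothetical layout: code, message, suggestion, optional platform detail.
#[derive(Debug, Clone, PartialEq)]
pub struct AdapterError {
    pub code: ErrorCode,
    pub message: String,
    pub suggestion: String,
    pub platform_detail: Option<String>,
}

impl AdapterError {
    /// Error returned by operations a platform does not implement.
    pub fn not_supported() -> Self {
        AdapterError {
            code: ErrorCode::NotSupported,
            message: "operation is not supported on this platform".to_string(),
            suggestion: "use a platform adapter that implements this operation".to_string(),
            platform_detail: None,
        }
    }

    /// Attach free-form, platform-specific context (hypothetical helper).
    pub fn with_platform_detail(mut self, detail: impl Into<String>) -> Self {
        self.platform_detail = Some(detail.into());
        self
    }
}

impl fmt::Display for AdapterError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "[{}] {} (suggestion: {})",
            self.code.as_str(),
            self.message,
            self.suggestion
        )?;
        if let Some(detail) = &self.platform_detail {
            write!(f, " [platform: {}]", detail)?;
        }
        Ok(())
    }
}

impl std::error::Error for AdapterError {}
```

Keeping the code separate from the human-readable message lets callers branch on the code while still showing the suggestion to the user.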
Platform Crates
Folder Structure
All platform crates (macos, windows, linux) follow an identical layout:
macOS Implementation (Phase 1)
The macOS adapter uses native accessibility APIs:
- Tree traversal: AXUIElementCreateApplication(pid) + kAXChildrenAttribute recursion
- Batch fetching: AXUIElementCopyMultipleAttributeValues for a 3-5x speed boost
- Action execution: AXUIElementPerformAction with a 15-step AX-first fallback chain
- Input synthesis: CGEventCreateKeyboardEvent / CGEventCreateMouseEvent
- Clipboard: NSPasteboard.generalPasteboard via Cocoa FFI
- Screenshot: CGWindowListCreateImage
Binary Crate (src/)
The binary is the only place that wires platform → core:
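One way to picture this wiring is a constructor in the binary that is compiled differently per target, so that only the binary ever names a concrete adapter. The sketch below is an assumption-heavy illustration: MacosAdapter, UnsupportedAdapter, platform_name, and build_adapter are hypothetical names, and the adapter types are defined inline here rather than imported from separate platform crates.

```rust
/// Minimal stand-in for the core trait; `platform_name` is a hypothetical
/// method used only to make the selected adapter observable.
pub trait PlatformAdapter {
    fn platform_name(&self) -> &'static str;
}

/// Stands in for the adapter exported by the macOS platform crate.
#[cfg(target_os = "macos")]
pub struct MacosAdapter;

#[cfg(target_os = "macos")]
impl PlatformAdapter for MacosAdapter {
    fn platform_name(&self) -> &'static str {
        "macos"
    }
}

/// Fallback used on targets without an adapter in this sketch.
#[cfg(not(target_os = "macos"))]
pub struct UnsupportedAdapter;

#[cfg(not(target_os = "macos"))]
impl PlatformAdapter for UnsupportedAdapter {
    fn platform_name(&self) -> &'static str {
        "unsupported"
    }
}

/// The single place where a concrete adapter is chosen. Code that only
/// sees `dyn PlatformAdapter` never needs to know which one it got.
#[cfg(target_os = "macos")]
pub fn build_adapter() -> Box<dyn PlatformAdapter> {
    Box::new(MacosAdapter)
}

#[cfg(not(target_os = "macos"))]
pub fn build_adapter() -> Box<dyn PlatformAdapter> {
    Box::new(UnsupportedAdapter)
}
```

Because the selection happens at compile time, a build for one target contains no code for the others.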
Command Dispatch
Dispatch is a simple match statement, with no trait dispatch.
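A minimal sketch of match-based dispatch follows, assuming a hypothetical Command enum with two variants. The per-command execute() functions under commands/ are mentioned on this page, but their signatures, the return type, and the command set shown here are invented for illustration.

```rust
/// Hypothetical subset of CLI commands; the real command set is not shown
/// on this page.
#[derive(Debug, Clone, PartialEq)]
pub enum Command {
    Snapshot,
    Click { element_ref: String },
}

/// One module per command, mirroring the one-command-per-file layout.
/// The bodies are placeholders that only report what they would do.
mod commands {
    pub mod snapshot {
        pub fn execute() -> Result<String, String> {
            Ok("snapshot taken".to_string())
        }
    }

    pub mod click {
        pub fn execute(element_ref: &str) -> Result<String, String> {
            if element_ref.starts_with("@e") {
                Ok(format!("clicked {}", element_ref))
            } else {
                Err(format!("invalid ref: {}", element_ref))
            }
        }
    }
}

/// Plain `match` over the command enum: each arm calls that command's
/// `execute()` directly, with no trait objects involved.
pub fn dispatch(command: Command) -> Result<String, String> {
    match command {
        Command::Snapshot => commands::snapshot::execute(),
        Command::Click { element_ref } => commands::click::execute(&element_ref),
    }
}
```

With an exhaustive match, adding a new Command variant without a matching arm is a compile error, which keeps the dispatch table in sync with the command list.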
PlatformAdapter Trait
The trait defines 12 core methods, grouped into four areas:
- Tree & Window Operations
- Element Actions
- App Lifecycle
- System Integration
Default implementations return Err(AdapterError::not_supported()).
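The default-implementation pattern can be sketched as follows. Only PlatformAdapter, AdapterError, and not_supported() come from this page; the two methods list_windows and get_clipboard, their signatures, the simplified error fields, and PartialAdapter are hypothetical stand-ins, not the real 12-method interface.

```rust
/// Simplified error type for this sketch (the real one carries more fields).
#[derive(Debug, PartialEq)]
pub struct AdapterError {
    pub code: &'static str,
    pub message: &'static str,
}

impl AdapterError {
    pub fn not_supported() -> Self {
        AdapterError {
            code: "NOT_SUPPORTED", // assumed code string
            message: "operation is not supported on this platform",
        }
    }
}

/// Hypothetical two-method slice of the trait. Every method has a default
/// body, so an adapter overrides only what its platform can do.
pub trait PlatformAdapter {
    fn list_windows(&self) -> Result<Vec<String>, AdapterError> {
        Err(AdapterError::not_supported())
    }

    fn get_clipboard(&self) -> Result<String, AdapterError> {
        Err(AdapterError::not_supported())
    }
}

/// An adapter that implements window listing but leaves the clipboard
/// method at its default.
pub struct PartialAdapter;

impl PlatformAdapter for PartialAdapter {
    fn list_windows(&self) -> Result<Vec<String>, AdapterError> {
        Ok(vec!["Example Window".to_string()])
    }
}
```

Under this pattern, a new platform adapter compiles from the first method it implements, and callers receive a structured error for everything not yet covered.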
Key Types
| Type | Purpose | Fields |
|---|---|---|
| AccessibilityNode | Platform-agnostic tree node | ref, role, name, value, description, states, bounds, children |
| Action | Element interaction | Click, SetValue(String), SetFocus, Expand, Toggle, Scroll(Direction, Amount), PressKey(KeyCombo) |
| NativeHandle | Opaque platform pointer | PhantomData<*const ()> to prevent auto-Send/Sync |
| RefEntry | Ref storage record | pid, role, name, bounds_hash, available_actions |
| WindowInfo | Window metadata | id, title, app_name, pid, bounds |
| ErrorCode | Machine-readable error | PERM_DENIED, ELEMENT_NOT_FOUND, STALE_REF, etc. |
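The NativeHandle row relies on a standard Rust idiom: a PhantomData<*const ()> field makes a type neither Send nor Sync, because raw pointers are neither. The sketch below shows the idiom under the assumption that the handle stores its value as a usize; the field names and the from_raw and as_raw methods are hypothetical.

```rust
use std::marker::PhantomData;

/// Opaque handle to a platform object. Storing the value as `usize` is an
/// assumption of this sketch. A plain `usize` is Send and Sync, so the
/// marker field is what opts the type out of both auto traits.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NativeHandle {
    raw: usize,
    // `*const ()` is neither Send nor Sync, and PhantomData passes that on
    // to the containing struct without occupying any space.
    _not_send_sync: PhantomData<*const ()>,
}

impl NativeHandle {
    /// Wrap a raw platform value (hypothetical constructor).
    pub fn from_raw(raw: usize) -> Self {
        NativeHandle {
            raw,
            _not_send_sync: PhantomData,
        }
    }

    /// Read the raw value back (hypothetical accessor).
    pub fn as_raw(&self) -> usize {
        self.raw
    }
}

// The following would fail to compile, because NativeHandle is not Send:
//
//     fn assert_send<T: Send>() {}
//     fn check() { assert_send::<NativeHandle>(); }
```

The marker keeps handles from being moved to another thread by accident, which matters when the underlying platform object may only be used on the thread that created it.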
Ref System
Snapshot assigns refs
Interactive elements receive sequential refs in depth-first order: @e1, @e2, @e3, …
Actions resolve refs
Commands like click @e3 use optimistic re-identification: (pid, role, name, bounds_hash).
Interactive roles that receive refs: button, textfield, checkbox, link, menuitem, tab, slider, combobox, treeitem, cell, radiobutton, incrementor, menubutton, switch, colorwell, dockitem
Static elements (labels, groups, containers) appear in the tree for context but have no ref.
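The two steps above can be sketched as a depth-first walk that numbers interactive nodes, plus a lookup that matches a stored (pid, role, name, bounds_hash) tuple against a fresh tree. This is an illustrative reading, not the project's implementation: Node is a simplified stand-in for AccessibilityNode, RefEntry omits available_actions, and assign_refs and resolve are hypothetical function names.

```rust
use std::collections::HashMap;

/// Roles listed on this page as interactive (eligible for a ref).
const INTERACTIVE_ROLES: &[&str] = &[
    "button", "textfield", "checkbox", "link", "menuitem", "tab", "slider",
    "combobox", "treeitem", "cell", "radiobutton", "incrementor",
    "menubutton", "switch", "colorwell", "dockitem",
];

/// Simplified stand-in for AccessibilityNode (hypothetical field subset).
#[derive(Debug, Clone)]
pub struct Node {
    pub role: String,
    pub name: String,
    pub bounds_hash: u64,
    pub element_ref: Option<String>,
    pub children: Vec<Node>,
}

impl Node {
    pub fn new(role: &str, name: &str, bounds_hash: u64, children: Vec<Node>) -> Self {
        Node {
            role: role.to_string(),
            name: name.to_string(),
            bounds_hash,
            element_ref: None,
            children,
        }
    }
}

/// Stored identity of a ref (available_actions omitted in this sketch).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RefEntry {
    pub pid: u32,
    pub role: String,
    pub name: String,
    pub bounds_hash: u64,
}

/// Walk the tree depth-first (parent before children) and give each
/// interactive node the next ref: @e1, @e2, ...
pub fn assign_refs(root: &mut Node, pid: u32) -> HashMap<String, RefEntry> {
    let mut refs = HashMap::new();
    let mut next = 1usize;
    visit(root, pid, &mut next, &mut refs);
    refs
}

fn visit(node: &mut Node, pid: u32, next: &mut usize, refs: &mut HashMap<String, RefEntry>) {
    if INTERACTIVE_ROLES.contains(&node.role.as_str()) {
        let id = format!("@e{}", *next);
        *next += 1;
        refs.insert(
            id.clone(),
            RefEntry {
                pid,
                role: node.role.clone(),
                name: node.name.clone(),
                bounds_hash: node.bounds_hash,
            },
        );
        node.element_ref = Some(id);
    }
    for child in &mut node.children {
        visit(child, pid, next, refs);
    }
}

/// Find the first node in a fresh tree whose identity tuple matches the
/// stored entry. Returns None when nothing matches (a stale ref).
pub fn resolve<'a>(entry: &RefEntry, pid: u32, root: &'a Node) -> Option<&'a Node> {
    if entry.pid != pid {
        return None;
    }
    if root.role == entry.role && root.name == entry.name && root.bounds_hash == entry.bounds_hash {
        return Some(root);
    }
    root.children.iter().find_map(|child| resolve(entry, pid, child))
}
```

In this reading, a ref stays valid as long as an element with the same identity tuple exists in the new tree; if the element moved or was renamed, the lookup fails and the caller can report a stale ref.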
Phase Model
agent-desktop follows an additive phase model: Phases 2-4 add adapters, transports, and hardening. Nothing in core is rebuilt.
Dependencies
Core Dependencies (all platforms)
Platform-Specific (macOS)
Target-Gated
The binary crate uses [target.'cfg(target_os = "macos")'.dependencies] syntax to conditionally include platform crates.
Build Configuration
Optimized for small binary size (target: under 15 MB).
Next Steps
Development
Learn about build commands, testing, and contributing
Troubleshooting
Common issues and solutions