PinchTab

What is PinchTab?

PinchTab is a standalone HTTP server that gives AI agents direct control over a Chrome browser. It’s a 12MB Go binary that provides token-efficient browser automation via HTTP API or CLI.

Token-Efficient

About 800 tokens per page with text extraction, 5-13x cheaper than screenshots

Multi-Instance

Run multiple parallel Chrome processes with isolated profiles

Self-Contained

12MB binary, no external dependencies required

Accessibility-First

Stable element refs instead of fragile coordinates

Key Features

Control via command-line or HTTP API. Use curl, Python, Node.js, or any HTTP client.
# CLI
pinchtab instance launch

# HTTP API
curl -X POST http://localhost:9867/instances/launch
Run without a window (headless) or with visible Chrome (headed) for debugging.
# Headless (default, faster)
pinchtab instance launch --mode headless

# Headed (visible window)
pinchtab instance launch --mode headed
Browser profiles preserve cookies, sessions, and auth across instance restarts. Log in once, stay logged in.
# Create persistent profile
pinchtab profile create work

# Launch instance with profile
curl -X POST http://localhost:9867/instances/start \
  -H "Content-Type: application/json" \
  -d '{"profileId":"prof_278be873"}'
Run multiple instances in parallel with complete isolation: no shared state and no cookie leakage.
# Create 3 independent instances
for i in 1 2 3; do
  pinchtab instance launch --mode headless
done

# Each instance gets an auto-allocated port: 9868, 9869, 9870
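The sequential ports in the comment above (9868, 9869, 9870), together with the 9868-9968 range quoted elsewhere on this page, suggest a simple index-to-port mapping. The sketch below assumes ports are handed out strictly in launch order from the bottom of that range, which the page does not state outright. `nth_instance_port` is a hypothetical helper for illustration, not a PinchTab command.

```shell
#!/usr/bin/env bash
# Port range quoted on this page.
PORT_MIN=9868
PORT_MAX=9968

# Print the port the Nth instance (1-based) would get, ASSUMING sequential allocation.
nth_instance_port() {
  local n=$1
  local port=$(( PORT_MIN + n - 1 ))
  if (( n < 1 || port > PORT_MAX )); then
    echo "instance $n does not fit in ${PORT_MIN}-${PORT_MAX}" >&2
    return 1
  fi
  echo "$port"
}

for i in 1 2 3; do
  echo "instance $i -> port $(nth_instance_port "$i")"
done
echo "range capacity: $(( PORT_MAX - PORT_MIN + 1 )) ports"
```

In practice, read the actual port from the launch response rather than computing it, since allocation order may differ when instances are stopped and restarted.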

Why PinchTab?

For AI Agents

Designed specifically for AI-driven browser automation with token-efficient snapshots and stable element references.

For Developers

Simple HTTP API, CLI tools, and comprehensive documentation. Start automating in 5 minutes.

For Scale

Multi-instance architecture with auto-allocated ports (9868-9968, a range of 101 ports). Run many browsers in parallel.

For Security

Profile isolation, stealth mode, and secure token authentication for remote access.

How It Works

PinchTab consists of four core entities working together (orchestrator, instances, profiles, and tabs):
PinchTab Orchestrator (HTTP server on port 9867)

  ├── Instance 1 (inst_0a89a5bb, port 9868, temp profile)
  │     ├── Tab 1 (tab_xyz123, https://example.com)
  │     ├── Tab 2 (tab_xyz124, https://google.com)
  │     └── Tab 3 (tab_xyz125, https://github.com)

  ├── Instance 2 (inst_1b9a5dcc, port 9869, profile: work)
  │     ├── Tab 1 (tab_abc001, internal dashboard, logged in)
  │     └── Tab 2 (tab_abc002, internal docs)

  └── Instance 3 (inst_2c8a5eef, port 9870, profile: personal)
        ├── Tab 1 (tab_def001, gmail.com)
        └── Tab 2 (tab_def002, bank.com)
The orchestrator manages all instances and routes requests. Each instance is a separate Chrome process with an optional profile for persistent state. Each tab is a webpage you navigate and interact with.
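To make the routing idea concrete, the hierarchy in the diagram can be modeled as two lookup tables: tab to instance, and instance to port. This is only an illustration of the mental model using the sample IDs from the diagram. It is not how PinchTab is implemented internally, and `route_tab` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Requires bash 4+ (associative arrays). IDs and ports are copied from the diagram above.
declare -A TAB_TO_INSTANCE=(
  [tab_xyz123]=inst_0a89a5bb
  [tab_abc001]=inst_1b9a5dcc
  [tab_def002]=inst_2c8a5eef
)
declare -A INSTANCE_PORT=(
  [inst_0a89a5bb]=9868
  [inst_1b9a5dcc]=9869
  [inst_2c8a5eef]=9870
)

# Resolve the port of the instance that owns a given tab ID.
route_tab() {
  local tab=$1
  local inst=${TAB_TO_INSTANCE[$tab]:-}
  if [[ -z $inst ]]; then
    echo "unknown tab: $tab" >&2
    return 1
  fi
  echo "${INSTANCE_PORT[$inst]}"
}

route_tab tab_abc001
```

The point of the model is that a client only needs a tab ID: a request such as `/tabs/$TAB_ID/snapshot` goes to the orchestrator on port 9867, which finds the owning instance.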

Quick Example

Here’s a complete browser automation workflow in just a few commands:
1. Start the orchestrator

pinchtab
Expected output:
🦀 Pinchtab Dashboard port=9867
dashboard ready url=http://localhost:9867/dashboard
2. Create a Chrome instance

# Create headless instance
INST=$(pinchtab instance launch --mode headless | jq -r '.id')
echo "Instance: $INST"

# Wait for Chrome to initialize
sleep 2
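A fixed `sleep 2` may be too short on a slow machine and wasteful on a fast one. One alternative is a generic retry loop around a readiness check. The helper below is plain bash, not part of PinchTab, and the `probe` function is a stand-in that succeeds on its third call. This page does not document a readiness endpoint, so the real check is left to you.

```shell
#!/usr/bin/env bash
# Run a command until it succeeds, up to ATTEMPTS times, waiting DELAY seconds between tries.
retry_until() {
  local attempts=$1 delay=$2
  shift 2
  local i
  for (( i = 1; i <= attempts; i++ )); do
    if "$@"; then
      return 0
    fi
    # Skip the pause after the final failed attempt.
    if (( i < attempts )); then
      sleep "$delay"
    fi
  done
  return 1
}

# Demo with a stand-in probe that succeeds on the third call.
calls=0
probe() {
  calls=$(( calls + 1 ))
  (( calls >= 3 ))
}

if retry_until 5 0 probe; then
  echo "ready after $calls attempts"
fi
```

With a real check in place of `probe`, a call such as `retry_until 10 0.5 my_readiness_check "$INST"` would replace the fixed sleep. `my_readiness_check` is a placeholder name.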
3. Navigate to a website

# Create tab and navigate
TAB_ID=$(curl -s -X POST http://localhost:9867/instances/$INST/tabs/open \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com"}' | jq -r '.id')
4. Get page content

# Get page structure with interactive elements
curl -s "http://localhost:9867/tabs/$TAB_ID/snapshot" | jq '.nodes | map({ref, role, name})'

# Extract text content
curl -s "http://localhost:9867/tabs/$TAB_ID/text" | jq '.text'
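Once a snapshot is available, the next step is usually to pick the ref of one specific element. The sketch below looks up a ref by role and accessible name. It assumes the snapshot has the shape implied by the jq filter above (a `nodes` array of objects with `ref`, `role`, and `name`). The sample values and the `find_ref` helper are invented for illustration.

```shell
#!/usr/bin/env bash
# Sample snapshot: field names follow the jq filter above, values are made up.
SAMPLE_SNAPSHOT='{"nodes":[
  {"ref":"e3","role":"textbox","name":"Email"},
  {"ref":"e5","role":"button","name":"Submit"}
]}'

# Read a snapshot on stdin and print the ref of the first node with the given role and name.
# Prints nothing when no node matches.
find_ref() {
  local role=$1 name=$2
  jq -r --arg role "$role" --arg name "$name" \
    'first(.nodes[] | select(.role == $role and .name == $name) | .ref)'
}

printf '%s' "$SAMPLE_SNAPSHOT" | find_ref button Submit
```

Against a live tab, the same function could be fed from `curl -s "http://localhost:9867/tabs/$TAB_ID/snapshot" | find_ref button Submit`, provided the real response matches the assumed shape.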
5. Interact with the page

# Click a button (ref from snapshot)
curl -X POST http://localhost:9867/tabs/$TAB_ID/action \
  -H "Content-Type: application/json" \
  -d '{"kind":"click","ref":"e5"}'

# Fill an input field
curl -X POST http://localhost:9867/tabs/$TAB_ID/action \
  -H "Content-Type: application/json" \
  -d '{"kind":"fill","ref":"e3","text":"user@example.com"}'
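Hand-written JSON in `-d '...'` breaks as soon as the text to type contains a quote or backslash. A small helper that lets jq do the escaping avoids that. The payload keys (`kind`, `ref`, `text`) are the ones used in the commands above. `action_payload` itself is a hypothetical convenience function.

```shell
#!/usr/bin/env bash
# Build an action payload as compact JSON.
# The "text" key is included only when a third argument is supplied.
action_payload() {
  local kind=$1 ref=$2
  if (( $# >= 3 )); then
    jq -cn --arg kind "$kind" --arg ref "$ref" --arg text "$3" \
      '{kind: $kind, ref: $ref, text: $text}'
  else
    jq -cn --arg kind "$kind" --arg ref "$ref" \
      '{kind: $kind, ref: $ref}'
  fi
}

action_payload click e5
action_payload fill e3 'She said "hi"'
```

The result can be passed straight to curl, for example `curl -X POST "http://localhost:9867/tabs/$TAB_ID/action" -H "Content-Type: application/json" -d "$(action_payload click e5)"`.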
Check out the Quick Start guide to get PinchTab running in 5 minutes, or explore Core Concepts to understand the mental model.

Use Cases

AI Agent Automation

Build AI agents that can browse, click, fill forms, and extract data from any website.

Web Scraping

Extract text efficiently (~800 tokens/page) instead of expensive screenshots (10,000+ tokens).

End-to-End Testing

Automate browser testing with stable element references and multi-instance parallelization.

Multi-Account Management

Manage multiple user sessions with isolated profiles and persistent authentication.

Comparison

| Feature | PinchTab | Puppeteer | Playwright | Selenium |
| --- | --- | --- | --- | --- |
| Token efficiency | ✅ 800 tokens/page | ❌ 10k+ tokens | ❌ 10k+ tokens | ❌ 10k+ tokens |
| HTTP API | ✅ Native | ❌ Library only | ❌ Library only | ❌ Library only |
| Multi-instance | ✅ Auto-managed | ⚠️ Manual | ⚠️ Manual | ⚠️ Manual |
| Persistent profiles | ✅ Built-in | ⚠️ Manual | ⚠️ Manual | ⚠️ Manual |
| Binary size | ✅ 12MB | ❌ Node deps | ❌ Node deps | ❌ Java deps |
| Stealth mode | ✅ Built-in | ⚠️ Plugins | ✅ Built-in | ❌ Not available |
| Element stability | ✅ Accessibility refs | ❌ Selectors | ❌ Selectors | ❌ XPath |
PinchTab is specifically designed for AI agents and token efficiency, making it ideal for LLM-driven automation workflows.
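The token figures in the comparison can be turned into a quick estimate. The sketch below uses the two per-page numbers quoted on this page (about 800 tokens for text, 10,000 as the lower bound for a screenshot). The page count of 50 is an arbitrary example, and real costs depend on the pages involved.

```shell
#!/usr/bin/env bash
TEXT_TOKENS_PER_PAGE=800          # figure quoted on this page
SCREENSHOT_TOKENS_PER_PAGE=10000  # lower bound of the "10k+" figure

# Total tokens for PAGES pages at a given per-page cost.
total_tokens() {
  echo $(( $1 * $2 ))
}

# Ratio of two costs with one decimal place (truncated, not rounded), integer math only.
cost_ratio() {
  local tenths=$(( $1 * 10 / $2 ))
  printf '%d.%d\n' $(( tenths / 10 )) $(( tenths % 10 ))
}

PAGES=50  # hypothetical crawl size
echo "text:        $(total_tokens "$PAGES" "$TEXT_TOKENS_PER_PAGE") tokens"
echo "screenshots: $(total_tokens "$PAGES" "$SCREENSHOT_TOKENS_PER_PAGE") tokens"
echo "ratio:       $(cost_ratio "$SCREENSHOT_TOKENS_PER_PAGE" "$TEXT_TOKENS_PER_PAGE")x"
```

At these figures the ratio is 12.5x, which sits near the top of the 5-13x range claimed earlier on the page.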

Next Steps

Quick Start

Get PinchTab running in 5 minutes with installation and first commands

Core Concepts

Understand orchestrator, instances, profiles, and tabs

API Reference

Complete HTTP API documentation with all endpoints

Examples

Real-world examples and code snippets

Community & Support

GitHub

Star the repo, report issues, contribute code

Discussions

Ask questions, share use cases, get help
