Kortix agents can control a real browser using natural language commands. This enables agents to interact with any website just like a human would: clicking buttons, filling forms, scrolling pages, and extracting structured data.

The browser automation capability is powered by Stagehand, running in a sandboxed environment with full visual feedback through screenshots.
Performs any browser action using natural language descriptions:

browser_act(action="click the login button")
browser_act(action="fill in email with [email protected]")
browser_act(action="scroll down")
browser_act(action="select 'Premium' from the dropdown")
# Navigate to login page
browser_navigate_to(url="https://app.example.com/login")

# Fill in credentials
browser_act(action="click the email field")
browser_act(action="type [email protected]")
browser_act(action="click the password field")
browser_act(
    action="type %password%",
    variables={"password": "secure_pass"}
)

# Submit form
browser_act(action="click the Sign In button")
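The login example passes the password through `variables` instead of writing it into the action text. The source does not show how the `%password%` placeholder is resolved, so the following is only a minimal sketch that assumes plain textual substitution; `substitute_variables` is a hypothetical helper for illustration, not part of the Kortix or Stagehand API.

```python
import re

# Matches placeholders of the form %name% (letters, digits, underscore).
_PLACEHOLDER = re.compile(r"%(\w+)%")


def substitute_variables(action: str, variables: dict) -> str:
    """Hypothetical helper: replace each %name% with variables[name].

    Raises KeyError if the action references a placeholder that has
    no supplied value, so a typo does not end up typed into a form.
    """
    def replace(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value supplied for placeholder %{name}%")
        return str(variables[name])

    return _PLACEHOLDER.sub(replace, action)


resolved = substitute_variables("type %password%", {"password": "secure_pass"})
print(resolved)  # type secure_pass
```

Keeping the secret in `variables` means the action template itself ("type %password%") can be logged or shown in a transcript without exposing the value.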
# Navigate to product page
browser_navigate_to(url="https://shop.example.com/products")

# Scroll to load all products
browser_act(action="scroll to bottom")

# Extract product data
result = browser_extract_content(
    instruction="extract all products with name, price, and rating"
)
# Research a company website
browser_navigate_to(url="https://example.io")

# Browse key pages
browser_act(action="click the Features link")
features = browser_extract_content(instruction="extract feature descriptions")

browser_act(action="click Pricing")
pricing = browser_extract_content(instruction="get pricing tiers and costs")

browser_act(action="click About Us")
company_info = browser_extract_content(instruction="extract company mission and team size")
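The research workflow above repeats the same click-then-extract pair for each page. One way to express that pattern is a table-driven loop. This is a sketch under stated assumptions: `research_site` is a hypothetical helper, and the `act` and `extract` callables stand in for `browser_act` and `browser_extract_content`, which are agent tools rather than importable Python functions. The fakes at the bottom exist only so the sketch can run outside a sandbox.

```python
def research_site(act, extract, steps):
    """Run (key, click_action, extract_instruction) steps in order.

    `act` and `extract` are injected so the same loop can be driven by
    real browser tools or by fakes in a dry run.
    """
    results = {}
    for key, click_action, instruction in steps:
        act(action=click_action)
        results[key] = extract(instruction=instruction)
    return results


# Steps taken from the workflow above.
STEPS = [
    ("features", "click the Features link", "extract feature descriptions"),
    ("pricing", "click Pricing", "get pricing tiers and costs"),
    ("company_info", "click About Us", "extract company mission and team size"),
]

# Dry run with fakes that only record what would have been called.
calls = []


def fake_act(action):
    calls.append(("act", action))


def fake_extract(instruction):
    calls.append(("extract", instruction))
    return f"<result of: {instruction}>"


report = research_site(fake_act, fake_extract, STEPS)
print(sorted(report))  # ['company_info', 'features', 'pricing']
```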
@tool_metadata(
    display_name="Browser",
    description="Interact with web pages using mouse and keyboard, take screenshots, and extract content",
    icon="Globe",
    color="bg-cyan-100 dark:bg-cyan-800/50"
)
class BrowserTool(SandboxToolsBase):
    """
    Browser Tool for browser automation using local Stagehand API.

    Only 4 core functions that can handle everything:
    - browser_navigate_to: Navigate to URLs
    - browser_act: Perform any action (click, type, scroll, dropdowns etc.)
    - browser_extract_content: Extract content from pages
    - browser_screenshot: Take screenshots
    """
Once content is extracted, use it as the primary source:
# Extract content once
product_data = browser_extract_content(
    instruction="get product information"
)

# ✅ Use extracted data for deliverables
# ❌ Don't override with web search results
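The rule above can be made concrete when both sources are available as dictionaries. The shape of the value returned by `browser_extract_content` is not specified in the source, so this sketch assumes a flat dict per product; `merge_with_priority` is a hypothetical helper and the field values are made-up sample data.

```python
def merge_with_priority(extracted: dict, fallback: dict) -> dict:
    """Combine two records, letting browser-extracted values win.

    Fallback values (for example from a web search) only fill fields
    that the extraction left missing or set to None.
    """
    merged = dict(fallback)
    merged.update({k: v for k, v in extracted.items() if v is not None})
    return merged


# Assumed sample data, for illustration only.
from_browser = {"name": "Widget", "price": "$49", "rating": None}
from_search = {"price": "$45", "rating": "4.5"}

final = merge_with_priority(from_browser, from_search)
print(final["price"])   # $49
print(final["rating"])  # 4.5
```

The extracted price is kept even though the search result disagrees; the search rating is used only because the extraction had none.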