Overview
The Vimbot class provides a high-level interface for autonomous web browsing using Playwright with the Vimium extension. It handles browser initialization, navigation, keyboard interactions, and screenshot capture for vision-based automation.
Class: Vimbot
Constructor
Whether to run the browser in headless mode. Set to True to run without a visible browser window.
- Launches a persistent Chromium context with the Vimium extension loaded from ./vimium-master
- Creates a new page with viewport size of 1080x720
- Ignores HTTPS errors for flexibility in browsing
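The constructor steps above could be sketched roughly as follows. This is a minimal illustration, not the class's actual source: the helper names `build_launch_kwargs` and `open_page`, the parameter name `headless`, and the use of an empty profile directory are assumptions. The Chromium flags are the ones commonly used to load an unpacked extension.

```python
from typing import Any, Dict

VIMIUM_PATH = "./vimium-master"  # extension location named in this reference


def build_launch_kwargs(headless: bool = False) -> Dict[str, Any]:
    """Build keyword arguments for launch_persistent_context (hypothetical helper)."""
    return {
        "headless": headless,
        "args": [
            # Allow only the Vimium extension and load it from disk.
            f"--disable-extensions-except={VIMIUM_PATH}",
            f"--load-extension={VIMIUM_PATH}",
        ],
        "ignore_https_errors": True,
    }


def open_page(headless: bool = False):
    """Start Chromium with Vimium and return (playwright, context, page)."""
    # Imported lazily so build_launch_kwargs works without Playwright installed.
    from playwright.sync_api import sync_playwright

    playwright = sync_playwright().start()
    # Assumption: an empty user_data_dir, so Playwright uses a temporary profile.
    context = playwright.chromium.launch_persistent_context(
        "", **build_launch_kwargs(headless)
    )
    page = context.new_page()
    page.set_viewport_size({"width": 1080, "height": 720})
    return playwright, context, page
```

Keeping the launch options in a separate function makes them easy to inspect or adjust without starting a browser.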
Methods
navigate()
The URL to navigate to. If the URL doesn’t contain ://, https:// is prepended automatically.
type()
The text to type into the active input field.
- Waits 1 second before typing
- Types the text character by character
- Automatically presses Enter after typing
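The behavior listed for type() might look like the sketch below. The function name `type_text` and the `delay_seconds` parameter are illustrative assumptions. `page` is expected to be a Playwright page, whose `keyboard.type` sends key events for each character in turn.

```python
import time


def type_text(page, text: str, delay_seconds: float = 1.0) -> None:
    """Wait briefly, type the text into the focused field, then press Enter."""
    # Pause so the page (or a freshly clicked input) has time to take focus.
    time.sleep(delay_seconds)
    # Playwright emits keydown/keyup events for each character of the string.
    page.keyboard.type(text)
    # Submit the input.
    page.keyboard.press("Enter")
```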
click()
The Vimium hint characters (1-2 letter sequence from yellow boxes) to click on.
capture()
A PIL Image object in RGB format containing the screenshot with Vimium hints displayed.
- Presses Escape to ensure clean state
- Types “f” to activate Vimium’s hint mode (shows yellow boxes with letter sequences)
- Takes and returns a screenshot as a PIL Image
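As a rough sketch of the capture sequence above, together with how click() can then use the hint letters, the following assumes a Playwright page and the Pillow library. The helper name `show_hints` is hypothetical, and the real class may structure this differently.

```python
import io


def show_hints(page) -> None:
    """Reset Vimium's state, then open hint mode."""
    page.keyboard.press("Escape")  # leave any active mode or focused input
    page.keyboard.type("f")        # Vimium's key for showing link hints


def capture(page):
    """Return a screenshot with hints visible as an RGB PIL Image."""
    # Imported lazily; Pillow is only needed when a screenshot is taken.
    from PIL import Image

    show_hints(page)
    png_bytes = page.screenshot()  # Playwright returns the image as bytes
    return Image.open(io.BytesIO(png_bytes)).convert("RGB")


def click(page, hint: str) -> None:
    """Activate the element labelled with the given hint letters."""
    page.keyboard.type(hint)
```

Because the hints only exist while hint mode is open, click() is meant to be called after capture() with letters read from that screenshot.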
perform_action()
Returns True if the action contains a “done” key, indicating task completion. Otherwise returns None.
- {"done": True} - Signals completion
- {"navigate": "url"} - Navigates to URL
- {"type": "text"} - Types text
- {"click": "ab"} - Clicks element
- {"click": "ab", "type": "text"} - Clicks element then types text
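One way to read the action formats above is as a small dispatcher. The sketch below is written as a free function taking any object with navigate, type, and click methods. That shape is an assumption for illustration, since the actual method lives on Vimbot, and the order of the checks is one plausible choice.

```python
from typing import Any, Dict, Optional


def perform_action(bot: Any, action: Dict[str, Any]) -> Optional[bool]:
    """Dispatch one action dict; return True only when the task is done."""
    if "done" in action:
        return True
    if "click" in action and "type" in action:
        # Combined form: focus the element first, then type into it.
        bot.click(action["click"])
        bot.type(action["type"])
    elif "navigate" in action:
        bot.navigate(action["navigate"])
    elif "type" in action:
        bot.type(action["type"])
    elif "click" in action:
        bot.click(action["click"])
    return None
```

A caller can loop over actions and stop as soon as the function returns True.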
Configuration
Vimium path
The Vimbot class expects the Vimium extension to be located at ./vimium-master relative to the working directory. Ensure you have downloaded and extracted the Vimium extension to this location.
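A quick pre-flight check can catch a missing or misplaced extension before the browser starts. The helper below is a hypothetical convenience, not part of Vimbot. It relies only on the fact that an unpacked Chrome extension has a manifest.json in its root directory.

```python
from pathlib import Path


def vimium_is_present(root: str = "./vimium-master") -> bool:
    """Return True if the directory looks like an unpacked extension."""
    # Relative paths are resolved against the current working directory.
    return (Path(root) / "manifest.json").is_file()
```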
Browser settings
- Viewport size: 1080x720 pixels
- Browser: Chromium (via Playwright)
- Extensions: Vimium for keyboard navigation
- HTTPS errors: Ignored for flexibility
- Timeout: 60 seconds for navigation
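The navigation timeout listed above, combined with the URL handling described for navigate(), could be sketched like this. The names `normalize_url` and `NAVIGATION_TIMEOUT_MS` are illustrative assumptions. `page.goto` is Playwright's navigation call, which takes its timeout in milliseconds.

```python
NAVIGATION_TIMEOUT_MS = 60_000  # 60 seconds, expressed in milliseconds


def normalize_url(url: str) -> str:
    """Prepend https:// when the URL has no scheme separator."""
    return url if "://" in url else "https://" + url


def navigate(page, url: str) -> None:
    """Open the URL, giving up after the configured timeout."""
    page.goto(normalize_url(url), timeout=NAVIGATION_TIMEOUT_MS)
```

With this sketch, `normalize_url("example.com")` yields `https://example.com`, while a URL that already contains `://` is passed through unchanged.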