Browser Automation
OrcBot’s browser automation system provides production-ready web scraping and interaction capabilities with built-in bot-detection evasion, CAPTCHA solving, and vision-based navigation.
Architecture
Browser Engines
OrcBot supports three browser engines:
- Puppeteer (default) - Fast, reliable Chromium automation
- Playwright - Advanced Chromium with better stealth
- Lightpanda - Lightweight headless browser (experimental)
Switch between engines with switch_browser_engine.
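The engine choice can be modeled as a small validated setting. The sketch below is illustrative only: `EngineConfig` and `resolveEngine` are hypothetical names, not part of OrcBot's API. It assumes Puppeteer is the fallback (the documented default) and that only Lightpanda needs a CDP endpoint, using the default ws://127.0.0.1:9222 documented under switch_browser_engine.

```typescript
// Hypothetical types for illustration; OrcBot's internal representation may differ.
type Engine = "puppeteer" | "playwright" | "lightpanda";

interface EngineConfig {
  engine: Engine;
  // Only set for Lightpanda, which is reached over a CDP WebSocket.
  cdpEndpoint?: string;
}

const ENGINES: readonly Engine[] = ["puppeteer", "playwright", "lightpanda"];
const DEFAULT_LIGHTPANDA_ENDPOINT = "ws://127.0.0.1:9222";

function resolveEngine(name?: string, cdpEndpoint?: string): EngineConfig {
  // Fall back to the documented default engine when none is requested.
  const requested = (name ?? "puppeteer").toLowerCase();
  const engine = ENGINES.find((e) => e === requested);
  if (engine === undefined) {
    throw new Error(`Unknown browser engine: ${name}`);
  }
  if (engine === "lightpanda") {
    return { engine, cdpEndpoint: cdpEndpoint ?? DEFAULT_LIGHTPANDA_ENDPOINT };
  }
  return { engine };
}
```

Rejecting unknown names up front keeps a typo from silently falling back to the default engine.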
Stealth Features
All browser operations include:
- Automation flag removal via --disable-blink-features=AutomationControlled
- Realistic user agent strings (desktop or mobile)
- Persistent profiles with cookies and localStorage
- Resource blocking (ads, trackers, analytics)
- Console log capture for debugging
- Blank page detection and recovery
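As a rough sketch of how the first few items could be wired together, the code below builds launch options, picks a user agent, and decides whether a request host should be blocked. Everything here is an assumption for illustration: the `LaunchSketch` shape only mirrors a subset of the options that Puppeteer's launch call accepts, and the user agent strings and blocklist entries are example values, not OrcBot's.

```typescript
// Hypothetical option shape; it mirrors a subset of what Puppeteer's launch() accepts.
interface LaunchSketch {
  headless: boolean;
  args: string[];
  userDataDir: string;
}

// Illustrative user agent strings, not values taken from OrcBot.
const DESKTOP_UA =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36";
const MOBILE_UA =
  "Mozilla/5.0 (iPhone; CPU iPhone OS 16_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Mobile/15E148 Safari/604.1";

// Example blocklist; a real one would be much longer.
const BLOCKED_HOSTS = ["doubleclick.net", "google-analytics.com", "googletagmanager.com"];

function buildLaunchOptions(profileDir: string, headless: boolean = true): LaunchSketch {
  return {
    headless,
    // Chromium flag that disables the Blink feature advertising automation control.
    args: ["--disable-blink-features=AutomationControlled"],
    // Reusing the same directory is what lets cookies and localStorage persist.
    userDataDir: profileDir,
  };
}

function pickUserAgent(mobile: boolean): string {
  return mobile ? MOBILE_UA : DESKTOP_UA;
}

// Decide whether a request to this hostname should be aborted as an ad or tracker.
function shouldBlockHost(hostname: string): boolean {
  return BLOCKED_HOSTS.some((h) => hostname === h || hostname.endsWith("." + h));
}
```

Matching on the hostname suffix rather than a substring avoids blocking unrelated sites whose URLs merely contain a tracker's name.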
Persistent Profiles
Browser profiles are stored in ~/.orcbot/browser-profiles/ (Playwright) or ~/.orcbot/puppeteer-profiles/ (Puppeteer). Profiles persist:
- Cookies and session storage
- localStorage data
- Browser history
- Cached resources
Use switch_browser_profile to manage multiple identities.
Core Navigation Skills
browser_navigate
Navigate to a URL and extract a semantic snapshot. See Web Search for full documentation.
browser_examine_page
Get a semantic snapshot of the current page without navigating. Parameters: None. Return Value: Semantic snapshot identical to browser_navigate output, but for the current page.
Example Usage:
- isDeep: true
- isResearch: false
browser_back
Navigate back to the previous page in browser history. Parameters: None. Return Value: Confirmation message and new page URL. Example Usage:
Interaction Skills
browser_click
Click an element by CSS selector or reference number. Parameters:
- CSS selector (e.g., "#submit-btn") or reference number from snapshot (e.g., 3)
browser_type
Type text into an input field. Parameters:
- CSS selector or reference number for the input element
- Text to type
- Use slow typing (100ms delay per character) to avoid bot detection
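Both browser_click and browser_type accept either a CSS selector or a reference number from the snapshot. The sketch below shows one way such a target could be resolved, along with the timing that slow typing implies. The names `resolveTarget`, `RefIndex`, and `typingSchedule` are hypothetical, and the idea that a snapshot keeps a map from reference number to selector is an assumption, since this section does not describe how references are stored.

```typescript
// A target is either a CSS selector or a reference number from the latest snapshot.
type Target = string | number;

// Hypothetical snapshot index: reference number -> selector for that element.
type RefIndex = Map<number, string>;

function resolveTarget(target: Target, refs: RefIndex): string {
  if (typeof target === "number") {
    const selector = refs.get(target);
    if (selector === undefined) {
      throw new Error(`Unknown reference number: ${target}`);
    }
    return selector;
  }
  // Strings are treated as CSS selectors and passed through unchanged.
  return target;
}

// Compute when each character would be sent: 100 ms apart in slow mode,
// back to back otherwise.
function typingSchedule(text: string, slow: boolean): Array<{ char: string; atMs: number }> {
  const delayMs = slow ? 100 : 0;
  return Array.from(text).map((char, i) => ({ char, atMs: i * delayMs }));
}
```

Under these assumptions, a reference number that no longer exists in the snapshot fails loudly instead of clicking the wrong element.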
browser_press
Press a keyboard key or key combination. Parameters:
- Key name or combination (e.g., "Enter", "Control+C", "Alt+Tab")
  - Single keys: Enter, Escape, Tab, Backspace, Delete, ArrowUp, etc.
  - Combinations: Control+C, Meta+V, Shift+Enter
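To make the combination format concrete, here is a minimal sketch of splitting a string such as "Control+C" into held modifiers and a final key. `parseKeyCombo` and `KeyCombo` are hypothetical names, and the modifier list is an assumption drawn only from the examples above; OrcBot's own parsing may accept more or behave differently.

```typescript
interface KeyCombo {
  modifiers: string[]; // keys held down while the final key is pressed
  key: string; // the key that is actually pressed
}

// Modifier names as they appear in the examples above.
const MODIFIERS = ["Control", "Shift", "Alt", "Meta"];

function parseKeyCombo(input: string): KeyCombo {
  const parts = input.split("+").map((p) => p.trim());
  // An empty segment means input like "" or "Control+", which this sketch rejects.
  if (parts.some((p) => p.length === 0)) {
    throw new Error(`Malformed key combination: "${input}"`);
  }
  const key = parts[parts.length - 1];
  const modifiers = parts.slice(0, -1);
  const unknown = modifiers.find((m) => MODIFIERS.indexOf(m) === -1);
  if (unknown !== undefined) {
    throw new Error(`Unsupported modifier: ${unknown}`);
  }
  return { modifiers, key };
}
```

One limitation of this sketch: a literal "+" key cannot be expressed, because "+" is always treated as the separator.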
browser_hover
Hover over an element to trigger menus or tooltips. Parameters:
- CSS selector for the element to hover
browser_select
Select an option in a dropdown by visible label. Parameters:
- CSS selector for the <select> element
- Visible label text of the option to select
browser_scroll
Scroll the page up or down. Parameters:
- "up" or "down"
- Pixels to scroll
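The two parameters reduce to a single signed vertical offset. The sketch below shows that conversion; `scrollOffset` is a hypothetical helper, and the assumption is that the result would be handed to something like the DOM's `window.scrollBy(0, offset)`, where positive values scroll down.

```typescript
type ScrollDirection = "up" | "down";

// Convert a direction and pixel count into a signed vertical offset.
// Positive scrolls down, negative scrolls up.
function scrollOffset(direction: ScrollDirection, pixels: number): number {
  if (!Number.isFinite(pixels) || pixels < 0) {
    throw new Error(`Pixels must be a non-negative number, got ${pixels}`);
  }
  return direction === "down" ? pixels : -pixels;
}
```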
Waiting & Timing
browser_wait
Wait for a specified duration. Parameters:
- Milliseconds to wait
browser_wait_for
Wait for an element to appear on the page. Parameters:
- CSS selector to wait for
- Timeout in milliseconds
Visual & Analysis Skills
browser_screenshot
Capture a screenshot of the current page. Parameters:
- Capture entire page or just viewport
browser_vision
Analyze the current page using AI vision (GPT-4 Vision or Gemini). Parameters:
- Optional question or instruction. Defaults to “Describe what you see on this page.”
browser_solve_captcha
Attempt to solve a detected CAPTCHA automatically. Parameters: None. Requires captchaApiKey (2captcha) in config.
Return Value: Success or error message.
Example Usage:
CAPTCHA solving requires an API key from 2captcha.com and can take 20-60 seconds.
Advanced Skills
browser_run_js
Execute custom JavaScript on the current page. Parameters:
- JavaScript code to execute. Returns the result of the expression.
browser_run_script
Execute custom Puppeteer/Playwright code with access to page and browser objects. (Admin only)
Parameters:
- JavaScript code with access to page and browser variables
Profile & Engine Management
switch_browser_profile
Switch to a different persistent browser profile. Parameters:
- Name of the profile to switch to (created if it doesn’t exist)
- Optional custom directory for profiles
Use cases:
- Manage multiple logged-in sessions
- Isolate scraping tasks
- Test with different browser states
- Bypass rate limits (different cookies/fingerprints)
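A profile switch ultimately comes down to choosing a directory on disk. The sketch below shows one plausible mapping from a profile name to a path, using the base directories listed under Persistent Profiles. It rests on assumptions this section does not confirm: that each named profile lives in its own subdirectory, that a custom directory replaces the base entirely, and that names are restricted to safe characters. `resolveProfileDir` is a hypothetical helper, not an OrcBot function.

```typescript
type ProfileEngine = "puppeteer" | "playwright";

// Base directories, relative to the home directory, as documented under Persistent Profiles.
const PROFILE_ROOTS: Record<ProfileEngine, string> = {
  playwright: ".orcbot/browser-profiles",
  puppeteer: ".orcbot/puppeteer-profiles",
};

// Assumption: each named profile lives in its own subdirectory of the base directory.
function resolveProfileDir(
  engine: ProfileEngine,
  profileName: string,
  homeDir: string,
  customDir?: string
): string {
  // Restrict names to safe characters so a name cannot escape the base directory.
  if (!/^[A-Za-z0-9_-]+$/.test(profileName)) {
    throw new Error(`Invalid profile name: "${profileName}"`);
  }
  const base = customDir ?? `${homeDir}/${PROFILE_ROOTS[engine]}`;
  return `${base}/${profileName}`;
}
```

Keeping one directory per profile is what isolates cookies and localStorage between identities.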
switch_browser_engine
Switch between Puppeteer, Playwright, and Lightpanda. Parameters:
- "puppeteer", "playwright", or "lightpanda"
- For Lightpanda: CDP endpoint (default: ws://127.0.0.1:9222)
Mobile Viewport
Switch between desktop and mobile viewports:
- Size: 375x812 (iPhone 13)
- User agent: iOS Safari 16.6
- Device scale factor: 2x
- Touch events enabled
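The mobile settings above map naturally onto a viewport object. In the sketch below, the field names follow the viewport options used by Puppeteer (width, height, deviceScaleFactor, isMobile, hasTouch), and the mobile values are the ones listed above. The desktop values are placeholders, since this section does not state them, and `viewportFor` is a hypothetical helper.

```typescript
// Field names follow Puppeteer's viewport options; the interface itself is illustrative.
interface ViewportSketch {
  width: number;
  height: number;
  deviceScaleFactor: number;
  isMobile: boolean;
  hasTouch: boolean;
}

// Values taken from the mobile settings listed above.
const MOBILE_VIEWPORT: ViewportSketch = {
  width: 375,
  height: 812,
  deviceScaleFactor: 2,
  isMobile: true,
  hasTouch: true,
};

// Placeholder desktop values; the document does not specify them.
const DESKTOP_VIEWPORT: ViewportSketch = {
  width: 1280,
  height: 800,
  deviceScaleFactor: 1,
  isMobile: false,
  hasTouch: false,
};

function viewportFor(mobile: boolean): ViewportSketch {
  return mobile ? MOBILE_VIEWPORT : DESKTOP_VIEWPORT;
}
```

The viewport is usually paired with a matching user agent, because a mobile-sized window that reports a desktop browser is an easy inconsistency for a site to spot.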
State Tracking
The browser maintains state across skills:
- Last navigated URL: Tracks the current page
- Blank page counter: Auto-recovers from blank pages (max 3 attempts)
- Console logs: Captures JavaScript errors and warnings
- Intercepted APIs: Records XHR/fetch requests (when enabled)
Circuit Breaker Pattern
The browser implements automatic loop prevention:
- Tracks consecutive blank page loads per domain
- After 3 blank pages from the same domain, switches to headful mode
- Clears counter after successful navigation
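The three rules above can be sketched as a small per-domain counter. This is a minimal illustration under stated assumptions, not OrcBot's implementation: `BlankPageBreaker` is a hypothetical class, the threshold of 3 comes from the list above, and clearing only the affected domain's counter on success is a guess at what "clears counter" means.

```typescript
class BlankPageBreaker {
  // Consecutive blank loads seen so far, keyed by domain.
  private readonly counts = new Map<string, number>();
  private readonly threshold: number;

  constructor(threshold: number = 3) {
    this.threshold = threshold;
  }

  // Record a blank load. Returns true once the domain reaches the threshold,
  // which is the signal to switch to headful mode.
  recordBlank(domain: string): boolean {
    const next = (this.counts.get(domain) ?? 0) + 1;
    this.counts.set(domain, next);
    return next >= this.threshold;
  }

  // A successful navigation resets the streak for that domain.
  recordSuccess(domain: string): void {
    this.counts.delete(domain);
  }
}
```

Counting per domain means one misbehaving site does not push unrelated navigations into headful mode.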
Best Practices
Troubleshooting
“Element not found”
- Cause: Selector doesn’t match or element hasn’t loaded yet
- Fix: Use browser_examine_page to verify the selector, then browser_wait_for before clicking
“Blank page detected”
- Cause: Site blocked automation or JavaScript failed to render
- Fix: Retry with headless: false or switch to a fresh profile
“CAPTCHA blocked navigation”
- Cause: Site detected automation
- Fix: Configure captchaApiKey or use switch_browser_profile to switch to a clean profile
“Browser crashed”
- Cause: Out of memory or GPU issues
- Fix: Check browser args in config, disable extensions, or restart OrcBot