Hermes can control a web browser to navigate pages, fill forms, click buttons, and extract information. Combined with Browserbase's stealth features, it can access sites that block typical automation.
What Browser Automation Enables
- Research: Browse documentation, search results, API references
- Data extraction: Scrape product prices, job listings, social media
- Form filling: Submit applications, configure settings, upload files
- Testing: Verify web applications, check deployments
- Authentication: Log into services, manage accounts
How It Works
User: "Check the current Bitcoin price on CoinGecko"
Hermes:
1. browser_navigate → coingecko.com
2. browser_snapshot → captures screenshot
3. vision_analyze → extracts price from image
4. Returns: "Bitcoin is currently $67,423"
The agent sees the page through screenshots and uses vision models to understand content.
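The navigate, snapshot, analyze sequence above can be sketched in Python. This is only an illustration of the control flow: the `BrowserTools` container, the `answer_from_page` helper, and the callable signatures are hypothetical stand-ins, not Hermes's actual API, and the tools are stubbed so the sketch runs without a browser.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class BrowserTools:
    """Hypothetical stand-ins for the agent's tools (not the real Hermes API)."""
    navigate: Callable[[str], None]
    snapshot: Callable[[], bytes]
    vision_analyze: Callable[[bytes, str], str]


def answer_from_page(tools: BrowserTools, url: str, question: str) -> str:
    tools.navigate(url)                            # 1. load the page
    image = tools.snapshot()                       # 2. capture what the page looks like
    return tools.vision_analyze(image, question)   # 3. let a vision model read it


# Stubbed tools: they record calls and return canned data.
calls: List[Tuple[str, str]] = []
stub = BrowserTools(
    navigate=lambda url: calls.append(("navigate", url)),
    snapshot=lambda: b"fake-png-bytes",
    vision_analyze=lambda image, q: f"(answer to {q!r} from {len(image)} bytes)",
)

result = answer_from_page(stub, "https://coingecko.com", "What is the Bitcoin price?")
print(result)
```

In practice the agent chooses and orders these tool calls itself; the sketch only shows that each step consumes the output of the previous one.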
Setting Up Browserbase
Browserbase provides cloud browsers with anti-detection features.
1. Get credentials:
# Sign up at browserbase.com
# Get API key and project ID from dashboard
2. Add to .env:
BROWSERBASE_API_KEY=bb_live_xxx
BROWSERBASE_PROJECT_ID=proj_xxx
3. Enable stealth features:
BROWSERBASE_PROXIES=true # Residential IPs
BROWSERBASE_ADVANCED_STEALTH=true # Scale plan only
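Before starting a session it can help to confirm that these settings are present. The sketch below uses the variable names from the steps above; the `load_browserbase_config` helper and its treatment of "true"-like values are assumptions made for illustration, not necessarily how Hermes parses them.

```python
import os
from typing import Mapping

# Assumption: these strings count as "enabled"; Hermes may parse flags differently.
TRUTHY = {"1", "true", "yes", "on"}


def load_browserbase_config(env: Mapping[str, str] = os.environ) -> dict:
    """Collect Browserbase settings and fail early if credentials are missing."""
    required = ("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))

    def flag(name: str) -> bool:
        return env.get(name, "").strip().lower() in TRUTHY

    return {
        "api_key": env["BROWSERBASE_API_KEY"],
        "project_id": env["BROWSERBASE_PROJECT_ID"],
        "proxies": flag("BROWSERBASE_PROXIES"),
        "advanced_stealth": flag("BROWSERBASE_ADVANCED_STEALTH"),
    }


# Example with an explicit mapping instead of the real environment.
cfg = load_browserbase_config({
    "BROWSERBASE_API_KEY": "bb_live_xxx",
    "BROWSERBASE_PROJECT_ID": "proj_xxx",
    "BROWSERBASE_PROXIES": "true",
})
print(cfg["proxies"], cfg["advanced_stealth"])
```

Passing a mapping rather than reading `os.environ` directly keeps the check easy to test.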
Browser Tools
| Tool | Purpose |
|---|---|
| browser_navigate | Go to URL |
| browser_snapshot | Take screenshot |
| browser_click | Click element |
| browser_type | Enter text |
| browser_scroll | Scroll page |
| browser_back | Go back |
| browser_press | Press key |
| browser_get_images | List images |
| browser_vision | Analyze with vision |
Stealth Levels
Basic Stealth (Always On):
- Random browser fingerprints
- Automatic CAPTCHA solving
- Human-like interaction timing
Residential Proxies (+):
- Traffic through real residential IPs
- Improves success rates on sites that block datacenter IPs
- Set BROWSERBASE_PROXIES=true
Advanced Stealth (Scale Plan):
- Custom Chromium build
- Designed to avoid bot detection (no stealth setup can guarantee this on every site)
- Set BROWSERBASE_ADVANCED_STEALTH=true
Example: Scraping with Vision
User: "Go to Hacker News and list the top 5 posts"
Hermes executes:
1. browser_navigate("https://news.ycombinator.com")
2. browser_snapshot()
3. vision_analyze("List the top 5 post titles and their point counts")
Response:
1. Show HN: My open source project (342 points)
2. The future of AI agents (289 points)
...
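If the extracted list needs further processing, the text response can be turned into structured records. The sketch below assumes the response follows the "rank. title (N points)" shape shown above; vision model output is free text, so the pattern and the `parse_posts` helper are illustrative and may need adjusting.

```python
import re

# Matches lines such as "1. Some title (342 points)".
LINE = re.compile(r"^\s*(\d+)\.\s+(.*?)\s+\((\d+)\s+points?\)\s*$")


def parse_posts(text: str) -> list:
    """Return one dict per line that matches the expected shape; skip the rest."""
    posts = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            posts.append({
                "rank": int(match.group(1)),
                "title": match.group(2),
                "points": int(match.group(3)),
            })
    return posts


sample = """1. Show HN: My open source project (342 points)
2. The future of AI agents (289 points)
..."""

posts = parse_posts(sample)
for post in posts:
    print(post["rank"], post["title"], post["points"])
```

Lines that do not match, such as the trailing "...", are skipped rather than raising an error.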
Session Management
Browser sessions auto-close after inactivity:
BROWSER_SESSION_TIMEOUT=300 # 5 minutes
BROWSER_INACTIVITY_TIMEOUT=120 # 2 minutes
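As a rough illustration of how such settings might be consumed, the sketch below reads a timeout in seconds and checks whether a session has been idle too long. The `read_timeout` and `idle_expired` helpers are hypothetical, the fallback values are simply the ones shown above (not confirmed defaults), and the idle check reflects an assumed reading of BROWSER_INACTIVITY_TIMEOUT.

```python
import os
from typing import Mapping


def read_timeout(name: str, fallback: int, env: Mapping[str, str] = os.environ) -> int:
    """Read a timeout in seconds, using the fallback when the variable is unset."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return fallback
    value = int(raw)
    if value <= 0:
        raise ValueError(f"{name} must be a positive number of seconds, got {value}")
    return value


def idle_expired(last_activity: float, now: float, inactivity_timeout: int) -> bool:
    """Assumed semantics: a session is stale once it has been idle for the full timeout."""
    return (now - last_activity) >= inactivity_timeout


session_timeout = read_timeout("BROWSER_SESSION_TIMEOUT", 300, env={})
inactivity_timeout = read_timeout(
    "BROWSER_INACTIVITY_TIMEOUT", 120, env={"BROWSER_INACTIVITY_TIMEOUT": "60"}
)
print(session_timeout, inactivity_timeout)
```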
Tips for Reliable Automation
- Use vision for extraction: it is often more reliable than DOM selectors
- Add waits after navigation: let pages load fully before taking a snapshot
- Enable residential proxies: many sites block datacenter IPs
- Handle CAPTCHAs gracefully: Browserbase auto-solves many, but not all
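The "add waits" tip amounts to polling until the page is ready, with an upper bound so a stuck page does not block forever. Below is a minimal, generic sketch; `wait_until` is a hypothetical helper, and the readiness check is whatever condition fits the task (for example, a vision query confirming the expected content is visible).

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout: float = 10.0,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll check() until it returns True or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Demo: a fake readiness check that succeeds on the third poll.
attempts = 0


def page_ready() -> bool:
    global attempts
    attempts += 1
    return attempts >= 3


ready = wait_until(page_ready, timeout=5.0, interval=0.0, sleep=lambda seconds: None)
print(ready, attempts)
```

Returning False on timeout, rather than raising, leaves it to the caller to decide whether to retry, reload the page, or give up.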
Limitations
- No JavaScript execution: you can't run custom scripts in the browser
- Screenshot-based vision: slower than DOM parsing
- Browserbase costs: cloud browsers incur usage fees