Agent-Browser-Bridge-AI

Agent-Browser-Bridge-AI — Anti-detection browser control for AI agents. DOM-first, human-like interactions (Bezier), lead gen, extraction, MCP.

Install

openclaw skills install @alexandre-leng/agentbridge

AgentBridge 🌉 — Anti-Detection Browser Control for AI Agents

The browser that doesn't look like a bot. Full anti-detection browser bridge for AI agents: DOM-first interactions, human-like mouse movements (Bezier curves), typing jitter, stealth anti-fingerprinting, and MCP-native integration (15 tools + 1 raw tool).


📋 Complete Feature Reference

1. 🕹️ Browser Control

Available via CLI command or MCP tool.

| CLI command | MCP tool | Action |
| --- | --- | --- |
| `navigate <url>` | `navigate` | Go to any URL with polite waiting |
| `annotate` | `annotate_page` | Screenshot + numbered DOM element tree with refs |
| `click <ref>` | `click_ref` | Click element by numbered ref (found in annotate) |
| `type <ref> <text>` | `type_ref` | Type text into element (clears field first) |
| `press <key>` | via raw | Press keyboard key (Enter/Tab/Escape/...) |
| `scroll [dir] [amount]` | via raw | Scroll viewport (down/up, default 300px) |
| `discover [steps] [px]` | via raw | Scroll step by step and capture each page state |
| `screenshot [--full-page]` | via raw | Capture page screenshot (viewport or full-page) |
| `back` / `forward` | via raw | Browser history with human-like pauses |
| `summary` | via raw | Page metadata: URL, title, interactive element count |

2. 📄 Structured Extraction

| CLI command | Extract types available | Details |
| --- | --- | --- |
| `extract article` | article | Full article: title + paragraphs + headings |
| `extract table` | table | HTML `<table>` + ARIA `role="grid"` |
| `extract form` | form | All form fields with labels, types, placeholders |
| `extract listings` | listings | Directory/listings: names, ratings, reviews, addresses, phones |
| `extract marketplace` | marketplace | E-commerce: titles, prices, images, delivery, sponsored flags |
| Additional types via JSON-RPC | search-results, google-maps, custom (CSS/XPath), schema | Accessible via MCP or WebSocket |

Common options: --limit=N | --format=json|csv | --out=file.csv

3. 📧 Lead Generation Pipeline

| CLI command | Action |
| --- | --- |
| `scrape-emails <query>` | 🔥 Full pipeline: search engine → visit each result page → extract all emails → save to CSV |
| Options | `--limit=N` (default 20) |
| `extract-emails <url>` | Navigate to URL and extract all email addresses found in HTML + visible text |
| `extract-phones <url>` | Navigate to URL and extract phone numbers (uses libphonenumber-js, supports French format +33) |
| `scrape` | Extract marketplace/listing results from current page (alias for `extract marketplace`) |
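
The email-extraction step can be pictured as a regex pass over both the raw HTML and the visible text, with results merged and deduplicated. The sketch below is only an illustration of that idea: `extractEmails` and `EMAIL_PATTERN` are hypothetical names, and the pattern is a common simplified one, not necessarily what AgentBridge uses.

```typescript
// Hypothetical helper; not AgentBridge's actual implementation.
// Simplified email pattern (assumption): local part, "@", domain, dot, TLD of 2+ letters.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Scan raw HTML and visible text, lowercase each match, and deduplicate.
function extractEmails(html: string, visibleText: string): string[] {
  const seen = new Set<string>();
  for (const source of [html, visibleText]) {
    for (const match of source.match(EMAIL_PATTERN) ?? []) {
      seen.add(match.toLowerCase());
    }
  }
  return Array.from(seen).sort();
}

// Example input: one address appears in a mailto link and again in the text.
const sampleEmails = extractEmails(
  '<a href="mailto:Info@Example.com">Contact</a>',
  "Write to info@example.com or sales@example.org."
);
```

Scanning the HTML as well as the visible text catches addresses that only appear in `mailto:` links, while lowercasing keeps the same address from being counted twice.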

4. 🔍 Visible Text

| CLI command | Options |
| --- | --- |
| `visibleText` | `--limit=N` |

Cherry-pick visible DOM text by filter, or extract only emails/phones from visible content.

5. 🌐 Web Search

| CLI command | MCP tool | Action |
| --- | --- | --- |
| `webSearch <query>` | `web_search` | Search Google/Bing, auto-paginate until limit reached, deduplicate results |
| Options | | `--limit=N` (default 10), `--engine=google` |
| `siteSearch <query>` | `site_search` | Detect and use the current page's search form automatically |
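
"Auto-paginate until limit reached, deduplicate results" can be sketched as a loop that keeps requesting result pages and skips URLs it has already seen. Everything here is an assumption made for illustration: the `SearchHit` shape, the `fetchPage` callback, the `maxPages` cap, and the URL normalization rule (drop the fragment and trailing slashes).

```typescript
// Hypothetical result shape; the real payload may differ.
interface SearchHit {
  url: string;
  title: string;
}

// Assumed normalization: ignore the fragment and any trailing slashes.
function normalizeUrl(raw: string): string {
  const parsed = new URL(raw);
  const path = parsed.pathname.replace(/\/+$/, "");
  return `${parsed.protocol}//${parsed.host}${path}${parsed.search}`;
}

// Pull pages until `limit` unique hits are collected, a page comes back
// empty, or `maxPages` is reached.
function collectResults(
  fetchPage: (pageIndex: number) => SearchHit[],
  limit: number,
  maxPages = 10
): SearchHit[] {
  const seen = new Set<string>();
  const collected: SearchHit[] = [];
  for (let page = 0; page < maxPages && collected.length < limit; page++) {
    const hits = fetchPage(page);
    if (hits.length === 0) break;
    for (const hit of hits) {
      if (collected.length >= limit) break;
      const key = normalizeUrl(hit.url);
      if (seen.has(key)) continue;
      seen.add(key);
      collected.push(hit);
    }
  }
  return collected;
}
```

A synchronous callback keeps the sketch short; a real implementation would await each page load through the browser.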

6. 🧍 Human Behavior Emulation

Every interaction goes through the human behavior layer. These commands add extra human-like activity:

| CLI command | What it does |
| --- | --- |
| `scan` | Read visible text → scroll → pause → repeat (like a human researcher) |
| `findText <text>` | Search for visible text on page, auto-scrolling until found or max scrolls reached |
| `clickText <text>` | Find visible text and click it: coordinates first, falls back to `agent.click(ref)` |
| `idle [ms]` | Pause with random cursor movements; looks human even while "doing nothing" |
| `jitter [radius] [moves]` | Small cursor hesitation movements (default radius 18px, 4 moves) |
| `skim [steps] [px]` | Scroll through content with natural reading pauses |
| `backtrack` | Small upward scroll + pause (emulates re-reading something) |
| `focusCycle [n]` | Press Tab through focusable controls with natural pauses |
| `wait [ms]` | Wait N milliseconds (default: 2000) |
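
As an illustration of what `jitter` might compute, the sketch below picks a handful of random cursor targets inside a small disc around the current position, using the documented defaults (radius 18px, 4 moves). The function name, the `CursorPoint` type, and the uniform-over-the-disc sampling are assumptions, not the project's code.

```typescript
// Hypothetical point type for this sketch.
interface CursorPoint {
  x: number;
  y: number;
}

// Generate `moves` random targets within `radius` pixels of `center`.
// `rand` is injectable so the output can be made deterministic in tests.
function jitterTargets(
  center: CursorPoint,
  radius = 18,
  moves = 4,
  rand: () => number = Math.random
): CursorPoint[] {
  const targets: CursorPoint[] = [];
  for (let i = 0; i < moves; i++) {
    const angle = rand() * 2 * Math.PI;
    // sqrt() spreads samples evenly over the disc instead of clustering at the center.
    const distance = Math.sqrt(rand()) * radius;
    targets.push({
      x: center.x + Math.cos(angle) * distance,
      y: center.y + Math.sin(angle) * distance,
    });
  }
  return targets;
}
```

Each target would then be reached with a short human-like move rather than a jump, so the cursor appears to hesitate around one spot.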

7. ⏱️ Timing & Anti-Spam

| CLI command | What it does |
| --- | --- |
| `timing get` | Show current human timing profile: consultSpeed, WPM ranges, pause ranges |
| `timing set key=value ...` | Adjust any timing parameter live (consultSpeed, focusedWpmMin/Max, etc.) |
| `timing reset` | Restore default timing profile |
| `antispam` | Check current page for anti-bot / anti-spam blocking text (non-throwing) |

The timing system has 10 adjustable parameters controlling reading speed, pause duration, scroll behavior, and feedback intervals.
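
One way to think about `timing set` is as a validated update: each value is checked against the range documented for that parameter in the Anti-Detection System table. The sketch below clamps out-of-range values; whether AgentBridge clamps or rejects them is not stated, so the clamping behavior and the names `TIMING_RANGES` and `clampTiming` are assumptions.

```typescript
// Ranges copied from this README; parameters without a documented range are omitted.
const TIMING_RANGES: Record<string, [number, number]> = {
  consultSpeed: [0.25, 8],
  focusedWpmMin: [80, 500],
  focusedWpmMax: [80, 650],
  skimWpmMin: [100, 700],
  skimWpmMax: [100, 850],
  minFocusedMs: [0, 120000],
  maxFocusedMs: [500, 180000],
};

// Clamp each update into its documented range (assumed behavior);
// parameters with no documented range pass through unchanged.
function clampTiming(updates: Record<string, number>): Record<string, number> {
  const result: Record<string, number> = {};
  for (const [key, value] of Object.entries(updates)) {
    const range = TIMING_RANGES[key];
    result[key] = range ? Math.min(range[1], Math.max(range[0], value)) : value;
  }
  return result;
}

// Example: two out-of-range values and one parameter with no documented range.
const clampedTiming = clampTiming({ consultSpeed: 20, focusedWpmMin: 10, feedbackIntervalMs: 1234 });
```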

8. ⚡ Automation

| CLI command | What it does |
| --- | --- |
| `run <cmd1> <args1> ...` | Chain multiple commands in sequence, preserving browser state between them |
| `batch <recipe.json>` | Execute multiple commands from a JSON recipe file (lightweight, no variable interpolation) |
| `repl` | Interactive REPL: type commands live, see results instantly |
| `start` | Launch the bridge server |

9. 🖥️ Live Viewer

Web GUI → http://localhost:8080/viewer

Watch the browser in real time. Debug interactions, inspect the page, take over manually when needed.


🧩 MCP Integration (16 tools)

AgentBridge exposes an MCP server with the following tools:

| Tool | Input | Description |
| --- | --- | --- |
| `browser_status` | | Check browser connection health, return current URL and page title |
| `navigate` | `{ url, autoAnnotate? }` | Navigate to any http(s) URL |
| `annotate_page` | `{ noImage? }` | Return interactive page elements with stable numeric refs + screenshot URL |
| `click_ref` | `{ ref: number \| string }` | Click element by numbered ref (found in annotate) |
| `type_ref` | `{ ref, text, clearFirst? }` | Type text into an element by ref |
| `inspect_forms` | | Map all visible forms: fields, labels, types, options, selectors |
| `fill_form` | `{ values: {} \| fields: [] }` | |
| `submit_form` | `{ query? }` | Submit the active form via submit button or Enter key |
| `site_search` | `{ query, field? }` | Find and use the current page's search form automatically |
| `web_search` | `{ query, engine?, limit?, pages? }` | Search Google/Bing/DuckDuckGo with auto-pagination and dedup |
| `extract_schema` | `{ schema: { fields } }` | Extract structured data via CSS selectors or XPath |
| `extract_marketplace` | `{ limit?, format? }` | Extract e-commerce listing cards with title, price, image, delivery flags |
| `human_timing_get` | | Current consultation timing profile (10 parameters) |
| `human_timing_set` | `{ consultSpeed?, wpmMin/Max? }` | Adjust timing to speed up or slow down browsing |
| `human_antispam_check` | | Check page for anti-bot blocking without throwing |
| `browser_command` (raw) | `{ type, payload }` | 🔐 Run ANY bridge command (requires `BRIDGE_MCP_ALLOW_RAW=1`) |

Also exposes a resource (agentbridge://api) listing all registered command names, and a prompt template (browser_task) for asking agents to complete browser-based goals.
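
For hosts that speak MCP directly, a tool invocation is a JSON-RPC 2.0 request with method `tools/call`. The sketch below builds such requests for two of the tools listed above. The envelope follows the MCP specification; the helper name `buildToolCall` and the example argument values are illustrative assumptions.

```typescript
// JSON-RPC 2.0 envelope for an MCP tool call.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps a tool name and its arguments in the envelope.
function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Navigate, then click the element that annotate_page reported as ref 3.
const navigateCall = buildToolCall(1, "navigate", { url: "https://example.com", autoAnnotate: true });
const clickCall = buildToolCall(2, "click_ref", { ref: 3 });
```

In practice an MCP client library sends these messages for you; the sketch only shows what travels over the stdio transport.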


🛡️ Anti-Detection System

| Feature | How it works (source-verified) |
| --- | --- |
| Bezier cursor curves | Every mouse movement uses cubic Bezier curves with 24-90 adaptive steps, never straight lines |
| Cursor position tracking | Server-side cursor state (`_curX`, `_curY`): every move starts from the last known position, never from (0,0) |
| Click delay | Random 15-60ms delay between mouse down and mouse up |
| Typing jitter | Each character typed with variable per-keystroke delay (15-180ms) |
| Typing jitter standard deviation | 0.45 of mean delay per character |
| Timing profile (10 params) | consultSpeed (0.25-8), focusedWpmMin (80-500), focusedWpmMax (80-650), skimWpmMin (100-700), skimWpmMax (100-850), minFocusedMs (0-120000), maxFocusedMs (500-180000), minSkimMs, maxSkimMs, feedbackIntervalMs |
| Stealth script | Patches `navigator.webdriver` → `undefined`, injects full `window.chrome` runtime mock, sets `navigator.plugins` with PDF viewer entries, spoofs languages, platform, userAgent, WebGL vendor/renderer, `navigator.hardwareConcurrency` |
| Cookie auto-accept | Auto-detects and clicks "Accept all" / "Tout accepter" / "Accepter" on Google, cookie banners, and common consent dialogs |
| Flash click | Visual click indicator (green circle animation), visible when not headless |
| Anti-spam check | Inspects page for known anti-bot patterns without throwing; returns clean block/unblock status |
| Human behavior layer | Every interaction runs through `humanPreClick` → `humanMove` (Bezier) → click → `flashClick` → `assertNoAntiBot` |
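
To make the cursor and typing rows above concrete, here is a minimal sketch of a cubic Bezier mouse path with a step count clamped to 24-90, plus a per-keystroke delay sampled around a mean with standard deviation 0.45 × mean and clamped to 15-180ms. The control-point placement, the distance-to-steps scaling, and the choice of a normal distribution are assumptions; only the numeric ranges come from the table.

```typescript
interface Vec2 {
  x: number;
  y: number;
}

// Evaluate a cubic Bezier curve (p0..p3) at t in [0, 1].
function cubicBezier(p0: Vec2, p1: Vec2, p2: Vec2, p3: Vec2, t: number): Vec2 {
  const u = 1 - t;
  const a = u * u * u;
  const b = 3 * u * u * t;
  const c = 3 * u * t * t;
  const d = t * t * t;
  return {
    x: a * p0.x + b * p1.x + c * p2.x + d * p3.x,
    y: a * p0.y + b * p1.y + c * p2.y + d * p3.y,
  };
}

// Longer moves get more steps, clamped to 24-90. The /8 scaling is an assumption.
function stepCount(distance: number): number {
  return Math.min(90, Math.max(24, Math.round(distance / 8)));
}

// Build a curved path by pushing both control points off the straight line.
function bezierPath(from: Vec2, to: Vec2, rand: () => number = Math.random): Vec2[] {
  const dx = to.x - from.x;
  const dy = to.y - from.y;
  const dist = Math.hypot(dx, dy);
  const bow = (rand() - 0.5) * dist * 0.4; // sideways bulge, assumed magnitude
  const nx = dist === 0 ? 0 : -dy / dist; // unit normal to the straight line
  const ny = dist === 0 ? 0 : dx / dist;
  const c1 = { x: from.x + dx / 3 + nx * bow, y: from.y + dy / 3 + ny * bow };
  const c2 = { x: from.x + (2 * dx) / 3 + nx * bow, y: from.y + (2 * dy) / 3 + ny * bow };
  const steps = stepCount(dist);
  const path: Vec2[] = [];
  for (let i = 0; i <= steps; i++) {
    path.push(cubicBezier(from, c1, c2, to, i / steps));
  }
  return path;
}

// Per-keystroke delay: Box-Muller normal sample, sd = 0.45 * mean, clamped to 15-180ms.
function keystrokeDelay(meanMs: number, rand: () => number = Math.random): number {
  const u1 = Math.max(rand(), 1e-12); // avoid log(0)
  const u2 = rand();
  const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  return Math.min(180, Math.max(15, meanMs + z * 0.45 * meanMs));
}
```

Starting each path from the tracked cursor position (rather than the origin) is what keeps consecutive moves continuous.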

Stealth patches verified in source (src/browser/stealth.ts):

  • navigator.webdriver → undefined
  • window.chrome mock with runtime, app, loadTimes, csi
  • navigator.plugins → 5 plugins including "Chrome PDF Plugin", "PDF Viewer", "Chrome PDF Viewer"
  • navigator.languages → ["en-US", "en"]
  • navigator.platform → "Win32"
  • navigator.hardwareConcurrency → 4 or 8
  • WebGL vendor/renderer spoofing
  • __name polyfill for esbuild compatibility
  • Screen resolution and color depth normalization
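
Property patches like the first few in this list are typically applied by redefining getters before any page script runs. The sketch below shows that pattern on a generic object so it can run outside a browser. It is not the contents of `src/browser/stealth.ts`; the function name `applyBasicPatches` is hypothetical, and only three of the listed patches are covered.

```typescript
// Redefine selected properties as getters returning the spoofed values.
// In a browser the target would be `navigator`, injected before page scripts
// run (for example through the CDP command Page.addScriptToEvaluateOnNewDocument).
function applyBasicPatches(target: object): void {
  Object.defineProperty(target, "webdriver", {
    get: () => undefined,
    configurable: true,
  });
  Object.defineProperty(target, "languages", {
    get: () => ["en-US", "en"],
    configurable: true,
  });
  Object.defineProperty(target, "platform", {
    get: () => "Win32",
    configurable: true,
  });
}

// Stand-in for `navigator` so the sketch runs under Node.js.
const fakeNavigator: Record<string, unknown> = {
  webdriver: true,
  languages: ["fr-FR"],
  platform: "Linux x86_64",
};
applyBasicPatches(fakeNavigator);
```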

🏆 Anti-Detection Bypass Comparison

| Tool | YouTube | Google Search | JS-heavy sites |
| --- | --- | --- | --- |
| curl / wget | ❌ 429 / captcha | ❌ Blocked | ❌ JS required |
| Puppeteer-Extra + StealthPlugin | ❌ "Sign in" overlay | ❌ Detected | ⚠️ Partial |
| Playwright + stealth | ❌ Blocked | ❌ Detected | ⚠️ Partial |
| Selenium + undetected-chromedriver | ❌ Blocked | ❌ Detected | ❌ Blocked |
| Camoufox | ❌ Sign-in prompt | ⚠️ Partial | ⚠️ Partial |
| yt-dlp (no cookies) | ❌ LOGIN_REQUIRED | N/A | N/A |
| Tor Browser | ❌ Blocked by most | ❌ CAPTCHA loop | ❌ Blocked |
| AgentBridge + Chrome CDP | Full access | Works | Works |

Tested on Ubuntu 26.04 with headless Chromium 149 via the Chrome DevTools Protocol (CDP fallback mode, which works without Playwright).


🔧 Use Cases

  1. Bug Bounty & Security Research — Automate recon on authenticated sessions, bypass WAF, extract findings from JS-heavy apps
  2. Lead Generation: scrape-emails "AI consultants France" --limit=100 --out=leads.csv --fast
  3. Web Scraping — Extract structured data from JS-rendered sites (marketplaces, directories, SERPs)
  4. Market Research — Competitor price monitoring, product catalog scraping, directory extraction
  5. AI Training Data — Collect real-world content from protected sites without getting blocked
  6. YouTube Research — Access video metadata, descriptions, comments, and channel data without cookie/login wall
  7. SaaS Automation — Fill multi-step forms, navigate dashboards, extract reports
  8. Data Pipelines — Integrate with Claude, Cursor, Windsurf via MCP, or build custom workflows with run / batch

🚀 Quick Start

# 1. Install from npm
npm install -g browser-agentbridge-ai
npm run build

# 2. Start the bridge server
agentbridge start
# → WebSocket: ws://localhost:8080/ws/browser-bridge
# → Live GUI:  http://localhost:8080/viewer

# 3. Navigate to any page
agentbridge navigate https://example.com

# 4. Analyze the page (returns numbered elements)
agentbridge annotate

# 5. Interact with numbered elements
agentbridge click 3
agentbridge type 5 "hello world"
agentbridge press Enter

# 6. Extract content
agentbridge extract article --format=json
agentbridge extract-emails https://example.com/contact

# 7. Full lead generation pipeline
agentbridge scrape-emails "AI consultants France" --limit=50 --out=leads.csv --fast

# 8. Web search with results
agentbridge webSearch "latest AI security tools 2026" --limit=20 --out=results.json

# 9. Interactive REPL
agentbridge repl

Without Playwright (CDP Fallback)

Works on any system with Chrome — no Playwright dependency:

# Launch Chrome headless with remote debugging
google-chrome --headless=new --no-sandbox --remote-debugging-port=9222 &
sleep 2

# Get the WebSocket URL
WS_URL=$(curl -s http://127.0.0.1:9222/json/version | \
  python3 -c "import sys,json; print(json.load(sys.stdin)['webSocketDebuggerUrl'])")

# Point AgentBridge at the running Chrome instance
export CHROME_CDP_URL="$WS_URL"
export BRIDGE_HEADLESS=true
npx agentbridge start
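
The same lookup that the `python3` one-liner performs can be written in TypeScript for scripts that launch Chrome themselves. Chrome's `/json/version` endpoint returns a JSON object containing `webSocketDebuggerUrl`; the helper name `debuggerUrlFromVersion` and the sample payload below are illustrative assumptions.

```typescript
// Pull webSocketDebuggerUrl out of the body returned by
// http://127.0.0.1:9222/json/version, failing loudly if it is missing.
function debuggerUrlFromVersion(jsonText: string): string {
  const parsed: unknown = JSON.parse(jsonText);
  if (typeof parsed === "object" && parsed !== null && "webSocketDebuggerUrl" in parsed) {
    const candidate = (parsed as { webSocketDebuggerUrl: unknown }).webSocketDebuggerUrl;
    if (typeof candidate === "string" && candidate.startsWith("ws")) {
      return candidate;
    }
  }
  throw new Error("webSocketDebuggerUrl not found in /json/version response");
}

// Sample payload with a made-up browser id, for illustration only.
const sampleVersionBody = JSON.stringify({
  Browser: "HeadlessChrome/0.0.0.0",
  webSocketDebuggerUrl: "ws://127.0.0.1:9222/devtools/browser/00000000-0000-0000-0000-000000000000",
});
const cdpUrl = debuggerUrlFromVersion(sampleVersionBody);
```

The returned value is what would be exported as `CHROME_CDP_URL` before starting the bridge.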

MCP Integration

# Start the MCP server (stdin/stdout transport for Claude/Cursor/Windsurf)
npm run mcp

Connects 16 tools to any MCP-compatible host.



📦 Requirements

  • Node.js ^18.0.0
  • Browser: Chromium (via Playwright) OR any Chrome-based browser via CDP fallback
  • RAM: ~100MB for bridge server + browser process
  • Playwright: Optional — the CDP fallback works on systems where Playwright isn't supported (e.g., Ubuntu 26.04)

License

MIT-0 — Free to use, modify, and redistribute. No attribution required.