Agent Browser Assistant

Dev Tools

For browser automation tasks, web data scraping, form filling, page screenshots, UI testing, and more.

Install

openclaw skills install agent-browser-assistant

Agent Browser Assistant

An intelligent browser control assistant providing browser automation, data scraping, and testing capabilities.

Use Cases

Opening web pages, clicking/typing/scrolling, taking screenshots/recordings, extracting web content, exporting table data, automated form filling, batch operations, scheduled tasks, login authentication, UI testing, regression testing.

Quick Start

Use the browser tool for all browser operations:

# Open a web page
browser(action="open", url="https://example.com")

# Take a screenshot
browser(action="screenshot")

# Click an element
browser(action="act", kind="click", ref="button-submit")

# Type text
browser(action="act", kind="type", ref="input-username", text="user@example.com")

# Scroll the page
browser(action="act", kind="scroll", y=500)

# Get a page snapshot
browser(action="snapshot")

Core Capabilities

1. Page Operations

OperationDescriptionExample
openOpen a specified URLaction="open", url="..."
snapshotGet page structureaction="snapshot"
screenshotTake a page screenshotaction="screenshot"
navigateNavigate to a URLaction="navigate", url="..."
closeClose a tabaction="close", targetId="..."

2. Element Interaction

Use the act operation for page interaction:

  • click: Click an element (ref: element reference)
  • type: Type text (ref: input reference, text: content)
  • press: Press a keyboard key (key: key name)
  • hover: Hover over an element
  • select: Select from a dropdown
  • fill: Fill a form (fields: field dictionary)
  • scroll: Scroll the page (x/y: coordinates)

3. Data Scraping

Extract data from web pages:

# Get a page snapshot to analyze structure
browser(action="snapshot")

# Extract table data - using selector
browser(action="act", kind="evaluate", selector="table.data", fn="Array.from(document.querySelectorAll('tr')).map(r => Array.from(r.querySelectorAll('td')).map(c => c.innerText))")

4. Automated Workflows

Automated form filling:

browser(action="act", kind="fill", fields=[
    {"ref": "input-email", "value": "user@example.com"},
    {"ref": "input-password", "value": "password123"}
])
browser(action="act", kind="click", ref="button-login")

Batch operations:

# Iterate through list items
for i in range(1, 6):
    browser(action="act", kind="click", ref=f"item-{i}")

5. Testing Capabilities

UI testing scenarios:

  • Regression Testing: Verify that page functionality works correctly
  • Performance Monitoring: Page load time
  • Element Existence Check: Verify that key elements are visible

Advanced Usage

Waiting for Page Load

browser(action="act", kind="wait", loadState="domcontentloaded", timeMs=5000)

Handling Dialogs

browser(action="dialog", kind="accept")  # Confirm
# or
browser(action="dialog", kind="dismiss")  # Cancel

File Upload

browser(action="upload", ref="input-file", paths=["C:/path/to/file.pdf"])

PDF Export

browser(action="pdf", path="C:/output/page.pdf")

Configuration Options

ParameterDescriptionDefault
profileBrowser profile"openclaw"
targetBrowser target"sandbox"
slowlySlow motion modefalse
timeoutMsTimeout duration30000

Common Selector Patterns

  • Button: button[type="submit"], #submit-btn
  • Input: input[name="email"], #username
  • Link: a[href*="login"]
  • Table: table.data tr
  • List: .item-list li

Notes

  1. Use snapshot to get page structure before performing element operations
  2. Dynamic content may require waiting for it to finish loading
  3. For logged-in state operations, use profile="user" to reuse the user's browser
  4. For large-scale data scraping, consider pagination to avoid timeouts