Agent Browser Assistant
An intelligent browser control assistant providing browser automation, data scraping, and testing capabilities.
Use Cases
Opening web pages, clicking/typing/scrolling, taking screenshots/recordings, extracting web content, exporting table data, automated form filling, batch operations, scheduled tasks, login authentication, UI testing, regression testing.
Quick Start
Use the browser tool for all browser operations:
# Open a web page
browser(action="open", url="https://example.com")
# Take a screenshot
browser(action="screenshot")
# Click an element
browser(action="act", kind="click", ref="button-submit")
# Type text
browser(action="act", kind="type", ref="input-username", text="user@example.com")
# Scroll the page
browser(action="act", kind="scroll", y=500)
# Get a page snapshot
browser(action="snapshot")
Core Capabilities
1. Page Operations
| Operation | Description | Example |
|---|
| open | Open a specified URL | action="open", url="..." |
| snapshot | Get page structure | action="snapshot" |
| screenshot | Take a page screenshot | action="screenshot" |
| navigate | Navigate to a URL | action="navigate", url="..." |
| close | Close a tab | action="close", targetId="..." |
2. Element Interaction
Use the act operation for page interaction:
- click: Click an element (ref: element reference)
- type: Type text (ref: input reference, text: content)
- press: Press a keyboard key (key: key name)
- hover: Hover over an element
- select: Select from a dropdown
- fill: Fill a form (fields: field dictionary)
- scroll: Scroll the page (x/y: coordinates)
3. Data Scraping
Extract data from web pages:
# Get a page snapshot to analyze structure
browser(action="snapshot")
# Extract table data - using selector
browser(action="act", kind="evaluate", selector="table.data", fn="Array.from(document.querySelectorAll('tr')).map(r => Array.from(r.querySelectorAll('td')).map(c => c.innerText))")
4. Automated Workflows
Automated form filling:
browser(action="act", kind="fill", fields=[
{"ref": "input-email", "value": "user@example.com"},
{"ref": "input-password", "value": "password123"}
])
browser(action="act", kind="click", ref="button-login")
Batch operations:
# Iterate through list items
for i in range(1, 6):
browser(action="act", kind="click", ref=f"item-{i}")
5. Testing Capabilities
UI testing scenarios:
- Regression Testing: Verify that page functionality works correctly
- Performance Monitoring: Page load time
- Element Existence Check: Verify that key elements are visible
Advanced Usage
Waiting for Page Load
browser(action="act", kind="wait", loadState="domcontentloaded", timeMs=5000)
Handling Dialogs
browser(action="dialog", kind="accept") # Confirm
# or
browser(action="dialog", kind="dismiss") # Cancel
File Upload
browser(action="upload", ref="input-file", paths=["C:/path/to/file.pdf"])
PDF Export
browser(action="pdf", path="C:/output/page.pdf")
Configuration Options
| Parameter | Description | Default |
|---|
| profile | Browser profile | "openclaw" |
| target | Browser target | "sandbox" |
| slowly | Slow motion mode | false |
| timeoutMs | Timeout duration | 30000 |
Common Selector Patterns
- Button:
button[type="submit"], #submit-btn
- Input:
input[name="email"], #username
- Link:
a[href*="login"]
- Table:
table.data tr
- List:
.item-list li
Notes
- Use
snapshot to get page structure before performing element operations
- Dynamic content may require waiting for it to finish loading
- For logged-in state operations, use
profile="user" to reuse the user's browser
- For large-scale data scraping, consider pagination to avoid timeouts