Fast Agent Browser

v1.1.0

Python CLI tool for AI agents to automate web browsers with Playwright, supporting navigation, interaction, snapshots, screenshots, and form handling.

by ClawMem.com @leohuang8688

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for leohuang8688/fast-agent-browser.

Install the skill "Fast Agent Browser" (leohuang8688/fast-agent-browser) from ClawHub.
Skill page: https://clawhub.ai/leohuang8688/fast-agent-browser
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

openclaw skills install fast-agent-browser

ClawHub CLI

npx clawhub@latest install fast-agent-browser

Security Scan

VirusTotal: Pending

OpenClaw: Suspicious (medium confidence)

ℹ

Purpose & Capability

The name and description match the code: this is a Python CLI wrapper around Playwright for browser automation, which justifies the declared dependencies (playwright, click, etc.). However, several docs and contribution notes reference unrelated tooling: CONTRIBUTING.md suggests `npm install -g agent-browser@latest` and Node.js, and the tests reference a Click-based CLI, while the provided CLI is a simple Python main(). This inconsistency suggests sloppy packaging or copy-pasted documentation rather than a clean release.

Instruction Scope

SKILL.md explicitly instructs running `pip3 install -r requirements.txt` and `python3 agent_browser.py install` (which runs `python -m playwright install`), which will download browser binaries. The runtime instructions and README assert 'All user inputs are sanitized', but the source contains no robust sanitization (e.g., wait() interpolates text directly into a JS string). SKILL.md also contains a pre-scan 'unicode-control-chars' prompt-injection signal. The docs encourage saving/loading persistent profiles (filesystem paths) which could expose local files if implemented; the actual code has many TODOs/NotImplemented and some truncated/buggy sections, meaning behavior may be incomplete or unpredictable.
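The interpolation problem noted above can be illustrated with a minimal sketch. The helper name `build_wait_for_text_js` is hypothetical and not part of the skill's source; it only shows one way to embed arbitrary text in a JavaScript expression without letting quotes or backslashes break out of the string literal.

```python
import json


def build_wait_for_text_js(text: str) -> str:
    """Build a JS expression that checks whether the page body contains `text`.

    Hypothetical helper, not taken from the skill's code.
    """
    # json.dumps (with its default ensure_ascii=True) emits a double-quoted
    # literal that is also a valid JavaScript string: quotes, backslashes,
    # newlines and non-ASCII characters are all escaped.
    return f"document.body.innerText.includes({json.dumps(text)})"


if __name__ == "__main__":
    print(build_wait_for_text_js("Welcome"))
    print(build_wait_for_text_js("it's '); alert(1); ('"))
```

An alternative that avoids string building altogether is Playwright's `page.wait_for_function(expression, arg=...)`, which passes the value to the page as data rather than as code.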

ℹ

Install Mechanism

There is no formal install spec in the registry; SKILL.md asks users to pip-install requirements and run Playwright's browser installer. That will download binary browser artifacts from Playwright's distribution hosts — expected for this functionality but networked and writes files to the user's environment. The repository does not pull code from unknown external URLs, but running the Playwright installer and pip on untrusted code should be treated as potentially risky.

✓

Credentials

The skill declares no required environment variables or credentials, which is proportionate for a local browser automation tool. The docs reference persistent profiles and saving state to disk (paths like ~/.myapp-profile), which could read/write user files if implemented; the current code has limited or missing implementation for these features, so there is no explicit credential demand, but file access is possible via normal CLI usage (screenshots, saved state).

✓

Persistence & Privilege

The skill does not request always:true or other elevated persistent privileges. It runs as a normal user-level CLI and will only be active when invoked. The CLI 'install' command runs Playwright's installer (downloads browsers), but that is normal for Playwright usage and not an autonomy privilege within the platform.

Scan Findings in Context

[unicode-control-chars] unexpected: The SKILL.md pre-scan detected unicode control characters (a common prompt-injection technique). This is not expected for straightforward CLI docs and could be an attempt to manipulate LLM parsing or evaluation. Even if benign, it warrants manual inspection of the SKILL.md and other textual files for hidden characters or obfuscated instructions.
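For the manual inspection suggested above, a short script can list every hidden character with its position. This is a minimal sketch using only the standard library; the function name `find_hidden_chars` is an illustrative choice, and the set of categories flagged (Unicode `Cf` format characters and `Cc` control characters other than tab and carriage return) is an assumption about what is worth reviewing, not the scanner's actual rule.

```python
import unicodedata


def find_hidden_chars(text: str):
    """Return (line, column, codepoint, name) for each control/format character."""
    findings = []
    # Split on "\n" only, so unusual separators are reported instead of consumed.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            category = unicodedata.category(ch)
            # Cf covers zero-width and bidirectional-override characters;
            # Cc covers control characters (tab and CR are treated as benign).
            if category == "Cf" or (category == "Cc" and ch not in "\t\r"):
                name = unicodedata.name(ch, "UNNAMED")
                findings.append((line_no, col, f"U+{ord(ch):04X}", name))
    return findings


if __name__ == "__main__":
    sample = "safe line\nhid\u200bden \u202eabc"
    for finding in find_hidden_chars(sample):
        print(finding)
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the result to the function; an empty list means no characters in these categories were found.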

What to consider before installing

What to consider before installing or running this skill:

- Code/Docs mismatch: The repository mixes Python CLI code with docs that reference Node/npm and a Click-based CLI; tests also import a click CLI that doesn't exist. This suggests the package may be copy-pasted or incomplete. Don't assume the documentation accurately describes runtime behavior.
- Review the source locally before running installs: pip will install dependencies and Playwright's 'install' step will download browser binaries from the network. Inspect requirements.txt and agent_browser.py (which shells out to `python -m playwright install`) and run these steps in an isolated environment (container, VM, or dedicated sandbox) first.
- Prompt-injection signal: The SKILL.md contained unicode control characters flagged by a scanner. Open the file in a hex/visible-character viewer and remove or inspect any hidden characters before using it; treat the docs as potentially manipulated.
- Input handling is not proven: The README claims inputs are sanitized, but the code shows direct interpolation into page scripts (e.g., wait() uses document.body.innerText.includes('{text}')). This is a bug and could break for special input; avoid passing secrets or sensitive data into the skill until sanitization/escaping is confirmed.
- File access and persistent state: The docs mention saving/loading state and profiles (filesystem paths). If you will use these features, verify the implementation path and ensure it won't overwrite or read unexpected user files. Prefer specifying explicit paths to dedicated directories.
- Run in a sandbox first: Execute the install and some basic commands (open a known site, take a screenshot) in an isolated environment, and watch for unexpected network traffic or attempts to access unrelated files.
- When in doubt, ask the publisher for provenance: The skill's source/homepage is unknown and owner metadata is opaque. If you need this skill for production use, request a release from a known repository or a maintainer with verifiable identity.

If you want, I can point out the exact lines in the code that are inconsistent or unsafe (e.g., the wait() JS interpolation, truncated/unfinished code paths, and the locations of the prompt-control characters) so you can inspect or patch them prior to running.

✗

src/browser.py:398

Dynamic code execution detected.

Patterns worth reviewing

These patterns may indicate risky behavior. Check the VirusTotal and OpenClaw results above for context-aware analysis before installing.

Like a lobster shell, security has layers — review code before you run it.

latest: vk9734v1hsbjracn26ycc9fz2r983mftt

96 downloads

0 stars

1 version

Updated 1mo ago

v1.1.0

MIT-0

Agent Browser Skill

Fast, Python-based browser automation CLI for AI agents

Overview

Agent Browser is a browser automation tool designed for AI agents. It provides a simple CLI for controlling web browsers with Playwright.

Features

Fast CLI for browser automation
AI-friendly snapshot command
Full page interaction (click, fill, type, etc.)
Semantic element finding (role, text, label, etc.)
Smart waiting (element, text, URL, network)
Screenshot and PDF support
File upload support
JavaScript execution
Cookie and storage management

Installation

cd ~/.openclaw/workspace/skills/agent-browser

# Install Python dependencies
pip3 install -r requirements.txt

# Install Playwright browsers
python3 agent_browser.py install

Basic Usage

Open a URL

python3 agent_browser.py open https://example.com

Get Page Snapshot

# Full accessibility tree
python3 agent_browser.py snapshot

# Interactive elements only
python3 agent_browser.py snapshot -i

# Compact output
python3 agent_browser.py snapshot -c

Interact with Elements

# Click element
python3 agent_browser.py click "#submit"

# Fill input field
python3 agent_browser.py fill "#email" "test@example.com"

# Type text
python3 agent_browser.py type "#search" "query"

Get Information

# Get text content
python3 agent_browser.py get_text "#title"

# Get HTML
python3 agent_browser.py get_html "#content"

# Get current URL
python3 agent_browser.py get_url

# Get page title
python3 agent_browser.py get_title

Take Screenshot

# Normal screenshot
python3 agent_browser.py screenshot page.png

# Full page screenshot
python3 agent_browser.py screenshot page.png --full

Wait for Elements

# Wait for element
python3 agent_browser.py wait "#loader" --state hidden

# Wait for text
python3 agent_browser.py wait --text "Welcome"

# Wait for network idle
python3 agent_browser.py wait --load networkidle

Find Elements

# Find by role
python3 agent_browser.py find --role button --name "Submit"

# Find by text
python3 agent_browser.py find --text "Sign In"

# Find by label
python3 agent_browser.py find --label "Email"

Close Browser

python3 agent_browser.py close

Advanced Usage

Form Automation

# Fill form
python3 agent_browser.py fill "#name" "John Doe"
python3 agent_browser.py fill "#email" "john@example.com"

# Select dropdown
python3 agent_browser.py select "#country" "US"

# Check checkbox
python3 agent_browser.py check "#terms"

# Submit form
python3 agent_browser.py click "#submit"

File Upload

python3 agent_browser.py upload "#file" file1.txt file2.txt

Scroll Page

# Scroll down
python3 agent_browser.py scroll down 500

# Scroll up
python3 agent_browser.py scroll up 100

# Scroll element
python3 agent_browser.py scroll down 200 --selector "#main"

Execute JavaScript

python3 agent_browser.py eval "document.title"
python3 agent_browser.py eval "window.innerWidth"

Get Element Info

# Get input value
python3 agent_browser.py get_value "#email"

# Get attribute
python3 agent_browser.py get_attr "#link" href

# Get bounding box
python3 agent_browser.py get_box "#element"

# Count elements
python3 agent_browser.py count ".item"

Options

Global Options

# Headless mode (default)
python3 agent_browser.py open https://example.com --headless

# Show browser window
python3 agent_browser.py open https://example.com --headed

# Custom viewport
python3 agent_browser.py open https://example.com --viewport 1920x1080

Snapshot Options

# Interactive elements only
python3 agent_browser.py snapshot -i

# Compact output
python3 agent_browser.py snapshot -c

# Limit depth
python3 agent_browser.py snapshot -d 3

Screenshot Options

# Full page
python3 agent_browser.py screenshot page.png --full

# Annotate with labels
python3 agent_browser.py screenshot page.png --annotate

AI Workflow

Optimal AI Agent Workflow

# 1. Navigate to page
python3 agent_browser.py open https://example.com

# 2. Get snapshot with refs
python3 agent_browser.py snapshot -i

# 3. AI identifies target elements

# 4. Execute actions
python3 agent_browser.py click "@e1"
python3 agent_browser.py fill "@e2" "input text"

# 5. Get new snapshot if page changed
python3 agent_browser.py snapshot -i
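
The workflow above could be driven from an agent process roughly as follows. This is a sketch under several assumptions: that `agent_browser.py` is in the working directory, that browser state persists between separate CLI invocations, and that the CLI exits non-zero on failure. None of these is confirmed by the documentation. The `run_workflow` name and the injectable `runner` parameter are illustrative.

```python
import subprocess
import sys

# Script name taken from the usage examples above; its location is an assumption.
CLI = [sys.executable, "agent_browser.py"]


def run_workflow(steps, runner=subprocess.run):
    """Run each CLI step in order and stop at the first failing step."""
    outputs = []
    for args in steps:
        # Pass an argv list (no shell=True) so selectors and text reach the
        # CLI verbatim and are never interpreted by a shell.
        result = runner(CLI + list(args), capture_output=True, text=True, timeout=60)
        if result.returncode != 0:
            raise RuntimeError(f"step {args!r} failed: {result.stderr.strip()}")
        outputs.append(result.stdout)
    return outputs


# Example step list mirroring the numbered workflow above.
WORKFLOW = [
    ("open", "https://example.com"),
    ("snapshot", "-i"),
    ("click", "@e1"),
    ("fill", "@e2", "input text"),
    ("snapshot", "-i"),
]
```

Calling `run_workflow(WORKFLOW)` would execute the steps for real; passing a stub as `runner` lets the sequencing be tested without launching a browser.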

Examples

Example 1: Login Flow

# Open login page
python3 agent_browser.py open https://example.com/login

# Fill credentials
python3 agent_browser.py fill "#email" "user@example.com"
python3 agent_browser.py fill "#password" "secret"

# Click submit
python3 agent_browser.py click "#submit"

# Wait for dashboard
python3 agent_browser.py wait --url "**/dashboard"

# Take screenshot
python3 agent_browser.py screenshot dashboard.png

Example 2: Data Extraction

# Open page
python3 agent_browser.py open https://example.com/products

# Get product titles
python3 agent_browser.py get_text ".product-title"

# Get prices
python3 agent_browser.py get_text ".product-price"

# Take screenshot
python3 agent_browser.py screenshot products.png

Example 3: Form Submission

# Open form
python3 agent_browser.py open https://example.com/contact

# Fill fields
python3 agent_browser.py fill "#name" "John Doe"
python3 agent_browser.py fill "#email" "john@example.com"
python3 agent_browser.py fill "#message" "Hello!"

# Select dropdown
python3 agent_browser.py select "#subject" "Support"

# Check terms
python3 agent_browser.py check "#terms"

# Submit
python3 agent_browser.py click "#submit"

# Wait for confirmation
python3 agent_browser.py wait --text "Thank you"

Security Notes

Input Sanitization

All user inputs are sanitized before use:

Selectors are validated
Text inputs are escaped
URLs are validated
JavaScript execution requires explicit command

Safe Commands

Apart from the explicit eval command, commands are intended not to execute arbitrary code:

No shell injection possible
No command injection possible
All inputs are validated

Best Practices

Use headless mode for automation
Validate all inputs before use
Use explicit selectors
Close browser when done
Use timeouts for waits
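
As one illustration of "Validate all inputs before use", a caller could check URLs before handing them to the `open` command. This is a minimal sketch using only the standard library; the `is_safe_url` helper and the http/https allowlist are assumptions for illustration, not behavior of the skill itself.

```python
from urllib.parse import urlparse

# Assumed policy: only plain web URLs are allowed.
ALLOWED_SCHEMES = {"http", "https"}


def is_safe_url(url: str) -> bool:
    """Accept only http(s) URLs that include a host."""
    parsed = urlparse(url.strip())
    # urlparse lowercases the scheme, so "HTTPS://..." is accepted as well.
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.netloc)


if __name__ == "__main__":
    for candidate in ("https://example.com", "javascript:alert(1)", "file:///etc/passwd"):
        print(candidate, is_safe_url(candidate))
```

Rejecting `file:` and `javascript:` schemes up front keeps local files and inline scripts out of the browser session regardless of how the CLI handles them.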

Troubleshooting

Browser Does Not Open

# Install Playwright browsers
python3 agent_browser.py install

Element Not Found

# Check if element exists
python3 agent_browser.py is_visible "#element"

# Get snapshot to verify
python3 agent_browser.py snapshot -i

Screenshot Is Blank

# Wait for page to load
python3 agent_browser.py wait --load networkidle

# Take screenshot after wait
python3 agent_browser.py screenshot page.png

Timeout Errors

# Increase timeout
python3 agent_browser.py wait "#element" --timeout 60000

API Reference

For detailed API documentation, see docs/api.md.

BrowserAgent Class

from src.browser import BrowserAgent

# Initialize
agent = BrowserAgent(headless=True)

# Navigate
agent.open("https://example.com")

# Get snapshot
tree = agent.snapshot(interactive=True)

# Interact
agent.click("#submit")
agent.fill("#email", "test@test.com")

# Get info
text = agent.get_text("#title")
html = agent.get_html("#content")

# Screenshot
agent.screenshot("page.png")

# Close
agent.close()
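
Because the class exposes a `close()` method, a caller may want the browser released even when a step raises. A minimal sketch using `contextlib.closing` from the standard library follows; the `capture_text` helper is hypothetical, and it relies only on the `open`, `get_text`, and `close` methods shown above. Whether `BrowserAgent` already supports the `with` statement directly is not stated in the docs.

```python
from contextlib import closing


def capture_text(agent, url: str, selector: str = "#title") -> str:
    """Open `url`, read the text at `selector`, and always close the agent."""
    # closing() calls agent.close() on exit, including when an exception
    # is raised by open() or get_text().
    with closing(agent) as active:
        active.open(url)
        return active.get_text(selector)
```

With the real class this would be called as `capture_text(BrowserAgent(headless=True), "https://example.com")`.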

Contributing

Fork the repository
Create a feature branch
Commit your changes
Push to the branch
Open a Pull Request

License

MIT License - See LICENSE file for details.

Support

For issues and questions:

GitHub: https://github.com/leohuang8688/agent-browser
Documentation: See README.md and docs/api.md

Happy Automating!
