Agent Browser

Headless browser automation CLI for AI agents. Use when interacting with websites — navigating pages, filling forms, clicking buttons, taking screenshots, ex...

MIT-0 · Free to use, modify, and redistribute. No attribution required.

⭐ 0 · 778 · 1 current installs · 1 all-time installs

by @bodietron

Security Scan

VirusTotal: Suspicious

OpenClaw: Suspicious (medium confidence)

Purpose & Capability

The declared purpose, headless browser automation, matches the commands and features described (navigation, click/fill, screenshots, downloads, state persistence). However, SKILL.md and setup.sh state that Node.js/npm are required while the registry metadata lists no required binaries or env vars, an inconsistency the user should be aware of.

Instruction Scope

Runtime instructions include powerful operations: eval'ing arbitrary page JS, connecting to an existing Chrome/CDP, opening file:// URIs, uploading and downloading files, saving/loading session state and auth profiles, and redirecting page text to files. These are expected for a browser-automation tool but also enable access to local files and any data visible in pages the agent visits. The SKILL.md examples reference $USERNAME/$PASSWORD but the skill declares no required env vars.

Install Mechanism

There is no registry install spec; instead scripts/setup.sh performs npm install -g agent-browser and then runs agent-browser install to fetch Chromium. Installing from npm at runtime is normal for Node tools but it means arbitrary package code will be downloaded and executed on the host. The package's source is 'unknown' (no homepage) so the provenance of the npm package is not verified in the metadata.

Credentials

The skill declares no required credentials or env vars, yet SKILL.md references optional envs (AGENT_BROWSER_*), and examples use $USERNAME/$PASSWORD and saving auth profiles and state files. The tool can persist credentials and cookies to disk and may read local files (file://, downloads). Requesting no secrets in metadata but instructing workflows that use and store secrets is a mismatch and elevates risk.

Persistence & Privilege

The skill is not always-enabled and does not request elevated platform privileges, but it provides commands to save session state, store auth profiles, and write files to disk. Those behaviors are expected for a browser automation CLI but mean the skill can create persistent artifacts containing sensitive data on the host.

What to consider before installing

This skill appears to implement a real browser-automation CLI, but there are several things to check before installing or running it:

- Provenance: There is no homepage and the registry metadata gives an unknown source. Inspect the npm package 'agent-browser' (version, author, README, and published files) before running npm install -g. Prefer a package with a verifiable repository and maintainer.
- Runtime requirements mismatch: SKILL.md requires Node.js/npm but the registry metadata lists no required binaries. Make sure your environment meets the tool's needs and be cautious when running setup scripts.
- Data exposure: The tool can access file:// URLs, download/upload files, eval JS in page contexts, and save auth/state to disk. Do not provide sensitive credentials (passwords, API keys, private tokens) to the agent unless you trust the package and have verified where/how those secrets are stored. Consider using dedicated test accounts.
- Install safely: Run the setup in an isolated environment (container, VM) so the npm package and its install scripts cannot access your primary host. After installation, inspect installed files under the global npm directory.
- Hardening: If you proceed, set AGENT_BROWSER_ALLOWED_DOMAINS to a strict allowlist, enable AGENT_BROWSER_CONTENT_BOUNDARIES, and limit AGENT_BROWSER_MAX_OUTPUT. Review and control any saved state/auth files (their formats and locations).
- If you are unsure: Ask the publisher for a repository link and package provenance, or prefer a well-known, audited browser-automation tool (e.g., Playwright/Puppeteer) with clear origin and reproducible install steps.
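The provenance check described above can be sketched as a small shell helper. This is only an illustration, not part of the skill: the function name show_package_provenance is hypothetical, and it assumes npm is already on PATH. It relies on npm view, which queries the registry for metadata and does not install or execute the package.

```shell
# Print registry metadata for a package without installing it.
# show_package_provenance is a hypothetical helper name, not part of agent-browser.
show_package_provenance() {
  # npm view only reads registry metadata; nothing is downloaded or run.
  npm view "$1" version homepage repository.url maintainers dist.tarball
}

# Example (needs network access):
#   show_package_provenance agent-browser
```

An empty homepage or repository.url in the output would match the "unknown source" finding above and is a reason to inspect the tarball by hand before installing.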

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.0

latest vk9728kaq7a94b9aycj4nk90bkx81zhzs

License

MIT-0. Terms: https://spdx.org/licenses/MIT-0.html

SKILL.md

Browser Automation with agent-browser

Setup

Run scripts/setup.sh to install agent-browser and Chromium. Requires Node.js.

Core Workflow

Every browser automation follows this pattern:

1. Navigate: agent-browser open <url>
2. Snapshot: agent-browser snapshot -i (get element refs like @e1, @e2)
3. Interact: Use refs to click, fill, select
4. Re-snapshot: After navigation or DOM changes, get fresh refs

agent-browser open https://example.com/form
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"

agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i  # Check result

Command Chaining

Chain with && when you don't need intermediate output:

agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i

Run separately when you need to parse output first (e.g., snapshot to discover refs).
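When the snapshot has to be parsed before the next step, one possible approach is to pull the ref tokens out of its output. The sketch below assumes refs appear as @e followed by digits, as in the example output earlier; the real snapshot format may differ, and extract_refs is a hypothetical helper name.

```shell
# List each distinct ref found on stdin, in order of first appearance.
# Assumes refs look like @e1, @e2, ... (taken from the example output above).
extract_refs() {
  grep -oE '@e[0-9]+' | awk '!seen[$0]++'
}

# Example:
#   agent-browser snapshot -i | extract_refs
```

Keeping first-appearance order (rather than sorting) avoids @e10 being listed before @e2.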

Essential Commands

# Navigate
agent-browser open <url>
agent-browser close

# See the page (always do this first)
agent-browser snapshot -i              # Interactive elements with refs
agent-browser snapshot -i -C           # Include onclick divs

# Interact using @refs
agent-browser click @e1
agent-browser fill @e2 "text"
agent-browser select @e1 "option"
agent-browser press Enter
agent-browser scroll down 500

# Get info
agent-browser get text @e1
agent-browser get url
agent-browser get title

# Wait
agent-browser wait @e1                 # For element
agent-browser wait --load networkidle  # For network idle

# Capture
agent-browser screenshot page.png
agent-browser screenshot --full        # Full page
agent-browser pdf output.pdf

For the full command reference, see references/commands.md.

Ref Lifecycle (Important)

Refs (@e1, @e2) are invalidated when the page changes. Always re-snapshot after:

Clicking links/buttons that navigate
Form submissions
Dynamic content loading (dropdowns, modals)
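One way to make the re-snapshot habit hard to forget is to bundle it with the click. The wrapper below is a minimal sketch that uses only commands shown in this document; the name click_and_resnapshot is hypothetical, and waiting for networkidle is an assumption that may not suit every page.

```shell
# Click a ref, wait for the page to settle, then print a fresh snapshot.
# Old refs should be treated as invalid once this returns.
click_and_resnapshot() {
  agent-browser click "$1" &&
    agent-browser wait --load networkidle &&
    agent-browser snapshot -i
}

# Example:
#   click_and_resnapshot @e3
```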

Common Patterns

Form Submission

agent-browser open https://example.com/signup
agent-browser snapshot -i
agent-browser fill @e1 "Jane Doe"
agent-browser fill @e2 "jane@example.com"
agent-browser select @e3 "California"
agent-browser click @e5
agent-browser wait --load networkidle

Login with State Persistence

agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "$USERNAME" && agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json

# Reuse later
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
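Because the login example depends on $USERNAME and $PASSWORD and writes session data to auth.json, it may help to guard both ends of the flow. The sketch below is an assumption-laden illustration: require_env and protect_state_file are hypothetical helpers, and chmod 600 only restricts file permissions; it does not encrypt the saved state.

```shell
# Fail early if any named variable is unset or empty, without printing values.
# Pass only trusted variable names, since eval is used for the lookup.
require_env() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Restrict a saved state file to the current user (owner read/write only).
protect_state_file() {
  [ -f "$1" ] && chmod 600 "$1"
}

# Example flow:
#   require_env USERNAME PASSWORD || exit 1
#   ... login steps as above ...
#   agent-browser state save auth.json && protect_state_file auth.json
```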

Data Extraction

agent-browser open https://example.com/products
agent-browser snapshot -i
agent-browser get text @e5
agent-browser get text body > page.txt

Screenshot & Diff

agent-browser screenshot baseline.png
# ... changes happen ...
agent-browser diff screenshot --baseline baseline.png

Parallel Sessions

agent-browser --session site1 open https://site-a.com
agent-browser --session site2 open https://site-b.com
agent-browser session list
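For more than a couple of sites, the same pattern can be driven from a list. The following is a sketch only: open_sessions is a hypothetical helper, the name=url argument format is invented for this example, and it uses just the --session ... open and session list forms shown above.

```shell
# Open one named session per "name=url" argument, then list the sessions.
open_sessions() {
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first '='
    url=${pair#*=}     # text after the first '=' (later '=' in the URL are kept)
    agent-browser --session "$name" open "$url" || return 1
  done
  agent-browser session list
}

# Example:
#   open_sessions site1=https://site-a.com site2=https://site-b.com
```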

Security (Optional)

export AGENT_BROWSER_CONTENT_BOUNDARIES=1          # Wrap output for AI safety
export AGENT_BROWSER_ALLOWED_DOMAINS="example.com"  # Domain allowlist
export AGENT_BROWSER_MAX_OUTPUT=50000               # Prevent context flooding
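Since these settings are optional, a script could check that they are present before starting a session. The helper below is a hedged sketch: check_hardening is a hypothetical name, and it only tests whether the variables are non-empty in the current shell, not whether they are exported or whether their values are sensible.

```shell
# Warn about any hardening variable that is unset or empty.
# Returns non-zero if at least one is missing.
check_hardening() {
  rc=0
  for name in AGENT_BROWSER_CONTENT_BOUNDARIES \
              AGENT_BROWSER_ALLOWED_DOMAINS \
              AGENT_BROWSER_MAX_OUTPUT; do
    eval "value=\${$name-}"
    if [ -z "$value" ]; then
      echo "warning: $name is not set" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Example:
#   check_hardening || echo "continuing without full hardening" >&2
```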

Cleanup

Always close sessions when done: agent-browser close
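To make sure the close step also runs when an earlier command fails, a shell trap is one option. This is a minimal sketch, not something the skill provides: run_with_cleanup is a hypothetical wrapper that runs a command in a subshell and calls agent-browser close when that subshell exits.

```shell
# Run the given command in a subshell and always close the browser afterwards.
# The parentheses make the function body a subshell, so the EXIT trap is local to it.
run_with_cleanup() (
  trap 'agent-browser close' EXIT
  "$@"
)

# Example:
#   run_with_cleanup sh -c 'agent-browser open https://example.com && agent-browser snapshot -i'
```

The wrapper returns the exit status of the wrapped command, so callers can still detect failures.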
