Browser Use Local

v1.0.0

Automate browser actions locally via browser-use CLI/Python: open pages, click/type, screenshot, extract HTML/links, debug sessions, and capture login QR codes.

by @fengjiajie

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for fengjiajie/browser-use-local.

Install the skill "Browser Use Local" (fengjiajie/browser-use-local) from ClawHub.
Skill page: https://clawhub.ai/fengjiajie/browser-use-local
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install browser-use-local

ClawHub CLI

npx clawhub@latest install browser-use-local

Security Scan

VirusTotal: Benign

OpenClaw: Suspicious (medium confidence)

Purpose & Capability

The SKILL.md and included scripts are coherent with a 'browser-use local' helper: CLI examples, screenshot/HTML extraction, QR crop helper, and a small agent runner. The skill lacks a one-line description in the registry metadata, but the code and docs align with the claimed functionality.

Instruction Scope

The runtime instructions and run_agent_kimi.py require an OPENAI_API_KEY and OPENAI_BASE_URL and will call an external LLM (Moonshot/Kimi). That implies page HTML and/or screenshots may be transmitted to that remote endpoint during agent runs — a privacy/data-exfiltration risk for sensitive pages (logins, consoles, QR codes). The SKILL.md does not explicitly warn the user that page content will be sent to the LLM provider.

Install Mechanism

There is no install spec (instruction-only), which is low risk in that nothing is fetched automatically. However, the Python scripts import packages (Pillow, python-dotenv, browser_use, etc.) and the README expects a virtualenv path to exist. The required dependencies are not declared in the registry metadata, so a user may need to install additional packages manually.

Credentials

The registry metadata lists no required environment variables, but the SKILL.md and run_agent_kimi.py explicitly require OPENAI_API_KEY and OPENAI_BASE_URL (and optionally OPENAI_MODEL, etc.). Requesting an LLM API key/base URL is proportionate to running the bundled agent, but the omission from metadata is an inconsistency and a practical risk: secrets must be provided and will be used to contact an external service.

Persistence & Privilege

The skill does not request always:true and does not modify other skills or system-wide settings. It uses the platform default (agent invocation allowed), which is expected for agent-capable skills.

What to consider before installing

This skill appears to implement local browser automation helpers and small Python utilities; the code is short and readable. Before installing or running it, consider:
- Sensitivity: Running the bundled agent (run_agent_kimi.py) requires OPENAI_API_KEY and OPENAI_BASE_URL and will contact the specified LLM provider. Page HTML and screenshots could be sent to that remote endpoint, so do not run the agent on pages containing secrets (password fields, private dashboards) unless you trust the provider and its data-handling policy.
- Metadata mismatch: The registry lists no required env vars or dependencies, but the SKILL.md and code do require LLM credentials and Python packages (Pillow, python-dotenv, browser_use, etc.). Expect to create/activate a venv and install dependencies manually.
- Minimal audit: The included scripts (image crop, base64 extraction, small agent runner) are small and understandable. If you plan to use it, run the non-agent CLI workflows first (they don't require an LLM key) and inspect/execute the Python scripts in an isolated environment.
- Hardening suggestions: Only provide OPENAI_API_KEY/OPENAI_BASE_URL to this skill if you trust the endpoint; consider using an account with limited privileges, or run the agent in an isolated VM/container. Ask the publisher to update registry metadata to declare required env vars and dependencies so you can make an informed decision.

MIT-0

browser-use (local) playbook

Default constraints in this environment

Prefer browser-use (CLI/Python) over the OpenClaw browser tool here; the OpenClaw browser tool may fail if no supported system browser is present.
Use persistent sessions for multi-step flows: --session <name>.

Quick CLI workflow (non-agent)

Open

browser-use --session demo open https://example.com

Inspect (state sometimes returns 0 elements on JavaScript-heavy sites)

browser-use --session demo --json state | jq '.data | {url,title,elements:(.elements|length)}'
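Where jq is not installed, the same summary can be produced in Python. This is a minimal sketch that assumes the `state` JSON has the shape implied by the jq filter above (a top-level `data` object with `url`, `title`, and `elements`); the `summarize_state` name and the sample payload are illustrative only.

```python
import json


def summarize_state(raw: str) -> dict:
    """Mirror the jq filter: .data | {url, title, elements: (.elements | length)}."""
    data = json.loads(raw).get("data") or {}
    return {
        "url": data.get("url"),
        "title": data.get("title"),
        # A missing or null `elements` field counts as zero elements.
        "elements": len(data.get("elements") or []),
    }


# Illustrative payload only; real `state` output may carry more fields.
sample = '{"data": {"url": "https://example.com", "title": "Example Domain", "elements": []}}'
summary = summarize_state(sample)
print(summary)
```

An `elements` count of 0 here is the cue to fall back to the screenshot and HTML steps.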

Screenshot (always works; best debugging primitive)

browser-use --session demo screenshot /home/node/.openclaw/workspace/page.png

HTML for link discovery (works even when state is empty)

browser-use --session demo --json get html > /tmp/page_html.json
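python3 - <<'PY'
import json, re

# Load the JSON envelope written by `get html` and pull out the HTML string.
with open('/tmp/page_html.json') as f:
    html = json.load(f).get('data', {}).get('html', '')

# Collect absolute URLs, then keep only those that look login/QR related.
urls = set(re.findall(r"https?://[^\s\"'<>]+", html))
keywords = ['demo', 'login', 'console', 'qr', 'qrcode']
for u in sorted(u for u in urls if any(k in u for k in keywords))[:200]:
    print(u)
PY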

Lightweight DOM queries via JS (useful when state is empty)

browser-use --session demo --json eval "location.href"
browser-use --session demo --json eval "document.title"

Agent workflow with OpenAI-compatible LLM (Moonshot/Kimi)

Use Python for Agent runs when the CLI run path requires Browser-Use cloud keys or when you need strict control over LLM parameters.

Minimal working Kimi example

Create .env (or export env vars) with:

OPENAI_API_KEY=...
OPENAI_BASE_URL=https://api.moonshot.cn/v1
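Because the agent sends page content to whatever endpoint these variables name, a preflight check before an agent run can be useful. The sketch below is not part of the bundled scripts; `missing_env` and `REQUIRED_VARS` are hypothetical names. It inspects exported variables only, so values kept in a .env file would need to be loaded into the environment first.

```python
import os

# The two variables the playbook lists as required for agent runs.
REQUIRED_VARS = ("OPENAI_API_KEY", "OPENAI_BASE_URL")


def missing_env(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not (env.get(name) or "").strip()]


if __name__ == "__main__":
    missing = missing_env()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        # Surface the endpoint so it is obvious where page content will go.
        print("Agent will contact " + os.environ["OPENAI_BASE_URL"])
```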

Then run the bundled script:

source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
python /home/node/.openclaw/workspace/skills/browser-use-local/scripts/run_agent_kimi.py

Kimi/Moonshot quirks observed in practice (fixes):

temperature must be 1 for kimi-k2.5.
frequency_penalty must be 0 for kimi-k2.5.
Moonshot can reject strict JSON Schema used for structured output. Enable:
- remove_defaults_from_schema=True
- remove_min_items_from_schema=True

If you get a 400 error mentioning response_format.json_schema ... keyword 'default' is not allowed or min_items unsupported, those two flags are the first thing to set.
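The quirks above can be gathered into one settings dictionary. This is a sketch under assumptions: the key names are copied from the notes above, the `kimi-k2.5` default and the `kimi_llm_kwargs` helper are illustrative, and whether the chat model class in your installed browser_use version accepts all of these as constructor arguments should be checked against the bundled run_agent_kimi.py.

```python
import os


def kimi_llm_kwargs(env=None):
    """Collect the Kimi/Moonshot settings described above into one dict."""
    env = os.environ if env is None else env
    return {
        # OPENAI_MODEL is optional; defaulting to kimi-k2.5 is an assumption.
        "model": env.get("OPENAI_MODEL", "kimi-k2.5"),
        "api_key": env.get("OPENAI_API_KEY"),
        "base_url": env.get("OPENAI_BASE_URL"),
        "temperature": 1,        # required value for kimi-k2.5
        "frequency_penalty": 0,  # required value for kimi-k2.5
        # Relax the JSON Schema sent for structured output.
        "remove_defaults_from_schema": True,
        "remove_min_items_from_schema": True,
    }


def looks_like_schema_rejection(message: str) -> bool:
    """Heuristic match on the error fragments quoted above."""
    return "response_format.json_schema" in message or "min_items" in message
```

Keeping the settings in one place makes it easier to compare them with what the bundled script passes when a 400 error appears.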

QR code extraction (login/demo pages)

Preferred order

1. Screenshot the page and crop candidate regions (fast, robust).
2. If HTML contains data:image/png;base64,..., extract and decode it.
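The second option amounts to a regex match followed by a base64 decode. The sketch below illustrates the idea and is not the bundled extract_data_images.py; the `extract_png_payloads` name is hypothetical, and only PNG data URIs are handled.

```python
import base64
import re

# Matches the base64 payload of an inline PNG data URI.
DATA_PNG = re.compile(r"data:image/png;base64,([A-Za-z0-9+/=]+)")


def extract_png_payloads(html: str) -> list[bytes]:
    """Decode every inline base64 PNG found in the HTML string."""
    images = []
    for match in DATA_PNG.finditer(html):
        try:
            images.append(base64.b64decode(match.group(1), validate=True))
        except ValueError:
            # Skip truncated or malformed payloads instead of failing the run.
            continue
    return images


# Demo with just the 8-byte PNG signature standing in for a real image.
payload = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode()
images = extract_png_payloads(f'<img src="data:image/png;base64,{payload}">')
print(len(images), images[0][:4])
```

Each decoded item can be written to a .png file and inspected like any screenshot crop.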

Crop candidates

Use scripts/crop_candidates.py to generate multiple likely QR crops from a screenshot.

source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
python skills/browser-use-local/scripts/crop_candidates.py \
  --in /home/node/.openclaw/workspace/login.png \
  --outdir /home/node/.openclaw/workspace/qr_crops
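The heuristics inside crop_candidates.py are not shown on this page. As a purely hypothetical illustration of what "candidate regions" could mean, the sketch below computes centered square boxes of a few sizes; the `candidate_boxes` name and the fractions are assumptions, not the script's actual behavior.

```python
def candidate_boxes(width, height, fractions=(0.3, 0.5, 0.7)):
    """Return centered square boxes as (left, top, right, bottom) tuples."""
    boxes = []
    short = min(width, height)
    for frac in fractions:
        # Side length is a fraction of the shorter screenshot dimension.
        side = round(short * frac)
        left = (width - side) // 2
        top = (height - side) // 2
        boxes.append((left, top, left + side, top + side))
    return boxes


boxes = candidate_boxes(1280, 720)
print(boxes)
```

Each tuple is in the (left, upper, right, lower) order that Pillow's `Image.crop` expects, so the boxes could be applied to a loaded screenshot directly.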

Extract base64-embedded images from HTML

source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
browser-use --session demo --json get html > /tmp/page_html.json
python skills/browser-use-local/scripts/extract_data_images.py \
  --in /tmp/page_html.json \
  --outdir /home/node/.openclaw/workspace/data_imgs

Troubleshooting

state shows elements: 0: use get html + regex discovery, plus screenshots; use eval to query DOM.
Page readiness timeout warnings: usually harmless; rely on screenshot + HTML.
CLI flags order: global flags go before the subcommand:
- ✅ browser-use --browser chromium --json open https://...
- ❌ browser-use open https://... --browser chromium
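When driving the CLI from Python, a small helper can enforce the flag-ordering rule above. This is a minimal sketch; `build_command` is a hypothetical helper, and it emits only flags that appear in this playbook (`--browser`, `--session`, `--json`).

```python
def build_command(subcommand, *args, session=None, browser=None, json_output=False):
    """Build a browser-use argv list with global flags before the subcommand."""
    cmd = ["browser-use"]
    if browser:
        cmd += ["--browser", browser]
    if session:
        cmd += ["--session", session]
    if json_output:
        cmd.append("--json")
    # The subcommand and its own arguments always come last.
    cmd.append(subcommand)
    cmd.extend(args)
    return cmd


cmd = build_command("open", "https://example.com", browser="chromium", json_output=True)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once browser-use is on the PATH.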
