Gbrow

Full-featured headless browser for OpenClaw agents: navigation, accessibility-tree snapshots with clickable @refs, tabs, JS execution, and cookie import. No vision model is needed, so page reads are free and fast.

Install

openclaw skills install gbrow

Gbrow — The Browser Your AI Agent Actually Needs

A full-featured headless browser powered by Playwright and Bun. It reads pages through the accessibility tree rather than through expensive vision models.

Why Gbrow?

Traditional (screenshots + vision) | Gbrow (accessibility tree)
Screenshot → upload to GPT-4o → wait → read | ariaSnapshot() → instant structured text
~$0.01 per page read | Free (no model calls)
3-10 seconds per page | < 100 ms
Fails on API key issues | No API key required
Click by fragile CSS selector | Click by @ref (@e1, @e2, etc.)

Quick Setup

# Clone and install
git clone https://github.com/ashish797/Gbrow.git ~/.openclaw/workspace/skills/Gbrow
cd ~/.openclaw/workspace/skills/Gbrow
bash setup.sh

Or use the one-liner:

curl -fsSL https://raw.githubusercontent.com/ashish797/Gbrow/main/setup.sh | bash

How It Works

1. Start the server

cd ~/.openclaw/workspace/skills/Gbrow
bun run src/server.ts

2. Read the page (accessibility tree)

The snapshot gives you a structured view with clickable refs:

@e1 [heading] "Welcome" [level=1]
@e2 [link] "Get Started"
@e3 [button] "Sign in"
@e4 [textbox] "Search"
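
An agent usually wants these lines as structured data rather than raw text. The following is a minimal Python sketch of a parser, assuming every line follows the shape shown in the sample above (`@eN [role] "name" [key=value]`). The names `SnapshotNode` and `parse_snapshot` are hypothetical and not part of Gbrow; the real snapshot output may contain indentation or line shapes this pattern does not cover.

```python
import re
from typing import Dict, List, NamedTuple

# Assumed line shape, inferred only from the sample output:
#   @e1 [heading] "Welcome" [level=1]
SNAPSHOT_LINE = re.compile(
    r'^\s*(?P<ref>@e\d+)\s+\[(?P<role>[^\]]+)\]'
    r'(?:\s+"(?P<name>[^"]*)")?'
    r'(?P<attrs>(?:\s+\[[^\]]+\])*)\s*$'
)


class SnapshotNode(NamedTuple):
    ref: str               # e.g. "@e1"
    role: str              # e.g. "heading"
    name: str              # accessible name, "" if absent
    attrs: Dict[str, str]  # trailing [key=value] items


def parse_snapshot(text: str) -> List[SnapshotNode]:
    """Parse snapshot text into nodes, skipping lines that do not match."""
    nodes = []
    for line in text.splitlines():
        match = SNAPSHOT_LINE.match(line)
        if match is None:
            continue
        attrs = {}
        for item in re.findall(r'\[([^\]]+)\]', match.group("attrs")):
            key, _, value = item.partition("=")
            attrs[key] = value
        nodes.append(
            SnapshotNode(match.group("ref"), match.group("role"),
                         match.group("name") or "", attrs)
        )
    return nodes
```

Names containing a double quote are not handled by this pattern; such lines are skipped rather than misparsed.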

3. Click by ref

click @e2     → clicks "Get Started"
fill @e4 "query"  → types into search box
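
The snapshot-then-click loop can be sketched in Python as follows. This is an illustration under stated assumptions, not Gbrow's own client: `send` is a hypothetical transport function (command name plus argument list in, response text out) that the caller supplies, and `find_ref` and `click_by_name` are made-up helper names. The sketch takes a fresh snapshot before every click, on the assumption that refs may not stay stable after the page changes.

```python
from typing import Callable, List, Optional

# send(command, args) -> response text. The transport is injected so the
# logic can be exercised without a running browser.
Send = Callable[[str, List[str]], str]


def find_ref(snapshot: str, role: str, name: str) -> Optional[str]:
    """Return the first @ref whose line has the given role and quoted name."""
    needle = f'[{role}] "{name}"'
    for line in snapshot.splitlines():
        parts = line.strip().split(None, 1)
        if (len(parts) == 2 and parts[0].startswith("@e")
                and parts[1].startswith(needle)):
            return parts[0]
    return None


def click_by_name(send: Send, role: str, name: str) -> str:
    """Take a fresh snapshot, resolve the element to a ref, then click it."""
    snapshot = send("snapshot", ["-i"])  # -i: interactive elements only
    ref = find_ref(snapshot, role, name)
    if ref is None:
        raise LookupError(f"no {role} named {name!r} in snapshot")
    send("click", [ref])
    return ref
```

Resolving by role and name on each step keeps the agent's intent ("click the Sign in button") separate from the ref numbering of any one snapshot.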

Commands

Navigation

Command | Description | Example
goto <url> | Navigate to URL | goto https://example.com
back | History back | back
forward | History forward | forward
reload | Reload page | reload
url | Print current URL | url

Reading

Command | Description | Example
snapshot | Accessibility tree with @refs | snapshot -i (interactive only)
text | Cleaned page text | text
html [selector] | Raw HTML | html .article
links | All links as "text → href" | links
forms | Form fields as JSON | forms
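
The `links` output format described above ("text → href") is easy to turn into pairs. A small Python sketch, assuming one link per line and a separator of exactly " → " with a space on each side; `parse_links` is a hypothetical helper name, and the real output may differ in spacing.

```python
from typing import List, Tuple


def parse_links(output: str) -> List[Tuple[str, str]]:
    """Split each 'text → href' line into a (text, href) pair."""
    pairs = []
    for line in output.splitlines():
        # Split on the last arrow so link text that itself contains
        # " → " stays intact.
        text, arrow, href = line.rpartition(" → ")
        if arrow and href.strip():
            pairs.append((text.strip(), href.strip()))
    return pairs
```

Lines without the separator, including blank lines, are ignored.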

Interaction

Command | Description | Example
click <ref> | Click element | click @e3
fill <ref> <text> | Fill input | fill @e4 "hello"
select <ref> <value> | Select dropdown | select @e5 "option1"
type <ref> <text> | Type with keyboard | type @e4 "search term"
press <key> | Press key | press Enter
scroll <direction> | Scroll page | scroll down
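
Command lines written in this style map naturally onto the `{"command": ..., "args": [...]}` JSON body used by the HTTP API. The Python sketch below does that conversion with shell-style quoting rules. `to_payload` is a hypothetical helper, and the assumption that quoted text arrives at the server as a single, unquoted argument is mine rather than something the source states.

```python
import json
import shlex


def to_payload(command_line: str) -> str:
    """Convert e.g. 'fill @e4 "hello world"' into a JSON request body."""
    tokens = shlex.split(command_line)  # honours quotes like a POSIX shell
    if not tokens:
        raise ValueError("empty command")
    return json.dumps({"command": tokens[0], "args": tokens[1:]})
```

For example, `to_payload('press Enter')` yields a body with command `press` and a single argument `Enter`. Shell-style splitting is a poor fit for `js` expressions that contain quotes or spaces, so those are better passed as an explicit argument list.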

Inspection

Command | Description | Example
js <expr> | Run JavaScript | js document.title
css <sel> <prop> | Computed CSS | css .box color
attrs <ref> | Element attributes | attrs @e1
is <prop> <ref> | State check | is visible @e3

Tabs

Command | Description
tabs | List open tabs
tab N | Switch to tab N
newtab | Open new tab
closetab | Close current tab

Visual

Command | Description
screenshot | Take screenshot
responsive <w> <h> | Set viewport size
pdf | Save page as PDF

Snapshot Flags

Flag | Description
-i | Interactive elements only (buttons, links, inputs)
-c | Compact (remove empty structural nodes)
-d N | Limit tree depth
-s <sel> | Scope to CSS selector
-D | Diff against previous snapshot
-a | Annotated screenshot with ref overlays
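
When driving `snapshot` from code, it can help to build the flag list from named options instead of concatenating strings. A Python sketch, assuming the flags can be combined freely and that each flag and its value travel as separate arguments; neither is confirmed by the source, and `snapshot_args` is a hypothetical helper.

```python
from typing import List, Optional


def snapshot_args(
    interactive: bool = False,
    compact: bool = False,
    depth: Optional[int] = None,
    selector: Optional[str] = None,
    diff: bool = False,
    annotate: bool = False,
) -> List[str]:
    """Build the argument list for the snapshot command from named options."""
    args: List[str] = []
    if interactive:
        args.append("-i")
    if compact:
        args.append("-c")
    if depth is not None:
        if depth < 0:
            raise ValueError("depth must be non-negative")
        args += ["-d", str(depth)]
    if selector is not None:
        args += ["-s", selector]
    if diff:
        args.append("-D")
    if annotate:
        args.append("-a")
    return args
```

For instance, `snapshot_args(interactive=True, depth=3, selector="main")` produces `["-i", "-d", "3", "-s", "main"]`.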

HTTP API

All commands go through the HTTP API:

# Get port and token from state file
PORT=$(python3 -c "import json; print(json.load(open('.gstack/browse.json'))['port'])")
TOKEN=$(python3 -c "import json; print(json.load(open('.gstack/browse.json'))['token'])")

# Send command
curl -s -X POST "http://127.0.0.1:${PORT}/command" \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"command":"goto","args":["https://example.com"]}'
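
The same request can be made from Python with only the standard library. This is a sketch mirroring the curl call above: the state file path, the `port` and `token` keys, the `/command` endpoint, and the JSON body shape come from that example, while the function names and the choice to return the raw response text are assumptions, since the response format is not documented here.

```python
import json
import urllib.request
from pathlib import Path
from typing import List


def load_state(path: str = ".gstack/browse.json") -> dict:
    """Read the port and token that the server wrote to its state file."""
    return json.loads(Path(path).read_text())


def build_request(state: dict, command: str,
                  args: List[str]) -> urllib.request.Request:
    """Build the authenticated POST request without sending it."""
    body = json.dumps({"command": command, "args": args}).encode("utf-8")
    return urllib.request.Request(
        f"http://127.0.0.1:{state['port']}/command",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {state['token']}",
            "Content-Type": "application/json",
        },
    )


def send(command: str, args: List[str],
         state_path: str = ".gstack/browse.json",
         timeout: float = 30.0) -> str:
    """Send one command and return the response body as text."""
    request = build_request(load_state(state_path), command, args)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server running, `send("goto", ["https://example.com"])` followed by `send("snapshot", ["-i"])` would navigate and then read the page. Keeping `build_request` separate from `send` makes the request shape checkable without a live server.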

Architecture

┌─────────────┐     HTTP      ┌──────────────────┐
│  OpenClaw   │ ──────────▶  │  Gbrow Server    │
│  Agent      │              │  (Bun + Playwright)│
└─────────────┘              └────────┬─────────┘
                                      │
                                      ▼
                              ┌──────────────────┐
                              │  Chromium         │
                              │  (headless)       │
                              └──────────────────┘
                                      │
                                      ▼
                              ┌──────────────────┐
                              │ Accessibility     │
                              │ Tree (ariaSnapshot)│
                              └──────────────────┘

No vision models and no external API calls: just structured text from the browser's accessibility layer.

Credits

Built on top of gstack by Garry Tan (Y Combinator). Adapted for OpenClaw with permission, under the MIT license.

License

MIT