Install
openclaw skills install pinchtabClawHub Security found sensitive or high-impact capabilities. Review the scan results before using.
Use this skill when a task needs browser automation through PinchTab: open a website, inspect interactive elements, click through flows, fill out forms, scrape page text, log into sites with a persistent profile, export screenshots or PDFs, manage multiple browser instances, or fall back to the HTTP API when the CLI is unavailable. Prefer this skill for token-efficient browser work driven by stable accessibility refs such as `e5` and `e12`.
openclaw skills install pinchtabCLI-first browser skill. Use pinchtab commands.
export PINCHTAB_SESSION=$(pinchtab session create --agent-id myagent) — do this once before any browser command.pinchtab nav <url> --snap — auto-starts the local server if needed, then returns tab ID + interactive snapshot in one call.pinchtab click <ref> --snap-diff — returns OK + only changed elements (most token-efficient).pinchtab text when you won't act on refs.Key optimization: Use --snap-diff on click, fill, select, back, forward, reload to get only added/changed/removed elements — most token-efficient for multi-step flows. Use --snap when you need the full snapshot (e.g., first navigation, or after major page changes). Use --text when you need prose content for verification (skips snap, returns page text directly).
--snap-diff returns the same compact format as snap, but with change markers and a header showing counts:
# Page Title | URL | 57 nodes | +2 ~1 -0
e0:link "Home"
e5:button "Submit" [+]
e12:textbox val="updated" [~]
# removed: e99
[+] = added, [~] = changed, removed refs listed at end. All valid refs are shown — no need to remember previous snapshot. Do not follow with redundant snap; only call text when you need prose content.
Fallback observation (when --snap wasn't used):
pinchtab snap — interactive elements + headings in compact format (default).pinchtab snap [selector] — scope the current-tab snapshot to one element.pinchtab snap --full — all nodes as JSON (for debugging).pinchtab text — content only (use when snap is missing prose you need).Rules: only nav <url> auto-starts the default local server; snap, text, html, find, and action commands operate on an already-running server/current tab. Explicit --server targets are never auto-started. Never act on stale refs; screenshots only for visual/debug; choose the instance/profile up front for parallel or multi-site work.
text, snap, find. Only use eval, download, upload when a simpler command cannot accomplish the task.pinchtab cookies) contains session credentials — do not log, copy, or expose cookie values to untrusted contexts. Use only when the task specifically requires cookie inspection.pinchtab network-export) may contain private URLs, auth tokens, and response bodies. Omit --body for sensitive sessions. Delete or redact export files after use.Unified selectors accepted by any element-targeting command:
e5 — from snapshot cache (fastest).#login, .btn, [data-testid="x"] — document.querySelector.xpath://button[@id="submit"] — CDP search.text:Sign In — visible text match.find:login button — natural language via /find.Auto-detection: bare eN→ref, #/./[...]→CSS, //→XPath. Use explicit css:/xpath:/text:/find: prefixes when ambiguous. HTTP API uses the same syntax in the selector field (legacy ref still accepted).
&& when you don't need intermediate output (pinchtab nav <url> --snap && pinchtab click e3 --snap-diff). Run separately when you must read refs before acting.
Pages showing "Just a moment..." etc.: POST /solve {"maxAttempts":3} (or /tabs/TAB_ID/solve). Returns immediately if no challenge is present. See api.md.
Requires explicit user approval. Do not call /solve or enable stealth features without the user confirming that challenge-solving is needed for the current task. Never enable stealthLevel without user consent.
Patterns: (1) one-off pinchtab instance start; (2) reuse profile instance start --profile work --mode headed, switch to headless after login; (3) HTTP POST /profiles then POST /profiles/<name>/start; (4) human-assisted headed login, agent reuses headless. Agent sessions: pinchtab session create --agent-id <id> or POST /sessions → set PINCHTAB_SESSION=ses_....
Session reuse safety: When reusing authenticated browser sessions established by a human, use a dedicated low-privilege profile — not the user's personal browsing profile. Confirm with the user before performing account-changing actions (password changes, payment, deletion, permissions) in a reused session. Restrict navigation to the sites needed for the task.
Config file: ~/.pinchtab/config.json. Edit it directly to change settings — no need for PINCHTAB_CONFIG or temp files.
pinchtab config show # view current config
pinchtab security # review security posture
Key settings agents may need to change:
security.allowEvaluate: enable eval command (true/false)security.allowedDomains: list of allowed hostnames (e.g. ["localhost", "127.0.0.1"])instanceDefaults.headless: run Chrome headless (true) or headed (false)pinchtab server | daemon install | health
pinchtab instances | profiles
pinchtab --server http://localhost:9868 snap -i -c # target a specific instance
pinchtab server prints READY to stdout when the browser instance is up and ready to accept commands. Read its output — it includes hints on how to get started (session creation, first nav).
pinchtab nav <url> # auto-starts default local server; flags: --snap, --new-tab, --tab <id>, --block-images, --block-ads, --dismiss-banners, --print-tab-id
pinchtab back | forward | reload # all support --snap, --snap-diff, --text, --dismiss-banners
pinchtab tab # list tabs
pinchtab tab <tab-id> # focus tab
pinchtab nav <url> --new-tab # force another tab
pinchtab tab close <tab-id>
pinchtab instance navigate <instance-id> <url>
Anonymous commands share a single current tab — if anything else navigates that tab, your next command hits the wrong page. Always create a session before your first nav:
export PINCHTAB_SESSION=$(pinchtab session create --agent-id myagent)
All subsequent commands use that session's dedicated tab automatically — no --new-tab or --tab <id> needed.
pinchtab snap [selector] # default: compact + interactive; flags: --full (JSON), -d (diff), --selector <css>, --max-tokens <n>
pinchtab text # Readability-filtered page text
pinchtab text --full # raw document.body.innerText (alias: --raw)
pinchtab text <selector> # ref / -s CSS / xpath:... — text from one element
pinchtab text --json # full JSON (url/title/truncated)
pinchtab find <query> # semantic search; --ref-only for just the ref
Guidance:
snap — default observation (compact + interactive). Returns interactive elements + headings. Prefer this over separate text calls.snap --full — all nodes as JSON; for debugging or when you need the full tree.snap -d — standalone diff from previous snapshot. Use only when you need a diff without performing an action; for any click/fill/select/back/forward/reload, --snap-diff on the action itself already gives you the authoritative post-action state.text — reading articles/dashboards when you won't act on refs. Falls back to --full when Readability drops content you need.text <selector> — read one element without pulling the whole page.find <query> — skip the snapshot when you can describe the target in a phrase. --ref-only pipes straight into click/fill/type.snap -i and full snap are numbered differently — do not mix; re-snapshot before acting if you switched modes.--block-images on nav for read-heavy tasks. Reserve screenshots/PDFs for visual verification.All interaction commands accept unified selectors (see Selectors above).
pinchtab click <selector> # flags: --snap, --snap-diff, --text, --wait-nav, --dismiss-banners (with --wait-nav), --x/--y (coords), --dialog-action accept|dismiss [--dialog-text "..."]
pinchtab dblclick <selector>
pinchtab mouse move|down|up <selector|x y> # --button left|middle|right
pinchtab mouse wheel <ms> --dx <n> --dy <n>
pinchtab drag <from> <to> # or: drag <selector> --drag-x <n> --drag-y <n>
pinchtab type <selector> <text> # keystroke events
pinchtab fill <selector> <text> # set value directly; flags: --snap, --snap-diff, --text
pinchtab press <key> # Enter, Tab, Escape, ...
pinchtab hover <selector>
pinchtab select <selector> <value|text> # flags: --snap, --snap-diff, --text; matches value attr, falls back to visible text
pinchtab scroll <pixels|direction|selector> # `scroll 1500`, `scroll down`, `scroll '#footer'`
Rules:
OK; use --json for recovery metadata. Errors go to stderr as ERROR: <cmd>: <reason>.--snap-diff with click, fill, select, back, forward, reload — returns OK + only changed elements. Use --snap when you need the full snapshot (first nav, major page change).fill for form entry; type only when the site depends on keystroke events.click --wait-nav when a click navigates. May return {"success":true} or Error 409: unexpected page navigation — treat 409 as success and verify with fresh snap/text.--dismiss-banners on nav/back/forward/reload (and on click --wait-nav) runs a best-effort pass that clicks a visible Accept all / Got it / OK / Close / Dismiss button, or removes obvious cookie/consent/dialog/overlay containers. Use when a fresh page-load shows a modal that blocks interaction (typical symptom: Error 500: action click: element is occluded). Heuristic — can misfire on pages that label legitimate UI as overlay or modal; not a substitute for an explicit selector when one is known.mouse only for drag handles, canvas widgets, or exact pointer sequences.--dialog-action accept|dismiss, --dialog-text for prompt() responses."scrollX"/"scrollY" for pixel deltas, "selector" to scroll into view — x/y are viewport coords, not deltas.GET /download?url=... returns JSON {contentType, data (base64), size, url}; only http/https; private/internal hosts blocked unless in security.downloadAllowedDomains.Use for async DOM settling (spinners, toasts, XHR).
pinchtab wait <selector> # default: visible; --state hidden to wait for disappear
pinchtab wait --text "..." | --not-text "..." # text appear / disappear
pinchtab wait --url "**/dashboard" # glob: **, *, ?
pinchtab wait --load ready-state|content-loaded|network-idle
pinchtab wait --fn "window.dataReady === true" # requires security.allowEvaluate
pinchtab wait 500 # fixed ms delay (last resort, max 30000ms)
Timeout 10s default, 30s max via --timeout <ms>. Prefer --not-text/--state hidden over polling.
pinchtab screenshot [-o path.png] [-q <jpeg-quality>] # format by extension
pinchtab pdf [-o path.pdf] [--landscape]
These operations are high-impact and gated by security policy. Do not use unless the task specifically requires them and simpler commands are insufficient.
pinchtab eval "document.title" # --await-promise for async; requires security.allowEvaluate: true
pinchtab download <url> -o /tmp/out.bin # requires security.allowDownloads: true
pinchtab upload /absolute/path -s <css> # requires security.allowUploads: true
eval: narrow read-only DOM inspection unless user asks for mutation. Blocked by default (security.allowEvaluate: false).download: prefer temp/workspace path over arbitrary filesystem. Blocked by default.upload: path must be user-provided or clearly approved. Blocked by default.Use curl only when the CLI is unavailable. See api.md for full endpoint reference.
nav --snap → fill <ref> <text> --snap-diff per field → click --wait-nav --snap-diff submit → verify with text. Always click submit; never press Enter.click --snap-diff to get only changed refs with each action — most token-efficient for flows with many steps.click "text:Accept", fill "#search" "q".text confirms success messages / navigation outcomes. Default is Readability-filtered; may drop nav, repeated headlines, short-text nodes, or collapse lists. Use text --full (raw document.body.innerText) when verifying list/grid/tab/accordion pages, the marker is short, or a default read came back missing content you saw in snap.{"clicked":true,"submitted":true} means the event fired, not that the server accepted or HTML validation passed. Verify via snap/text — or use --snap-diff on the action itself, which already reflects the post-event page state.pinchtab frame <target> sets a stateful scope inherited by subsequent selector-based snap/action/text calls. Target accepts main, an iframe ref, CSS for the iframe, a frame name, or a URL. Nested iframes need multiple hops. Full snap (no -i) flattens same-origin iframe descendants and ref-based actions work across the boundary. Cross-origin iframes aren't exposed as scopes — fall back to eval against iframe.contentDocument. text --frame <frameId> takes a 32-char hex frameId (from pinchtab frame output), not a CSS selector. One-shot read idiom: FID=$(pinchtab frame '#f' | jq -r .current.frameId); pinchtab frame main; pinchtab text --full --frame "$FID".eval → always IIFE when introducing identifiers. Top-level const/let/class collide across calls in the shared realm (SyntaxError: Identifier 'x' has already been declared). Also needed to project DOMRect into a JSON-serializable object: pinchtab eval "(() => { const r = document.querySelector('#x').getBoundingClientRect(); return {x: r.x, y: r.y, w: r.width, h: r.height}; })()". Single expressions without identifiers (document.title) are fine bare.text reads hidden nodes: both default and --full include display:none / visibility:hidden content because they read raw DOM. To confirm something is actually visible, use snap (accessibility tree respects visibility) or eval against offsetHeight / getComputedStyle().display. Common trap: pre-seeded hidden success <div> reported by text before submission.<option> by visible text, not value. select accepts either; only eval + Array.from(select.options) to debug a no-match.text:<value> selectors use JS-level search and can flake with DOM Error / context deadline exceeded on large pages. Prefer refs from a fresh snap -i -c — they resolve by backend node IDs.snap -i -c skips non-interactive descendants. For iframe interiors set a frame scope or use full snap.aria-expanded is usually on the outer container of accordions/menus, not the click trigger. Verify via the wrapper's attribute.