Install
openclaw skills install magicbrowse
Browser automation fallback through the magicbrowse CLI, with goal-driven act as the default primitive and observe/primitives only for recovery, and with changed page state verified by fresh observation.
Use magicbrowse to reach a target page when your runtime's own
page-control tool cannot do it reliably. "Page-control tool"
means a tool that drives browser pages programmatically and reports page state
back — not the user's desktop browser, and not screen-control of a
browser window. The planner runs two LLM loops per task and is slower
than direct browser control; prefer your own page-control tool
when it suffices. Use magicbrowse to reach a target page (search, navigation,
traversal through non-sensitive screens). At any login, identity, checkout,
donation, subscription, payment, or human-verification page, stop and surface
to the user — do not invent or type credentials, identity data, payment data,
or any value you do not legitimately have.
For a MagicPay product/payment workflow, use the MagicPay workflow-first recipe instead of treating a standalone MagicBrowse browser as the product parent: MagicPay starts the product session, then launches or attaches the browser as a child resource.
Try in order. Do not start at layer 4 just because primitives exist.
- magicbrowse act "<goal>": DOM-only navigator.
- magicbrowse act "<goal>" --use-vision: same goal, navigator with screenshots. Use only when the user is comfortable sending screenshots/page context for this workflow. Vision is a retry mode for the same task; keep the granule.
- magicbrowse observe + primitives: click <target-id>, type <target-id> <text>, fill <target-id> <value>, select <target-id> <option-text>, press <keys>. Use only when vision-mode act cannot make progress, or when single-element precision is required. A primitive completed result means the direct action ran; it is not a semantic proof that the intended page state changed. Re-run observe before the next decision. press is global, so click first if focus matters.
For public navigation tasks, give act the semantic goal and a checkable
terminal condition:
✓ magicbrowse act "navigate to the public page that lists supported regions and stop when the region list is visible"
Avoid manually replaying snapshot ids before act has failed:
✗ magicbrowse observe → magicbrowse click 13 → magicbrowse observe → magicbrowse click 23
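The ordering above can be sketched as a small retry ladder. This is a minimal sketch under stated assumptions: `run_act` is a hypothetical callable that invokes the CLI and returns the parsed result as a dict with a `status` field, and the choice of which statuses count as "no progress" is this sketch's own guess rather than something the skill specifies.

```python
from typing import Callable, Dict, List

# Statuses treated as "no progress" and therefore worth a vision retry.
# Assumption of this sketch; the skill text does not define this set.
RETRYABLE = {"failed", "max_steps"}


def act_with_fallback(
    goal: str,
    run_act: Callable[[List[str]], Dict],
    vision_allowed: bool,
) -> Dict:
    """Try DOM-only act first, then the same goal with --use-vision."""
    result = run_act(["magicbrowse", "act", goal])
    if result.get("status") not in RETRYABLE:
        # Completed, or a stop (handoff, approval, blocked) that the
        # orchestrator must handle instead of retrying.
        return result
    if not vision_allowed:
        # The user has not agreed to send screenshots for this workflow.
        return result
    # Same goal, same granule: vision is only a retry mode.
    return run_act(["magicbrowse", "act", goal, "--use-vision"])
```

Only after both attempts fail would the orchestrator consider observe plus primitives.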
- magicbrowse doctor first on a fresh install. It verifies the gateway config and reachability.
- magicbrowse init <apiKey> (sign up at https://agents.mercuryo.io/signup).
- launch and act once doctor passes.
Consequential actions require approval. magicbrowse may navigate, inspect, draft, and prepare. It must stop and ask before submitting a form, posting or sending content, accepting terms, changing account data or settings, booking, buying, ordering, deleting or modifying remote data, or otherwise committing an irreversible or account-affecting action. After approval, re-run observe and execute only the approved final action. A successful typed MagicPay approval counts for that exact payment, signing, or confirmation action; ask again only if the approved page facts changed.
Memory-managed data: never invent. Do not use act, type, fill, or select for any of the following on any page:
- login or signup credentials (email, username, password, OTP),
- identity-document fields (passport, ID, KYC address, DOB tied to identity),
- payment-card or banking fields (PAN, CVV, expiry, IBAN, account),
- any value sourced from a Memory or Memory store, or any value you do not legitimately have.
Reach the page, stop before entering Memory values, and return the handoff to the orchestrator or MagicPay Memory fill workflow. Do not guess, placeholder, or fabricate Memory values. Be honest about what you cannot do.
Use act before snapshot primitives. Do not start MagicBrowse work with observe plus click/type/select/press/fill before attempting act on the same goal. Why: the navigator keeps the goal, current page context, and completion check in one planner loop instead of spreading them across fragile snapshot ids. Use primitives only after DOM-only and vision-mode act cannot make progress, or when the recovery step is deliberately single-element.
Target-ids are snapshot-scoped. They are valid only for the observe snapshot that produced them. Re-run observe after any primitive that may change the page state before the next primitive; reusing an old id silently addresses a different element.
✓ observe → click 12 → observe → type 7 "hello"
✗ observe → click 12 → type 7 "hello"
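One way an orchestrator could enforce the snapshot rule is a small guard that refuses target-ids once a primitive has run. This is a sketch only: `SnapshotGuard` and its methods are hypothetical names, nothing here is part of the magicbrowse CLI, and the guard is deliberately conservative in treating every primitive as a possible page change.

```python
class SnapshotGuard:
    """Tracks whether target-ids from the last observe are still usable."""

    def __init__(self) -> None:
        self._ids = set()    # ids returned by the latest observe
        self._fresh = False  # becomes False once a primitive has run

    def observed(self, target_ids) -> None:
        """Record the ids from a fresh observe snapshot."""
        self._ids = set(target_ids)
        self._fresh = True

    def use(self, target_id: int) -> int:
        """Return the id if it is safe to use, then mark the snapshot stale."""
        if not self._fresh:
            raise RuntimeError("stale snapshot: run observe again first")
        if target_id not in self._ids:
            raise KeyError(f"target-id {target_id} not in the latest snapshot")
        self._fresh = False  # any primitive may have changed the page
        return target_id
```

With this guard, the ✗ sequence above fails loudly at the second primitive instead of silently addressing a different element.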
Primitive completion is not goal completion. For deterministic click/type/fill/select/press, status: "completed" means the browser action was dispatched through the direct action layer. It does not certify that a higher-level page condition is now true. If the next step depends on changed page state, observe again and branch on the fresh page state. If the task itself needs a completion check, use act with a checkable terminal condition instead of interpreting a primitive result as task success.
One workflow per default home. The current-session pointer at $MAGICBROWSE_HOME/current-session.json (default ~/.magicbrowse/) is a singleton. Concurrent workflows on the same home overwrite each other. For parallel use, set a distinct MAGICBROWSE_HOME per workflow, or do not run the tasks in parallel.
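A minimal sketch of giving each parallel workflow its own home, assuming the CLI reads MAGICBROWSE_HOME from the process environment as the path above suggests. The helper name `workflow_env` and the directory layout are illustrative choices, not part of magicbrowse.

```python
import os
import tempfile
from typing import Dict, Optional


def workflow_env(workflow_name: str, base_dir: Optional[str] = None) -> Dict[str, str]:
    """Return a copy of the environment with a per-workflow MAGICBROWSE_HOME."""
    # Without base_dir, each call gets its own fresh temporary root.
    root = base_dir or tempfile.mkdtemp(prefix="magicbrowse-")
    home = os.path.join(root, workflow_name)
    os.makedirs(home, exist_ok=True)
    env = dict(os.environ)  # copy, so the parent environment is untouched
    env["MAGICBROWSE_HOME"] = home
    return env
```

The returned mapping can be passed as the `env` argument of `subprocess.run` for every magicbrowse call that belongs to that workflow.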
Fresh browser by default. Prefer an owned, fresh browser session. Use attach, --profile, or --user-data-dir only when the user explicitly approves that browser/session for the current task. One exception needs no separate approval: attaching to the browser child that MagicPay launched inside the current approved product workflow; that is the normal in-workflow bind of an owned disposable browser. Keep CDP endpoints private. Close the session before unrelated work.
Page context can leave the browser. LLM-backed act sends page state to the gateway; --use-vision can include screenshots. Avoid private pages unless the user approves that workflow, and stop at login, identity, checkout, donation, subscription, or payment pages.
Contract: launch [url] → act … act → close. Sequential act calls in
one session preserve page state and planner memory.
- magicbrowse launch <url>: start a headless owned Chrome session pre-placed at the entry URL. Keep browser launches headless unless the user explicitly asks for a visible browser or you are doing live debugging. To attach to an existing CDP browser instead, first get explicit user approval for that endpoint/session: magicbrowse attach <cdp-url-or-ws-endpoint> (positional, not a --cdp-url flag).
- magicbrowse act "<goal>": natural-language browser step. Prompt is positional. act does not take --url; you cannot reset the page from inside act. To re-anchor, close and launch again.
- Repeat act for the next strategic granule.
- magicbrowse close: release the session when the overall MagicBrowse-owned browser task is done. If the workflow hands off to another tool or the user on a sensitive page, keep the browser open until that handoff completes. After the handoff completes, close only a MagicBrowse-owned disposable browser that the user is not taking over; do not close an external/user-owned attach without explicit approval.
magicbrowse run exists in the CLI for one-shot developer use. It is not part of this skill contract; its bundled close destroys continuity. Do not use it in an orchestrated workflow.
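The launch → act … act → close contract can be sketched as argument lists ready for `subprocess.run`. The helper names are hypothetical; only the subcommands and the positional URL and goal come from the text above. The sketch shows the happy path and does not model keeping the browser open during a handoff.

```python
from typing import List, Optional


def launch_cmd(url: Optional[str] = None) -> List[str]:
    """launch takes the entry URL as an optional positional argument."""
    return ["magicbrowse", "launch"] + ([url] if url else [])


def act_cmd(goal: str) -> List[str]:
    """act takes the goal as a positional prompt; it has no --url flag."""
    return ["magicbrowse", "act", goal]


def close_cmd() -> List[str]:
    """close releases the owned session."""
    return ["magicbrowse", "close"]


def session_plan(url: str, goals: List[str]) -> List[List[str]]:
    """One launch, one act per strategic granule, one close."""
    return [launch_cmd(url)] + [act_cmd(g) for g in goals] + [close_cmd()]
```

In a real orchestrator the plan would not be run blindly: the result of each act decides whether the next one runs at all.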
- End each act where the orchestrator needs the next strategic decision. Tactics (which form field first) live inside act; strategy (this partner is wrong, try another) lives between act calls.
- Keep the goal of each act small; smaller is safer. maxSteps: 100 is a safety ceiling. The planner self-validates terminal status, so longer tasks have more room for false-positive completion. Prefer smaller granules when the success criterion cannot be checked externally.
- A wall that needs the user or another handler returns status: needs_handoff, not failed. Plan tasks to end at such a wall, not through it. magicbrowse does not solve CAPTCHA and does not enter credentials. For a confirmed real CAPTCHA on the current approved browser session, have the user or an external solver clear it; after a successful solve, run magicbrowse mark-captcha-resolved before the next act. Branch on handoff.kind: captcha means solve/mark, auth means stop for user authentication, identity_verification means stop for user/KYC handling, and memory_fill means hand off to the MagicPay Memory fill workflow. Memory-fill handoffs include resumeObjective; after the approved handler fills the form, continue with that page-local objective. Never retry the same act against the same wall. If the page asks for something you cannot legitimately provide, be honest about it.
- Sequential act calls in one session preserve page state and planner memory. Do not write "as we already found, continue with…" into goals; if you feel the need to, the granularity is wrong.
✗ act "click target 14"
✓ act "click the 'Continue' button under the price summary"
✗ act "get to checkout"
✓ act "navigate to a checkout page that shows passenger fields and total fare"
- Put the entry URL in launch, not as a separate step. To switch sites mid-workflow, either close and re-launch, or describe the navigation inside the goal text.
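The handoff.kind branching described above could look roughly like this in an orchestrator. It is a sketch under assumptions: the act result is taken to be a dict with a nested `handoff` object holding `kind` and `resumeObjective`, and `next_step_for_handoff` is a hypothetical helper name.

```python
from typing import Dict

# Next step per handoff kind, paraphrasing the guidance above.
HANDOFF_ACTIONS = {
    "captcha": "have the user or an external solver clear it, then run mark-captcha-resolved",
    "auth": "stop for user authentication",
    "identity_verification": "stop for user/KYC handling",
    "memory_fill": "hand off to the MagicPay Memory fill workflow",
}


def next_step_for_handoff(result: Dict) -> Dict:
    """Map a needs_handoff result to the orchestrator's next step."""
    handoff = result.get("handoff") or {}
    kind = handoff.get("kind")
    if kind not in HANDOFF_ACTIONS:
        raise ValueError(f"unknown handoff kind: {kind!r}")
    step = {
        "kind": kind,
        "action": HANDOFF_ACTIONS[kind],
        "retry_same_act": False,  # never retry the same act against the same wall
    }
    if kind == "memory_fill":
        # After the approved handler fills the form, continue with this goal.
        step["resume_goal"] = handoff.get("resumeObjective")
    return step
```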
- Element indexes ([14], target 7) in goal text.
- magicbrowse run for orchestrated multi-step workflows.
- type/fill/select/act on Memory-managed fields. Stop at the form boundary; if act returns a memory-fill handoff, send it to the orchestrator or MagicPay Memory fill workflow and then resume with handoff.resumeObjective.
- Letting act submit, post, book, buy, save, delete, or otherwise commit an account-affecting action without explicit approval or a matching typed MagicPay approval for unchanged page facts.
- Trying to solve CAPTCHA through magicbrowse. On a confirmed real CAPTCHA, have the user or an external solver clear it, then run magicbrowse mark-captcha-resolved before the next MagicBrowse step.
- Attaching to a logged-in browser or named profile without explicit approval for the current task.
- Closing a browser that was handed to another tool or the user before the overall task is actually done.
- Re-narrating prior act results into the next goal; sequential act calls keep state.
- Skipping the act-first path and starting at layer 4 (observe + primitives).
- Reusing a target-id from before a page mutation.
- Treating a deterministic primitive's completed status as proof that the intended page state changed.
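Two of the items above concern the goal text itself and can be caught before act is called. The check below is an illustrative heuristic only: the patterns and the name `lint_goal` are assumptions of this sketch, and magicbrowse does not enforce them.

```python
import re
from typing import List

# Heuristic patterns; they will miss some cases and may flag harmless text.
_INDEX = re.compile(r"\[\d+\]|\btarget\s+\d+\b", re.IGNORECASE)
_RENARRATION = re.compile(r"\bas we (already|previously)\b", re.IGNORECASE)


def lint_goal(goal: str) -> List[str]:
    """Return a list of problems found in an act goal string."""
    problems = []
    if _INDEX.search(goal):
        problems.append("element index in goal text")
    if _RENARRATION.search(goal):
        problems.append("re-narrates earlier act results")
    return problems
```

An empty list means the goal passed these two checks, not that it has a checkable terminal condition.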
act returns status: completed | blocked | needs_handoff | needs_approval | failed | max_steps | cancelled. Branch on status;
do not parse finalMessage to detect missing input, Memory
handoff, handoff subtype, or approval stops. For blocked, branch on
blockedReason: missing_input | item_unavailable | ambiguous | no_path.
For needs_handoff, branch on
handoff.kind: memory_fill | captcha | auth | identity_verification.
Layer-4 primitives return direct action results. Branch on their status
and reason, but verify page-state assumptions with a fresh observe;
primitive completed is not a substitute for a goal-level completion check.
finalMessage is the explanation to show the user or pass upstream.
Memory-fill handoff details are in handoff.resumeObjective. Exit
code 0 includes blocked, needs_handoff, and needs_approval; it
does not mean success.
See references/statuses.md.
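The branching rules above can be sketched as a small router. This assumes the act result is available as a dict with `status`, `blockedReason`, and a nested `handoff.kind`; the exact output shape is an assumption, and the route labels are hypothetical names chosen for this example.

```python
from typing import Dict

BLOCKED_REASONS = {"missing_input", "item_unavailable", "ambiguous", "no_path"}
STOPPED = {"failed", "max_steps", "cancelled"}


def route(result: Dict) -> str:
    """Branch on structured fields only; finalMessage is never parsed."""
    status = result.get("status")
    if status == "completed":
        return "done"
    if status == "blocked":
        reason = result.get("blockedReason")
        return f"blocked:{reason}" if reason in BLOCKED_REASONS else "blocked:unknown"
    if status == "needs_handoff":
        kind = (result.get("handoff") or {}).get("kind")
        return f"handoff:{kind}"
    if status == "needs_approval":
        return "ask_user_for_approval"
    if status in STOPPED:
        return f"stopped:{status}"
    raise ValueError(f"unknown status: {status!r}")


def is_success(result: Dict) -> bool:
    """Success is status == completed; the exit code is deliberately ignored
    because exit code 0 also covers blocked, needs_handoff, and needs_approval."""
    return result.get("status") == "completed"
```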