Skill v1.0.7

ClawScan security

Captcha Auto · ClawHub's context-aware review of the artifact, metadata, and declared behavior.

Scanner verdict

Suspicious · Feb 25, 2026, 7:55 AM

Verdict: suspicious
Confidence: medium
Model: gpt-5-mini
Summary: The skill's behavior (local Tesseract OCR + fallback to a third‑party visual model that receives full-page screenshots) matches its description, but there are inconsistencies in the declared metadata and clear privacy risks (screenshots sent to an external API) that you should understand before installing.
Guidance: This skill appears to do what it claims (local OCR, then a remote visual-model fallback), but it takes full-page screenshots and sends them to a third-party API (an Aliyun DashScope-compatible endpoint). Before installing:
1. Be aware that screenshots may include sensitive information; do not run the skill on pages showing passwords, payment data, or personal information.
2. The skill requires a VISION_API_KEY (contradicting the registry metadata) or configuration in ~/.openclaw/openclaw.json; check which keys you store there.
3. Review the index.mjs code yourself, or run the skill in a sandboxed environment, to confirm its behavior.
4. If you must use it, consider creating a dedicated, limited-scope API key on the provider side and avoid running it against sensitive sites.
If you need higher assurance, decline installation until the metadata is corrected and the developer documents the exact data flows and config-file parsing behavior.

Review Dimensions

Purpose & Capability: concern. The skill's declared purpose (captcha recognition using local OCR with a visual-model fallback) matches the code and SKILL.md. However, the registry metadata claims 'Required env vars: none', while both SKILL.md and index.mjs require a VISION_API_KEY (or a config file) to call the remote visual model. That mismatch is misleading and reduces transparency about the secrets the skill needs.
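A mismatch like this can be checked mechanically. The sketch below is a hypothetical audit helper, not part of the skill or of ClawScan: it lists the environment variables a source file reads through `process.env` and reports any that the declared metadata omits. The function name and the sample input are illustrative assumptions, and the pattern does not catch destructuring such as `const { X } = process.env`.

```javascript
// Hypothetical audit helper: find env vars read via process.env that the
// declared metadata does not list. Not part of the skill under review.
function findUndeclaredEnvVars(sourceText, declaredEnvVars) {
  const referenced = new Set();
  // Matches process.env.NAME and process.env["NAME"] / process.env['NAME'].
  const pattern =
    /process\.env(?:\.([A-Za-z_][A-Za-z0-9_]*)|\[\s*["']([^"']+)["']\s*\])/g;
  for (const match of sourceText.matchAll(pattern)) {
    referenced.add(match[1] ?? match[2]);
  }
  const declared = new Set(declaredEnvVars);
  // Sorted so the report is stable across runs.
  return [...referenced].filter((name) => !declared.has(name)).sort();
}

// Illustrative input only; this is not the actual content of index.mjs.
const sampleSource = `
  const key = process.env.VISION_API_KEY || process.env["QWEN_API_KEY"];
`;
// The registry declared no env vars, so both names are reported.
console.log(findUndeclaredEnvVars(sampleSource, []));
```

Running a check of this kind against the published source before registry submission would surface the undeclared key requirement early.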
Instruction Scope: concern. The runtime instructions and source direct the agent to take full-page screenshots of target webpages and send the base64-encoded image to an external endpoint (an Aliyun DashScope-compatible API). The README warns against using the skill on pages with passwords or bank information, but automatic full-page screenshotting means sensitive data can still be captured and transmitted. The skill also reads the user's OpenClaw config (~/.openclaw/openclaw.json) for provider API keys.
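One way to narrow this exposure would be to capture only the captcha element instead of the whole page. The following is a minimal sketch of that alternative, assuming the skill drives the browser through a Playwright `Page` (playwright-core is listed as a dependency); `captureCaptchaOnly` and the selector are hypothetical and do not describe the skill's current code.

```javascript
// Sketch: capture only the captcha element rather than the full page.
// `page` is expected to be a Playwright Page; `captchaSelector` is a
// hypothetical selector supplied by the caller.
async function captureCaptchaOnly(page, captchaSelector) {
  const element = page.locator(captchaSelector);
  // locator.screenshot() captures only the matched element, so unrelated
  // page content (forms, account details) stays out of the image.
  const png = await element.screenshot({ type: 'png' });
  // Base64 is the encoding the report says is sent to the remote API.
  return png.toString('base64');
}
```

With this approach, only the pixels of the captcha itself would be base64-encoded and sent to the remote model, provided the element can be located reliably.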
Install Mechanism: ok. The skill ships as instructions plus code, with no installer. Dependencies are standard Node modules (playwright-core, tesseract.js) installed via npm, with no unusual download URLs or archive extraction. There is no install spec, which keeps platform-level risk low, but the package expects Node >=18 and a local Chrome/Chromium.
Credentials: concern. The skill legitimately needs an API key for the remote vision model (VISION_API_KEY / QWEN_API_KEY) and an API base URL, which is proportionate to the fallback capability. However, the manifest/registry metadata incorrectly lists no required env vars, and the code also reads ~/.openclaw/openclaw.json for provider credentials, potentially exposing other stored provider keys if present. The skill does not request unrelated credentials, but it does read user config files that may contain secrets.
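Narrowly scoped credential resolution might look like the sketch below. It prefers the environment variables named in this review and otherwise reads exactly one provider entry from an already-parsed config object. The config shape (`providers.<name>.apiKey`) and the function name are assumptions for illustration; the actual layout of ~/.openclaw/openclaw.json is not documented here.

```javascript
// Sketch of narrowly scoped credential resolution. The config shape
// ({ providers: { <name>: { apiKey } } }) is an assumption, not the
// documented layout of ~/.openclaw/openclaw.json.
function resolveVisionApiKey(env, parsedConfig, providerName) {
  // Prefer the explicit environment variables named in the review.
  const fromEnv = env.VISION_API_KEY || env.QWEN_API_KEY;
  if (fromEnv) return { key: fromEnv, source: 'env' };

  // Fall back to exactly one provider entry; other providers' keys are
  // never returned, copied, or logged.
  const entry = parsedConfig?.providers?.[providerName];
  if (entry && typeof entry.apiKey === 'string' && entry.apiKey !== '') {
    return { key: entry.apiKey, source: 'config' };
  }
  return { key: null, source: 'none' };
}
```

Returning the `source` alongside the key makes it easy to document which data flow was used, which is the kind of disclosure the guidance above asks the developer to provide.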
Persistence & Privilege: ok. The skill does not request always:true, does not attempt to modify other skills, and is user-invocable only. It writes screenshots to the workspace and expects the runtime to provide Node and Chrome, which is normal for browser automation.