Skill v1.0.0

ClawScan security

Intelligent web scraper · ClawHub's context-aware review of the artifact, metadata, and declared behavior.

Scanner verdict

Reviewed: Apr 11, 2026, 7:39 AM
Verdict: Review
Confidence: medium
Model: gpt-5-mini
Summary
The package is a straightforward Node.js web scraper, but the SKILL.md/metadata overpromises features (proxy pool, retries, DB storage, random delay) that are not implemented or not wired up in the code, so the skill's documentation and capabilities are inconsistent.
Guidance
This skill contains plausible scraper code (Puppeteer + Cheerio), and npm install will pull in Puppeteer (which downloads Chromium). However, the README/metadata overstate its capabilities: proxy pools, retries, database writes, and randomized anti-bot strategies are advertised but not implemented. Before installing or using:

1. Review scraper.js yourself, or run it in a sandboxed environment.
2. Avoid running npm install as root; Puppeteer/Chromium can require special flags (--no-sandbox is used in the code).
3. If you need proxy or database features, expect to modify the code and add secure credential handling.
4. Heed legal and robots.txt constraints for your scraping targets.

If you want a fully featured scraper, request clarification or a version that actually implements the advertised features and documents how credentials/config are provided.
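If you do add the missing retry logic yourself (point 3 above), a small backoff wrapper is enough to start from. This is a sketch only; `withRetry` and its parameters are illustrative and are not part of the skill's code.

```javascript
// Retry an async operation with exponential backoff.
// attempts: total number of tries; baseMs: initial delay, doubled each retry.
async function withRetry(fn, attempts = 3, baseMs = 500) {
  let lastErr;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (i < attempts - 1) {
        // Wait baseMs, 2*baseMs, 4*baseMs, ... before the next try.
        await new Promise((resolve) => setTimeout(resolve, baseMs * 2 ** i));
      }
    }
  }
  throw lastErr;
}
```

Wrapping each page fetch in such a helper would make the advertised "automatic retries" real without touching the parsing logic.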

Review Dimensions

Purpose & Capability
Concern: The name/description promise auto-recognition, anti-bot adaptations, proxy pool support, automatic retries, and direct database storage. The code implements basic Puppeteer fetching, Cheerio parsing, simple file export, and a static random User-Agent list. It does NOT implement proxy pool usage, DB storage, retry logic, or true randomized delays, despite these appearing in the documentation; this is a mismatch between stated purpose and actual capability.
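For contrast, a truly randomized delay (the advertised but absent behavior) is only a few lines. This sketch uses hypothetical names, not functions from the skill's code:

```javascript
// Pick a random interval in [minMs, maxMs), so successive
// requests are not evenly spaced like a fixed pause would be.
function randomDelayMs(minMs, maxMs) {
  return minMs + Math.floor(Math.random() * (maxMs - minMs));
}

// Sleep for a randomized interval and report how long was waited.
async function politePause(minMs = 1000, maxMs = 3000) {
  const ms = randomDelayMs(minMs, maxMs);
  await new Promise((resolve) => setTimeout(resolve, ms));
  return ms;
}
```

The static User-Agent list in the code is a weaker measure than this: rotating a UA string does nothing about request timing patterns.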
Instruction Scope
Concern: SKILL.md instructs npm install and running scraper.js (consistent). However, the documentation advertises features (IP proxy pool, direct DB storage, configurable randomized delays/retries) that the runtime instructions and code do not actually support. The runtime code reads a local config file and writes outputs to local files (JSON/CSV/Excel) only; it does not access external endpoints other than the target URLs, nor does it read environment variables or other system config.
Install Mechanism
Note: No explicit install spec in the registry (instruction-only), but package.json depends on puppeteer, which downloads Chromium during npm install. This is expected for a scraper but increases install size and can pull large binaries. No external, untrusted download URLs; standard npm dependencies are used.
Credentials
Concern: The metadata requires no environment variables or credentials, which matches the code. However, the documentation claims proxy pool and direct DB-storage features that typically require credentials/config; those are neither requested nor implemented. This mismatch can mislead users about what secrets/config are needed and may lead to credentials being bolted on later without clear handling in the code.
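If proxy or database credentials are added later, they should come from the environment rather than being hard-coded or written into the config file. A minimal sketch (the variable name PROXY_URL is an assumption, not something the skill defines):

```javascript
// Read a required secret from the environment and fail fast if it is
// missing, so credentials never end up hard-coded in config files.
function requireEnv(name) {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Hypothetical usage: const proxyUrl = requireEnv("PROXY_URL");
```

Failing at startup with a clear message is preferable to silently running without the proxy the user expected.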
Persistence & Privilege
OK: Does not request persistent or always-on privilege. It is user-invocable and not set to always: true. The skill only runs when invoked and writes output files to disk, which is expected behavior for a CLI scraper.