Scraper

v1.0.0

Structured extraction and cleanup for public, user-authorized web pages. Use when the user wants to collect, clean, summarize, or transform content from acce...

MIT-0
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The name and description match the included scripts: fetching pages, extracting text, and saving outputs locally. No unrelated credentials, binaries, or installs are requested.
Instruction Scope
SKILL.md and the scripts restrict work to public or user-authorized pages and local-only storage. However, there is no runtime enforcement of those rules: the scripts will fetch any URL they are given (including internal IPs and localhost), and there is no robots.txt, paywall, or captcha checking, no rate limiting, and no URL validation. That is expected for a small helper, but it is a security consideration rather than an incoherence.
Install Mechanism
No install spec and no remote downloads; the skill is instruction-only with bundled Python scripts, which minimizes install risk.
Credentials
The skill requires no environment variables or credentials and writes only under ~/.openclaw/workspace/memory/scraper, consistent with the declared purpose.
Persistence & Privilege
The skill is not always-enabled and can be invoked by the user. It does create persistent local state (jobs.json and output files) under the user's home directory; this is coherent, but users should be aware of the stored files and their cleanup policy.
Assessment
This skill appears to do what it says: fetch public pages, extract text, and save results locally. Before installing or enabling it for autonomous use, consider:

1. The scripts will fetch any URL you or the agent give them; add URL validation or an allowlist if you need to block internal hosts and private IP ranges (SSRF risk).
2. There is no enforcement of the "public/user-authorized" rules; rely on agent policies or operator oversight to prevent misuse (paywall/login bypass, private endpoints).
3. Outputs are stored at ~/.openclaw/workspace/memory/scraper; check and clean that directory if sensitive data might be saved.

If you only plan manual, user-initiated runs and trust the callers, the skill is coherent and appropriate.
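The SSRF consideration above can be mitigated with a small pre-fetch check. A minimal sketch in Python (matching the skill's bundled scripts); the `is_public_url` helper is hypothetical and not part of the skill, and a production version would also need to pin the resolved address when fetching to avoid DNS rebinding:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_public_url(url: str) -> bool:
    """Return True only if the URL uses http(s) and every resolved
    address is public (not private, loopback, link-local, or reserved)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        # Unresolvable host: refuse rather than guess.
        return False
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        if (addr.is_private or addr.is_loopback
                or addr.is_link_local or addr.is_reserved):
            return False
    return True
```

Calling this before every fetch (and rejecting the job when it returns False) turns the "public pages only" instruction into an enforced rule rather than a convention.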

Like a lobster shell, security has layers — review code before you run it.


