Web Scraper
Pass
Audited by ClawScan on May 1, 2026.
Overview
The skill is coherently described as a web-scraping helper, with disclosed network, filesystem, script-generation, and optional LLM/API-key use, but users should review those capabilities before installing.
This appears reasonable for a web-scraping skill, but install it only if you are comfortable with network access, generated local scripts/output files, and optional OpenRouter-based LLM processing. Review generated code before running bulk crawls, respect target-site rules and rate limits, and avoid using the LLM stage on sensitive private pages.
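As an illustration of what "respect target-site rules and rate limits" can look like in a generated script, here is a minimal sketch using Python's standard-library robots.txt parser. The function names, the `example-bot` user agent, and the one-second default delay are assumptions made for this example; they are not taken from the skill itself.

```python
from urllib.robotparser import RobotFileParser


def build_policy(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def delay_for(parser: RobotFileParser, user_agent: str,
              default_delay: float = 1.0) -> float:
    """Return the site's Crawl-delay if declared, else a fallback.

    The 1.0 second fallback is an assumed conservative default.
    """
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default_delay


# Hypothetical robots.txt content, used only for this example.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

policy = build_policy(SAMPLE_ROBOTS)
allowed_public = policy.can_fetch("example-bot", "https://example.com/public/page")
allowed_private = policy.can_fetch("example-bot", "https://example.com/private/data")
pause_seconds = delay_for(policy, "example-bot")
```

A crawl loop would skip any URL for which `can_fetch` returns False and call `time.sleep(pause_seconds)` between requests.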
Findings (4)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
The agent may run local environment checks and generate scraping files when the user invokes the skill.
The skill directs local command use and script creation as part of the scraping workflow. This is expected for the stated purpose, but it gives the agent operational authority over the local environment.
Before writing any scraping script or running any command... Survey the environment... pip list... npx playwright install --dry-run
Use it in a project directory you are comfortable modifying, and review generated scripts before running large crawls.
If optional LLM extraction is used, generated scripts may access the user's OpenRouter API key from the environment.
The skill may rely on an API key for optional entity extraction. This credential use is disclosed and purpose-aligned, and the artifact also says credential files should not be read directly.
The optional Stage 5 (LLM entity extraction) requires an OPENROUTER_API_KEY environment variable — but only in the generated scripts
Provide only the intended API key, monitor provider usage/costs, and avoid putting credentials into generated files.
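The recommendation to keep credentials out of generated files can be sketched as follows. This is a hedged example, not the skill's actual code: the helper name `load_api_key` and the optional `environ` parameter (added so the function can be exercised without touching the real environment) are assumptions. Only the variable name OPENROUTER_API_KEY comes from the skill's documentation.

```python
import os
from typing import Mapping, Optional


def load_api_key(env_var: str = "OPENROUTER_API_KEY",
                 environ: Optional[Mapping[str, str]] = None) -> str:
    """Read the API key from the environment at run time.

    The key is never written into the script or an output file; if it is
    missing, the script stops before making any network request.
    """
    source = os.environ if environ is None else environ
    key = source.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it in the shell instead of "
            "hard-coding it in the script."
        )
    return key
```

A generated script would call `load_api_key()` once at startup and pass the returned value to the HTTP client, so the key lives only in process memory.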
Content processed in optional LLM extraction may leave the local machine and be handled by an external provider.
Optional LLM entity extraction implies scraped page text may be sent to an external LLM provider through generated scripts. This is disclosed and aligned with the skill's purpose, but users should understand the data flow.
[STAGE 5] Entity Extraction (LLM) — optional
Do not use the LLM stage on private, confidential, or access-controlled content unless the provider and policy are acceptable.
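One way to enforce this in practice is to gate pages before they reach the LLM stage. The sketch below is an assumption-laden illustration: the function `may_send_to_llm`, the host allowlist, and the list of sensitive path prefixes are all hypothetical and would need to be adapted to the actual crawl.

```python
from typing import Iterable
from urllib.parse import urlsplit

# Hypothetical path prefixes that usually indicate access-controlled pages.
SENSITIVE_PATH_PREFIXES = ("/account", "/admin", "/login", "/internal")


def may_send_to_llm(url: str, allowed_hosts: Iterable[str]) -> bool:
    """Return True only for URLs on an explicit allowlist of public hosts
    whose path does not start with a known sensitive prefix."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in {h.lower() for h in allowed_hosts}:
        return False
    path = parts.path.lower()
    return not any(path.startswith(prefix) for prefix in SENSITIVE_PATH_PREFIXES)
```

Pages that fail the check can still be scraped and stored locally; they are simply excluded from the optional extraction step that sends text to an external provider.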
Users have less registry-level provenance information to verify who maintains the skill and where it came from.
The registry-level source and homepage are absent, while the included claw.json separately lists a GitHub homepage and binary requirements. This is a provenance/metadata consistency issue, not evidence of malicious behavior.
Source: unknown; Homepage: none
Verify the publisher and repository before relying on the skill for important or large-scale scraping tasks.
