Web Scraper

Pass. Audited by ClawScan on May 1, 2026.

Overview

The skill is coherently described as a web-scraping helper, with disclosed network, filesystem, script-generation, and optional LLM/API-key use, but users should review those capabilities before installing.

This appears reasonable for a web-scraping skill, but install it only if you are comfortable with network access, generated local scripts/output files, and optional OpenRouter-based LLM processing. Review generated code before running bulk crawls, respect target-site rules and rate limits, and avoid using the LLM stage on sensitive private pages.
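As an illustration of respecting target-site rules and rate limits, the following is a minimal sketch using Python's standard urllib.robotparser. The function names and the example-bot user agent are assumptions made for this example; they are not taken from the skill or its generated scripts.

```python
from urllib.robotparser import RobotFileParser


def build_policy(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser: RobotFileParser, user_agent: str, urls: list[str]) -> list[str]:
    """Keep only the URLs that robots.txt permits for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def polite_delay(parser: RobotFileParser, user_agent: str, default_seconds: float = 1.0) -> float:
    """Use the site's Crawl-delay if it declares one, else a conservative default."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default_seconds


if __name__ == "__main__":
    # Hypothetical robots.txt content, inlined so the sketch needs no network access.
    robots = "User-agent: *\nDisallow: /private/\nCrawl-delay: 2\n"
    policy = build_policy(robots)
    candidates = ["https://example.com/a", "https://example.com/private/x"]
    print(allowed_urls(policy, "example-bot", candidates))
    print(polite_delay(policy, "example-bot"))
```

A crawl loop would then sleep for polite_delay(...) seconds between requests and fetch only the filtered URLs.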

Findings (4)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

The agent may run local environment checks and generate scraping files when the user invokes the skill.

Why it was flagged

The skill directs local command use and script creation as part of the scraping workflow. This is expected for the stated purpose, but it gives the agent operational authority on the local environment.

Skill content
Before writing any scraping script or running any command... Survey the environment... pip list... npx playwright install --dry-run
Recommendation

Use it in a project directory you are comfortable modifying, and review generated scripts before running large crawls.
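To give a sense of what a read-only environment survey can look like, here is a small sketch that checks for importable modules and binaries on PATH without installing or modifying anything. It is not the skill's own procedure (the skill content refers to pip list and npx playwright install --dry-run); the survey_environment helper and the names passed to it are assumptions for illustration.

```python
import importlib.util
import shutil


def survey_environment(modules: list[str], binaries: list[str]) -> dict[str, dict[str, bool]]:
    """Report which modules are importable and which binaries are on PATH.

    Read-only: nothing is installed, executed, or written to disk.
    """
    return {
        "modules": {name: importlib.util.find_spec(name) is not None for name in modules},
        "binaries": {name: shutil.which(name) is not None for name in binaries},
    }


if __name__ == "__main__":
    # The module and binary names here are examples only.
    report = survey_environment(["json", "requests"], ["npx", "python3"])
    for kind, results in report.items():
        for name, present in results.items():
            print(f"{kind}: {name}: {'found' if present else 'missing'}")
```

Reviewing a report like this before any script is generated makes it easier to see what the agent would rely on in the local environment.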

What this means

If optional LLM extraction is used, generated scripts may access the user's OpenRouter API key from the environment.

Why it was flagged

The skill may rely on an API key for optional entity extraction. This credential use is disclosed and purpose-aligned, and the artifact also says credential files should not be read directly.

Skill content
The optional Stage 5 (LLM entity extraction) requires an OPENROUTER_API_KEY environment variable — but only in the generated scripts
Recommendation

Provide only the intended API key, monitor provider usage/costs, and avoid putting credentials into generated files.
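One way to keep the credential out of generated files is to read it from the environment at run time and to mask it in any log output. The sketch below assumes only the OPENROUTER_API_KEY variable name quoted above; load_openrouter_key and mask are hypothetical helpers, not part of the skill.

```python
import os
from collections.abc import Mapping
from typing import Optional


def load_openrouter_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Read the key from the environment instead of hard-coding it in a script."""
    source = os.environ if env is None else env
    key = source.get("OPENROUTER_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENROUTER_API_KEY is not set; skip the optional LLM stage")
    return key


def mask(secret: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the key never appears whole in logs."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


if __name__ == "__main__":
    try:
        print("Using key:", mask(load_openrouter_key()))
    except RuntimeError as err:
        print(err)
```

Raising when the variable is missing lets a generated script skip the optional stage cleanly rather than fall back to a key stored on disk.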

What this means

Content processed in optional LLM extraction may leave the local machine and be handled by an external provider.

Why it was flagged

Optional LLM entity extraction implies scraped page text may be sent to an external LLM provider through generated scripts. This is disclosed and aligned with the skill's purpose, but users should understand the data flow.

Skill content
[STAGE 5] Entity Extraction (LLM) — optional
Recommendation

Do not use the LLM stage on private, confidential, or access-controlled content unless the provider and policy are acceptable.
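A simple guard can make this recommendation explicit in code: the LLM stage runs only when it is switched on deliberately and the page is not under a path treated as private. The prefixes and the may_send_to_llm function below are hypothetical, chosen for illustration; a real deployment would need its own definition of sensitive content.

```python
from urllib.parse import urlparse

# Hypothetical path prefixes treated as private; adjust to the target site.
PRIVATE_PATH_PREFIXES = ("/account", "/admin", "/internal")


def may_send_to_llm(
    url: str,
    llm_enabled: bool,
    private_prefixes: tuple[str, ...] = PRIVATE_PATH_PREFIXES,
) -> bool:
    """Return True only if the LLM stage is opted in and the URL is not under a private path."""
    if not llm_enabled:
        return False
    path = urlparse(url).path or "/"
    return not any(path.startswith(prefix) for prefix in private_prefixes)


if __name__ == "__main__":
    for page in ("https://example.com/blog/post", "https://example.com/account/settings"):
        print(page, may_send_to_llm(page, llm_enabled=True))
```

A path denylist is only a coarse filter; it does not detect sensitive text on otherwise public pages, so the provider and policy review still applies.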

What this means

Users have less registry-level provenance information to verify who maintains the skill and where it came from.

Why it was flagged

The registry-level source and homepage are absent, while the included claw.json separately lists a GitHub homepage and binary requirements. This is a provenance/metadata consistency issue, not evidence of malicious behavior.

Skill content
Source: unknown; Homepage: none
Recommendation

Verify the publisher and repository before relying on the skill for important or large-scale scraping tasks.