Web Scraper as a Service

Pass. Audited by ClawScan on May 1, 2026.

Overview

This instruction-only skill helps build web scrapers, but users should supervise live scraping, the execution of generated code, and any use of anti-scraping techniques.

Before installing, be comfortable granting an agent the ability to create and run scraper code and fetch websites. Confirm that each target permits scraping, respect robots.txt and terms of service, avoid collecting personal data unless clearly authorized, and review generated scripts and sample output before delivering them to a client.
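The robots.txt check above can be done programmatically before any fetch. A minimal sketch using the standard library's `urllib.robotparser`; the `is_allowed` helper, the user-agent string, and the sample rules are illustrative assumptions, not part of the skill itself (in practice you would fetch the site's real `/robots.txt` first):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical helper: decide whether a given user agent may fetch a URL,
# based on robots.txt rules supplied as text.
def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Illustrative rules: everything under /private/ is off-limits to all agents.
rules = """User-agent: *
Disallow: /private/
"""

print(is_allowed(rules, "my-scraper", "https://example.com/private/page"))  # prints False
print(is_allowed(rules, "my-scraper", "https://example.com/public/page"))   # prints True
```

A check like this belongs at the start of every generated scraper, before any request is made.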

Findings (2)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Finding 1: Local code execution and file writes

What this means

The agent may run local Python scraping scripts and write output files, so mistakes in generated code could affect the working directory or produce unintended requests.

Why it was flagged

The skill authorizes the agent to write and run scraper code locally. This is central to the stated purpose, but it means generated code may execute and create or modify project files.

Skill content
allowed-tools: Read, Write, Edit, Grep, Glob, Bash, WebFetch, WebSearch ... Generates the scraper, runs it, cleans the data, and packages everything for the client.
Recommendation

Review generated scraper code before running it, keep work in a dedicated project directory, and confirm the requested scrape scope before execution.
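One way to enforce the "dedicated project directory" part of this recommendation is a small path guard in the generated scraper. A sketch under assumed names (`safe_output_path` and the `/tmp/scrape-job` directory are hypothetical, not from the skill):

```python
from pathlib import Path

# Hypothetical guard: refuse to write scraper output outside a dedicated
# project directory, so a buggy generated script cannot touch other files.
def safe_output_path(project_dir: str, relative_name: str) -> Path:
    root = Path(project_dir).resolve()
    target = (root / relative_name).resolve()
    # A path that escapes the root (e.g. via "..") is rejected outright.
    if root != target and root not in target.parents:
        raise ValueError(f"refusing to write outside {root}: {target}")
    return target

safe_output_path("/tmp/scrape-job", "output/data.csv")   # allowed
# safe_output_path("/tmp/scrape-job", "../other/file")   # raises ValueError
```

Resolving both paths before comparing them is what defeats `..` and symlink-style escapes; a plain string-prefix check would not.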

Finding 2: Anti-scraping techniques

What this means

A user could unintentionally violate a site's terms, robots.txt, or rate limits if these techniques are used to bypass restrictions rather than to scrape authorized content responsibly.

Why it was flagged

The skill includes scraping techniques that can be legitimate for compatibility and diagnostics but can also be misused to evade site controls if not bounded by the ethical rules later in the artifact.

Skill content
What anti-scraping measures are visible? (Cloudflare, CAPTCHAs, rate limits) ... User-Agent rotation ... at least 5 user agents
Recommendation

Use only on sites you are authorized to scrape, do not bypass CAPTCHAs or access controls, respect robots.txt and ToS, and use honest identification and conservative rate limits.
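Conservative rate limits and honest identification can both be made explicit in code. A minimal sketch, assuming a two-second minimum interval and a contact address in the User-Agent (both illustrative choices, not values from the skill):

```python
import time

# Honest identification: a descriptive User-Agent with a contact address,
# rather than rotating agents to evade detection.
HEADERS = {"User-Agent": "my-scraper/1.0 (contact: ops@example.com)"}

class Throttle:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval: float = 2.0):
        self.min_interval = min_interval
        self._last = 0.0

    def wait(self) -> float:
        """Sleep until the interval has elapsed; return seconds slept."""
        now = time.monotonic()
        delay = max(0.0, self._last + self.min_interval - now)
        if delay:
            time.sleep(delay)
        self._last = time.monotonic()
        return delay

# Usage: call throttle.wait() before each request in the scrape loop.
throttle = Throttle(min_interval=2.0)
```

This is the opposite design to the user-agent rotation flagged above: one stable, truthful identity plus pacing that a site operator would consider polite.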