Scrape
Pass
Audited by VirusTotal on May 12, 2026.
Overview
Type: OpenClaw Skill
Name: scrape
Version: 1.0.0

The skill bundle is designed for ethical and legal web scraping, providing comprehensive guidelines in `SKILL.md` for compliance (robots.txt, ToS, GDPR/CCPA, rate limiting) and implementing these safeguards in `code.md`. The code uses standard Python libraries to perform web requests, handle rate limits, and log activity, all aligned with the stated purpose. There is no evidence of malicious intent, data exfiltration to unauthorized parties, persistence mechanisms, or prompt injection attempts designed to subvert the agent's purpose; instead, the instructions actively promote responsible behavior.
Findings (0)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
The agent could help create scraping code that sends requests to websites, and may proceed even when robots.txt cannot be verified, unless the user tightens the logic.
The example can fetch arbitrary user-supplied URLs and treats any robots.txt read exception as allowed. This is central to a scraping skill, but users should keep target selection and compliance checks under explicit control.
```python
except Exception:
    return True  # No robots.txt = allowed
...
response = session.get(url)
```

Only scrape public, authorized targets; prefer official APIs; respect terms and robots.txt; and consider failing closed or asking the user when robots.txt cannot be retrieved.
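A fail-closed variant of that check can be sketched as follows. This is an illustrative helper (the function name `can_fetch` and its signature are assumptions, not the skill's actual code): on any error retrieving robots.txt it returns False, so ambiguous targets are skipped rather than scraped.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

def can_fetch(url: str, user_agent: str = "*") -> bool:
    """Hypothetical fail-closed robots.txt check (not the skill's code)."""
    parts = urlparse(url)
    robots_url = f"{parts.scheme}://{parts.netloc}/robots.txt"
    parser = RobotFileParser()
    parser.set_url(robots_url)
    try:
        parser.read()
    except Exception:
        # Fail closed: if robots.txt cannot be retrieved, do not scrape.
        return False
    return parser.can_fetch(user_agent, url)
```

A caller could then gate every request on this check, or surface the failure to the user for an explicit decision instead of silently proceeding.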
Scraping logs may retain sensitive URL details longer than intended.
The example logs scrape URLs and statuses for an audit trail. This is purpose-aligned, but URLs can contain query strings, identifiers, or other sensitive data if users scrape poorly scoped targets.
```python
logger.info(f"SCRAPE url={url} status={response.status_code}")
```

Avoid placing personal data in URLs, redact query strings when logging, and set clear retention/deletion rules for scrape audit logs.
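Redaction before logging can be sketched with the standard library; the helper name `redact_url` is hypothetical and not part of the skill:

```python
from urllib.parse import urlsplit, urlunsplit

def redact_url(url: str) -> str:
    """Drop the query string and fragment so identifiers never reach the log."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

# Example (assumed logger/response names):
# logger.info(f"SCRAPE url={redact_url(url)} status={response.status_code}")
```

Logging only the redacted form keeps the audit trail useful for status and path tracking while leaving tokens, user IDs, and search terms out of retained logs.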
