Skrape

v1.1.1

Ethical web data extraction with Robots Exclusion Protocol adherence, throttled scraping requests, and privacy-compliant data handling ("Scrape responsibly!").

by X@10oss
MIT-0
Security Scan
VirusTotal: Benign (view report →)
OpenClaw: Benign (high confidence)
Purpose & Capability
The name and description match the contents: SKILL.md and code.md focus on robots.txt checks, throttling/backoff, and privacy guidance. No unrelated env vars, binaries, or opaque network endpoints are requested.
Instruction Scope
Runtime instructions stay within scraping responsibilities (check robots.txt, prefer APIs, throttle, avoid PII). The example code is illustrative and internally consistent, but it contains implementation simplifications: it treats a missing robots.txt and fetch errors as "permitted", and its basic robots.txt evaluator may not fully implement the precedence/longest-match rules. These are functional caveats rather than malicious behavior.
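To illustrate the precedence caveat, here is a hedged sketch (not the skill's own code) of a robots.txt evaluator that does apply the longest-match rule from RFC 9309; it still simplifies user-agent matching to exact lowercase tokens and omits the `*`/`$` path wildcards:

```javascript
'use strict';
// Illustrative robots.txt evaluator: the most specific matching path
// prefix wins, and Allow beats Disallow on ties (RFC 9309). Remaining
// simplifications: exact-token user-agent matching, no path wildcards.
function isAllowed(robotsTxt, userAgent, path) {
  const ua = userAgent.toLowerCase();
  let rules = [];         // rules from groups naming this user-agent
  let starRules = [];     // fallback rules from "User-agent: *" groups
  let matchedSpecific = false;
  let currentAgents = [];
  let currentRules = [];
  let sawRule = false;

  const flush = () => {
    if (currentAgents.includes(ua)) { matchedSpecific = true; rules = rules.concat(currentRules); }
    if (currentAgents.includes('*')) starRules = starRules.concat(currentRules);
  };

  for (const raw of robotsTxt.split(/\r?\n/)) {
    const line = raw.replace(/#.*$/, '').trim();  // strip comments
    const idx = line.indexOf(':');
    if (!line || idx === -1) continue;
    const field = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim();
    if (field === 'user-agent') {
      // A rule line ends the previous group of User-agent lines.
      if (sawRule) { flush(); currentAgents = []; currentRules = []; sawRule = false; }
      currentAgents.push(value.toLowerCase());
    } else if (field === 'allow' || field === 'disallow') {
      if (value) currentRules.push({ allow: field === 'allow', prefix: value });
      sawRule = true;  // an empty "Disallow:" still counts as a rule line
    }
  }
  flush();

  // Only the most specific matching group applies; fall back to "*".
  const applicable = matchedSpecific ? rules : starRules;
  let best = null;
  for (const r of applicable) {
    if (!path.startsWith(r.prefix)) continue;
    if (!best || r.prefix.length > best.prefix.length ||
        (r.prefix.length === best.prefix.length && r.allow && !best.allow)) {
      best = r;
    }
  }
  return best ? best.allow : true;  // no matching rule: allowed
}
```

A stricter caller would additionally treat errors while fetching robots.txt as "disallowed" rather than "permitted", in line with the conservative handling recommended in the assessment.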
Install Mechanism
This skill is instruction-only, with no install spec and no external downloads: the lowest possible surface area. The code examples use only Node built-ins (http/https/url/console/process).
Credentials
No credentials, env vars, or config paths are requested. The sample uses a contact email in the User-Agent, which is appropriate for polite scraping but not a secret.
Persistence & Privilege
The always flag is false, and the skill does not request persistent or cross-skill privileges. It does not modify system configs or those of other skills.
Assessment
This appears to be a coherent, instruction-only scraper helper. Before using it in production:
(1) Treat code.md as example patterns, not a drop-in library: the workflow references require('./scrape'), which isn't provided, and the robots.txt parsing is simplified.
(2) Replace the example contact email with a real contact, or remove it as appropriate.
(3) Tighten the robots handling (be conservative on errors instead of assuming permission) and improve robots.txt parsing for complex rules.
(4) Never feed it, or persist, sensitive credentials or personal data unless you have a lawful basis; SKILL.md already warns about PII/GDPR.
(5) If you incorporate the example code into your system, review it for correctness, add rate limits and audit controls, and run it in a controlled environment.
Overall, there are no red flags that contradict the stated purpose.

Like a lobster shell, security has layers — review code before you run it.

Tags: latest, web-scraping, robots.txt, rate-limiting, nodejs, javascript, crawler, data-extraction, ethical-scraping, gdpr, http-client

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

Comments