Crawlee Web Scraper
v1.0.0 · Resilient web scraper with bot-detection evasion using the Crawlee library. Use when web_fetch is blocked by rate limits or bot detection. Supports single UR...
by Bryan Tegomoh, MD, MPH (@bryantegomoh)
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan (OpenClaw)

Verdict: Benign (high confidence)

Purpose & Capability
The name and description (a Crawlee-based scraper) match the delivered artifacts: two Python scripts that use requests and Crawlee to fetch pages, and a SKILL.md describing exactly that. No unrelated credentials, binaries, or config paths are requested.
Instruction Scope
SKILL.md and the scripts are specific and scoped: they document usage and installation (pip install crawlee requests) and show that fetching targets user-supplied URLs only. The code reads a provided URLs file, runs a subprocess to call the included script, and returns JSON. There are no instructions to read unrelated system files or environment variables, or to transmit data to unexpected remote endpoints.
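The flow described above (read a URLs file, invoke the bundled script in a subprocess, return JSON) can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code; the script name 'scraper.py' and the function name are assumptions, while the 30-second timeout and 10,000-character cap come from the assessment below.

```python
import json
import subprocess
import sys

MAX_CHARS = 10_000   # cap on extracted text, per the assessment
TIMEOUT_S = 30       # subprocess timeout, per the assessment

def fetch_urls(urls_file: str, scraper_script: str = "scraper.py") -> str:
    """Read user-supplied URLs, call the bundled script once per URL, return JSON."""
    results = []
    with open(urls_file) as fh:
        for url in (line.strip() for line in fh):
            if not url:
                continue  # skip blank lines in the URLs file
            try:
                proc = subprocess.run(
                    [sys.executable, scraper_script, url],
                    capture_output=True,
                    text=True,
                    timeout=TIMEOUT_S,
                )
                results.append({
                    "url": url,
                    "ok": proc.returncode == 0,
                    "text": proc.stdout[:MAX_CHARS],  # cap extracted text
                })
            except subprocess.TimeoutExpired:
                # A hung fetch is recorded as a failure rather than aborting the run
                results.append({"url": url, "ok": False, "text": ""})
    return json.dumps(results)
```

The subprocess boundary is what makes the timeout and text cap enforceable: a misbehaving page can only stall one child process, not the whole run.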
Install Mechanism
No install mechanism beyond the SKILL.md recommendation to run 'pip install crawlee requests'. Using pip is expected for a Python library, but installing Crawlee may pull in additional runtime dependencies (Playwright and browser components), which can download browser binaries at install or first-run time. This is typical for headless-browser scrapers, but it adds network activity and disk artifacts beyond the two named packages.
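Because Crawlee can pull in transitive dependencies, one way to see what actually landed in an environment is to enumerate installed distributions with the standard library. A sketch; the watch-list of browser-tooling package names is an illustrative assumption, not an exhaustive audit:

```python
from importlib import metadata

def installed_packages() -> dict[str, str]:
    """Map each installed distribution name (lowercased) to its version."""
    return {
        dist.metadata["Name"].lower(): dist.version
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip entries with malformed metadata
    }

def flag_browser_tooling(packages: dict[str, str]) -> list[str]:
    """Return installed packages that indicate headless-browser automation."""
    watch = {"playwright", "pyppeteer", "selenium"}  # illustrative watch-list
    return sorted(name for name in packages if name in watch)
```

Running this in a fresh virtual environment before and after 'pip install crawlee requests' shows exactly which transitive packages the install added.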
Credentials
The skill declares no required environment variables or credentials, and the code does not read secrets or unrelated env vars. All requests go to user-provided target URLs, which is proportionate for a scraping tool.
Persistence & Privilege
The skill does not request 'always: true' and is user-invocable. It does not modify other skills or system-wide agent settings. Autonomous invocation is allowed by default, but it is not combined with other red flags.
Assessment
This skill appears to be what it says: a Crawlee-based fallback scraper. Before installing, be aware:
(1) It requires 'pip install crawlee requests'; Crawlee may install or later download browser tooling (Playwright or similar), which adds network activity and disk artifacts.
(2) The scripts will perform HTTP requests to any URL you provide, so do not give it URLs containing secrets, credentials, or private tokens.
(3) Scraping sites may violate terms of service or legal rules; use responsibly.
(4) The fallback uses a subprocess with a 30-second timeout and caps extracted text at 10,000 characters; adjust these if you need longer fetches.
If you need stricter controls, run the skill in an isolated environment and audit installed Python packages (or pin package versions) before use.
