Scrapling - Stealth Web Scraper
Audited by ClawScan on May 10, 2026.
Overview
The skill is transparent about scraping, but it is designed to use stealth browser automation to bypass anti-bot protections, so it needs careful, permission-limited use.
Review before installing. Use this skill only for sites you own or have explicit permission to scrape, especially before using stealth mode. Install dependencies in an isolated environment, avoid embedding real credentials in scripts, and do not start the MCP server unless you trust the local environment.
Findings (5)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
The agent could scrape sites in ways that violate Terms of Service, bypass intended bot controls, or get the user's IP/account blocked.
The skill explicitly instructs the agent to use stealth scraping when sites block normal fetching, which can evade bot-protection controls on arbitrary targets.
"anti-bot bypass (Cloudflare Turnstile, fingerprint spoofing)" ... "web_fetch returns 403/429/Cloudflare challenge → use `--mode stealth`"
Use stealth mode only for sites you own or have explicit permission to scrape. Require explicit per-domain approval and prefer normal HTTP fetching whenever possible.
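The per-domain approval gate recommended above can be sketched in a few lines. This is a hypothetical helper, not part of Scrapling: the `ALLOWED_STEALTH_DOMAINS` allowlist is something the user would maintain by hand, and the actual fetch call is elided.

```python
from urllib.parse import urlparse

# Hypothetical allowlist of domains the user owns or has permission to scrape.
ALLOWED_STEALTH_DOMAINS = {"example.com", "docs.example.org"}

def stealth_allowed(url: str) -> bool:
    """Return True only if the URL's host is on the explicit allowlist."""
    host = urlparse(url).hostname or ""
    # Match the approved domain exactly, or any subdomain of it.
    return any(host == d or host.endswith("." + d) for d in ALLOWED_STEALTH_DOMAINS)

def choose_mode(url: str) -> str:
    """Prefer normal HTTP fetching; fall back to stealth only with approval."""
    return "stealth" if stealth_allowed(url) else "http"
```

Keeping the allowlist explicit in code makes every stealth target a deliberate, reviewable decision rather than an automatic fallback.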
Installing the skill can add a large browser runtime and third-party Python packages to the user's environment.
The setup installs unpinned PyPI dependencies and downloads a Chromium binary through patchright; this is disclosed and purpose-aligned, but it requires trusting those upstream packages.
pip install "scrapling[all]"
patchright install chromium  # required for stealth/dynamic modes
Install in a virtual environment, pin versions where possible, and verify the Scrapling/patchright sources before installation.
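Once versions are pinned, a small check can confirm the environment matches the pins before a run. This sketch uses only the standard library; the version strings in the commented example are placeholders, not recommendations.

```python
from importlib import metadata

def check_pins(pins: dict[str, str]) -> list[str]:
    """Return a list of problems: missing packages or version mismatches."""
    problems = []
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if installed != wanted:
            problems.append(f"{name}: have {installed}, pinned {wanted}")
    return problems

# Placeholder pins -- substitute the versions you actually verified:
# problems = check_pins({"scrapling": "0.2.9", "patchright": "1.0.0"})
```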
If used with real credentials, the agent may access account-specific pages and data.
The reference patterns show authenticated session/cookie use. There is no evidence of credential leakage, but using account credentials changes the permission boundary.
session.get('https://example.com/login', data={'user': 'x', 'pass': 'y'}) ... session.get('https://example.com/dashboard')
Only provide credentials for accounts and sites you are authorized to scrape, and avoid storing credentials in reusable scripts or logs.
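One way to keep credentials out of reusable scripts is to read them from the environment at run time. The variable names `SCRAPE_USER` and `SCRAPE_PASS` below are hypothetical; this sketch only covers credential loading, not Scrapling's session API.

```python
import os

def get_scrape_credentials() -> tuple[str, str]:
    """Read credentials from the environment instead of hardcoding them.

    Set the variables only in the shell session that runs the scrape,
    so they never land in scripts, version control, or logs.
    """
    user = os.environ.get("SCRAPE_USER")
    password = os.environ.get("SCRAPE_PASS")
    if not user or not password:
        raise RuntimeError("Set SCRAPE_USER and SCRAPE_PASS before running.")
    return user, password
```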
If started in an untrusted or exposed environment, other local agents or processes might be able to use the scraping service.
The optional MCP mode exposes scraping as a local HTTP/tool service; the artifacts warn about trust but do not define authentication or network binding details.
The MCP server starts a local HTTP service. Only use in trusted environments. ... Add to OpenClaw MCP config
Start the MCP server only when needed, keep it bound to localhost, use it in trusted environments, and stop it after use.
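The localhost-binding advice can be sanity-checked in code. This sketch uses the standard library's `http.server` as a stand-in for the MCP service (the artifacts do not document the actual server's binding options), binding explicitly to the loopback interface and shutting down when done.

```python
from http.server import HTTPServer, BaseHTTPRequestHandler

# Bind explicitly to the loopback interface so nothing off-host can connect.
# Port 0 asks the OS for a free port; a real config would use a fixed one.
server = HTTPServer(("127.0.0.1", 0), BaseHTTPRequestHandler)
host, port = server.server_address

# Refuse to serve if the binding is not loopback (e.g. 0.0.0.0).
assert host == "127.0.0.1"

# Stop the service as soon as it is no longer needed.
server.server_close()
```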
Stored fingerprints could reveal scraping targets or cause future runs to rely on stale or manipulated page structure data.
Adaptive scraping can create local persistent state that may influence future scraping behavior.
`auto_save=True`: persists element fingerprints to disk for adaptive re-scraping. Creates local state in working directory.
Know where adaptive state is stored, keep it project-scoped, and delete it when it is no longer needed.
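Keeping the adaptive state project-scoped and disposable can be as simple as routing it through a known directory and deleting that directory after a run. The `.scrapling_state` path below is a hypothetical choice; Scrapling's actual storage location and format are not specified in the artifacts.

```python
import shutil
from pathlib import Path

# Hypothetical project-scoped location for adaptive fingerprint state.
STATE_DIR = Path(".scrapling_state")

def state_dir() -> Path:
    """Create (if needed) and return the project-scoped state directory."""
    STATE_DIR.mkdir(exist_ok=True)
    return STATE_DIR

def clear_state() -> None:
    """Delete persisted fingerprints once they are no longer needed."""
    shutil.rmtree(STATE_DIR, ignore_errors=True)
```

Deleting the directory between projects prevents stale or manipulated fingerprints from silently steering future runs.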
