Scrapling - Stealth Web Scraper
Web scraping using Scrapling — a Python framework with anti-bot bypass (Cloudflare Turnstile, fingerprint spoofing), adaptive element tracking, stealth headl...
MIT-0 · Free to use, modify, and redistribute. No attribution required.
⭐ 0 · 635 · 6 current installs · 6 all-time installs
by Damir Armanov (@Damirikys)
Security Scan
OpenClaw
Benign (high confidence)
Purpose & Capability
Name/description (stealth web scraping, Cloudflare bypass, JS rendering) align with the included script and SKILL.md. The skill's instructions to install scrapling, a stealth Playwright fork (patchright), and Chromium are proportionate to the described functionality.
Instruction Scope
Instructions are explicit about installing packages, downloading Chromium, and optionally starting an MCP local HTTP server. The skill also documents 'auto_save' which persists element fingerprints to disk. These behaviors are relevant to the stated purpose but do increase local persistence and expose a local network endpoint if MCP is started — the SKILL.md warns to only start MCP when explicitly needed.
Install Mechanism
There is no registry install spec, but SKILL.md directs the user to run 'pip install scrapling[all]' and 'patchright install chromium'. Installing from PyPI and running a package-provided installer that downloads a browser binary is expected for this capability, but users should be aware that PyPI packages execute arbitrary install-time code and that 'patchright install chromium' fetches ~100 MB of browser binaries.
Credentials
The skill declares no required env vars, credentials, or config paths. The behaviour (session/cookie handling, optional local MCP server, disk persistence for fingerprints) is consistent with no additional secret access being requested.
Persistence & Privilege
The skill does not request always:true or elevated platform privileges. However, optional features (MCP local HTTP server and auto_save fingerprints) create persistent local state and expose a local endpoint if used; the SKILL.md explicitly warns to start these only when trusted.
Assessment
This skill appears internally consistent for a stealth web scraper, but it carries the normal risks of such tools. Before installing:
1. Confirm you trust the scrapling and patchright PyPI packages and their maintainers (review their GitHub/PyPI pages and recent activity).
2. Only run stealth/dynamic modes on sites you are authorized to scrape; bypassing anti-bot protections can violate terms or laws.
3. Be cautious with the 'patchright install chromium' step (downloads binaries) and with enabling the MCP server (it opens a local HTTP service).
4. Run installs in an isolated environment (virtualenv or container) and inspect the installed package contents if you need higher assurance.
If you want, provide the upstream GitHub/PyPI links and I can check them for suspicious patterns or supply commands to verify package integrity before installing.
Like a lobster shell, security has layers: review code before you run it.
Current version: v1.0.3
Download zip (latest)
License
MIT-0
Free to use, modify, and redistribute. No attribution required.
SKILL.md
Scrapling Skill
Source: https://github.com/D4Vinci/Scrapling (open source, MIT-like license)
PyPI: scrapling — install before first use (see below)
⚠️ Only scrape sites you have permission to access. Respect robots.txt and Terms of Service. Do not use stealth modes to bypass paywalls or access restricted content without authorization.
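As a concrete way to honor the robots.txt warning above, a path can be checked against a site's rules before fetching. This sketch uses only the Python standard library; the rules shown are an illustrative example, not fetched from a real site.

```python
# Check a URL path against robots.txt rules before scraping.
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content (in real use, fetch it from the site).
robots_txt = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(parser.can_fetch("*", "https://example.com/public/page"))   # True
print(parser.can_fetch("*", "https://example.com/private/data"))  # False
```

In real use you would call `parser.set_url("https://example.com/robots.txt")` followed by `parser.read()` instead of parsing a literal string.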
Installation (one-time, confirm with user before running)
pip install scrapling[all]
patchright install chromium # required for stealth/dynamic modes
- scrapling[all] installs patchright (a stealth fork of Playwright, bundled as a PyPI package, not a typo), curl_cffi, MCP server deps, and an IPython shell.
- patchright install chromium downloads Chromium (~100 MB) via patchright's own installer (same mechanism as playwright install chromium).
- Confirm with the user before running: installs ~200 MB of dependencies and browser binaries.
Script
scripts/scrape.py — CLI wrapper for all three fetcher modes.
# Basic fetch (text output)
python3 ~/skills/scrapling/scripts/scrape.py <url> -q
# CSS selector extraction
python3 ~/skills/scrapling/scripts/scrape.py <url> --selector ".class" -q
# Stealth mode (Cloudflare bypass) — only on sites you're authorized to access
python3 ~/skills/scrapling/scripts/scrape.py <url> --mode stealth -q
# JSON output
python3 ~/skills/scrapling/scripts/scrape.py <url> --selector "h2" --json -q
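For readers who want to see how the flags above fit together, here is a hypothetical sketch of the argument interface the wrapper appears to expose, inferred only from the examples; the real scripts/scrape.py may differ.

```python
# Hypothetical sketch of the CLI wrapper's argument interface
# (inferred from the usage examples; not the actual scripts/scrape.py).
import argparse

def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(description="Scrapling CLI wrapper (sketch)")
    p.add_argument("url", help="page to fetch")
    p.add_argument("--mode", choices=["http", "stealth", "dynamic"],
                   default="http", help="fetcher mode")
    p.add_argument("--selector", help="CSS selector to extract")
    p.add_argument("--json", action="store_true", help="emit JSON instead of text")
    p.add_argument("-q", "--quiet", action="store_true", help="suppress logging")
    return p

args = build_parser().parse_args(["https://example.com", "--mode", "stealth", "-q"])
print(args.mode, args.quiet)  # stealth True
```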
Fetcher Modes
- http (default) — Fast HTTP with browser TLS fingerprint spoofing. Most sites.
- stealth — Headless Chrome with anti-detect. For Cloudflare/anti-bot.
- dynamic — Full Playwright browser. For heavy JS SPAs.
When to Use Each Mode
- web_fetch returns 403/429/Cloudflare challenge → use --mode stealth
- Page content requires JS execution → use --mode dynamic
- Regular site, just need text/data → use --mode http (default)
Python Inline Usage
For custom logic beyond the CLI, write inline Python. See references/patterns.md for:
- Adaptive scraping (auto_save/adaptive, saves element fingerprints locally)
- Session/cookie handling
- Async usage
- XPath, find_similar, attribute extraction
Notes
- MCP server (scrapling mcp): starts a local network service for AI-native scraping. Only start if explicitly needed and trusted; it exposes a local HTTP server.
- auto_save=True: persists element fingerprints to disk for adaptive re-scraping. Creates local state in the working directory.
- Stealth/dynamic modes use headless Chromium; no xvfb-run needed.
- For large-scale crawls, use the Spider API (see Scrapling docs).
Files
3 total
