Nmb Scrapling
Security checks across malware telemetry and agentic risk
Overview
This skill is a web-scraping helper, but it explicitly supports bypassing anti-bot protections and running large crawls, so it should be reviewed carefully before use.
Install only if you have a legitimate, authorized scraping need. Avoid using the stealth, Cloudflare-bypass, proxy, or large-crawl features against sites where you lack permission. Verify the external Scrapling package before installing, pin trusted versions if possible, and enable MCP/session persistence only with clear approval and cleanup practices.
VirusTotal
VirusTotal findings are pending for this skill version.
Risk analysis
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
The agent could help scrape protected websites in ways that violate site rules, create legal/compliance risk, or cause unwanted load on third-party services.
The skill explicitly teaches the agent to bypass third-party anti-bot controls and scale crawling activity, not just fetch user-authorized public pages.
Bypass Cloudflare/anti-bot protection ... Build large-scale crawlers ... solve_cloudflare=True
Use only on sites where you have authorization, require explicit user approval before stealth/proxy/Cloudflare-bypass use, and set clear target, rate-limit, and robots.txt/terms-of-service boundaries.
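The target and robots.txt boundaries recommended above can be sketched as a pre-flight check that runs before any fetch. This is a minimal illustration using only the Python standard library. It is not part of Scrapling. The `AUTHORIZED_HOSTS` allowlist, the helper names, and the sample robots.txt content are all assumptions made for the example.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Hypothetical allowlist: hosts the user has explicit permission to scrape.
AUTHORIZED_HOSTS = {"example.com"}


def build_robots(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that the caller has already retrieved."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(url: str, robots: RobotFileParser, user_agent: str = "*") -> bool:
    """Allow a URL only if its host is authorized and robots.txt permits it."""
    host = urlparse(url).hostname or ""
    if host not in AUTHORIZED_HOSTS:
        return False
    return robots.can_fetch(user_agent, url)


def crawl_delay_seconds(
    robots: RobotFileParser, user_agent: str = "*", default: float = 1.0
) -> float:
    """Use the site's Crawl-delay if it declares one, else a conservative default."""
    delay = robots.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


# Sample robots.txt content, invented for illustration only.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /private/
Crawl-delay: 5
"""
robots = build_robots(SAMPLE_ROBOTS)
```

A caller would check `may_fetch(url, robots)` before each request and sleep for `crawl_delay_seconds(robots)` between requests. Terms-of-service review still has to happen outside the code.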
Installing the external package could pull code and browser components that are outside the provided artifact review.
The setup relies on user-directed, unpinned external package and browser installation commands; the registry also lists no source or homepage, so provenance should be verified.
pip install "scrapling[all]"
scrapling install
Verify the Scrapling package source, pin trusted versions where possible, and review what `scrapling install` downloads before running it.
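One way to enforce a pinned version is to compare the installed package against a version that has already been reviewed, and refuse to proceed on a mismatch. The sketch below assumes such a reviewed version string exists. The function name and the injectable `lookup` parameter are hypothetical, and the check does not replace reviewing the package source itself.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Callable


def installed_matches_pin(
    package: str,
    approved_version: str,
    lookup: Callable[[str], str] = version,
) -> bool:
    """Return True only if `package` is installed at exactly `approved_version`.

    `lookup` defaults to importlib.metadata.version; it is injectable so the
    check can be exercised without the package being installed.
    """
    try:
        return lookup(package) == approved_version
    except PackageNotFoundError:
        # Not installed at all: treat as a failed check, not as a match.
        return False
```

A wrapper script could call this with the package name and the reviewed version before running any scraping task, and exit if it returns False.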
If used after logging in, the agent may fetch pages or data that are tied to a user account.
The documentation shows scraping while reusing cookies in an authenticated session, which can access account-bound data.
Keep the session (cookie reuse) ... page2 = session.fetch('https://example.com/dashboard') # logged-in state
Only use authenticated scraping with accounts you control and for data you are allowed to access; avoid sharing session cookies with untrusted workflows.
An AI client configured with this MCP server may be able to initiate scraping tasks or pass scraped content through the agent workflow.
The skill documents exposing Scrapling through an MCP server so other AI clients can call it, but the artifacts do not describe permission or approval boundaries for those calls.
Let Claude/Cursor call Scrapling directly to scrape data ... "command": "scrapling", "args": ["mcp"]
Enable the MCP server only in trusted clients and configure approval, target restrictions, and logging before allowing agent-driven scraping.
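The approval, target-restriction, and logging boundaries could take the form of a small gate that every agent-initiated scrape request passes through. The sketch below is an assumption about how such a gate might look. `ScrapeGate`, its fields, and the `approve` callback are hypothetical names and are not part of Scrapling or of any MCP client.

```python
import logging
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, List, Tuple
from urllib.parse import urlparse

logger = logging.getLogger("scrape_gate")


@dataclass
class ScrapeGate:
    """Hypothetical gate: host allowlist, explicit approval, and an audit trail."""

    allowed_hosts: FrozenSet[str]
    approve: Callable[[str], bool]  # e.g. a prompt shown to the human operator
    audit_log: List[Tuple[str, bool]] = field(default_factory=list)

    def authorize(self, url: str) -> bool:
        host = urlparse(url).hostname or ""
        # The approver is consulted only for hosts already on the allowlist.
        allowed = host in self.allowed_hosts and bool(self.approve(url))
        self.audit_log.append((url, allowed))
        logger.info("scrape request url=%s allowed=%s", url, allowed)
        return allowed
```

Because the allowlist is checked first, a request for an unlisted host is denied and logged without ever prompting the user.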
Crawls may resume from previously saved state and keep contacting target sites after the user expects the job to have ended.
The crawler can persist crawl state and resume later. This is disclosed and aligned with large-scale crawling, but it can continue work across runs if the user forgets the saved state.
MySpider(crawldir="./crawl_data").start() ... Ctrl+C to pause; run again to resume from the checkpoint
Use dedicated crawl directories, review saved state before restarting, and delete crawl data when a job is finished.
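Reviewing and deleting saved state can be done with two small helpers. This sketch treats the crawl directory as an ordinary folder of files and makes no assumption about the format Scrapling uses inside it. The function names are hypothetical.

```python
import shutil
from pathlib import Path
from typing import List, Union


def list_saved_state(crawl_dir: Union[str, Path]) -> List[str]:
    """Return the relative paths of all files saved under the crawl directory."""
    root = Path(crawl_dir)
    if not root.is_dir():
        return []
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())


def delete_saved_state(crawl_dir: Union[str, Path]) -> bool:
    """Remove the crawl directory and its contents; return False if it was absent."""
    root = Path(crawl_dir)
    if not root.is_dir():
        return False
    shutil.rmtree(root)
    return True
```

Calling `list_saved_state("./crawl_data")` before restarting a spider shows whether an earlier job left state behind, and `delete_saved_state("./crawl_data")` clears it once the job is finished.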
