Crawlee
Security checks across static analysis, malware telemetry, and agentic risk
Overview
This is a benign instruction-only Crawlee guide, but it documents scraping automation, package installation, and persistent crawler storage that users should configure responsibly.
Install or use this skill if you intend to build Crawlee-based scrapers. Before running generated crawler code, confirm you are authorized to scrape the target site, set clear crawl limits, use trusted and preferably pinned dependencies, and protect any stored datasets, proxy credentials, cookies, or session state.
Static analysis
No static analysis findings were reported for this release.
VirusTotal
VirusTotal findings are pending for this skill version.
Risk analysis
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
The skill may help build crawlers that visit many pages or work around blocking mechanisms, so misuse could create legal or terms-of-service exposure for the user and operational problems for website owners.
The skill explicitly covers scraping automation, proxy/session handling, and anti-bot topics. These capabilities are central to Crawlee guidance, but they can be misused against websites if not scoped and authorized.
automate browser navigation, handle anti-bot blocking, manage proxies or sessions for scraping ... "bypass bot detection"
Use it only for authorized scraping, respect robots.txt and site terms where applicable, and set limits on domains, request counts, concurrency, and data collection.
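The scoping advice above can be sketched as a small guard that a crawler consults before each request. This is a minimal illustration using only the Python standard library, not Crawlee's own API: the `CrawlPolicy` class, its field names, and the `example-crawler` user agent are hypothetical. Concurrency limits are not covered here and would still be set through the crawler's own configuration.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Optional
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


@dataclass
class CrawlPolicy:
    """Hypothetical guard: decides whether a URL may be fetched."""

    allowed_hosts: FrozenSet[str]           # lowercase hostnames in scope
    max_requests: int                       # hard cap on granted requests
    robots: Optional[RobotFileParser] = None
    user_agent: str = "example-crawler"     # assumed name, for illustration only
    _granted: int = field(default=0, init=False)

    def allow(self, url: str) -> bool:
        parts = urlsplit(url)
        # Only plain web URLs are in scope.
        if parts.scheme not in ("http", "https"):
            return False
        # urlsplit lowercases the hostname, so compare against lowercase entries.
        if parts.hostname not in self.allowed_hosts:
            return False
        # Honour robots.txt rules when a parsed file was supplied.
        if self.robots is not None and not self.robots.can_fetch(self.user_agent, url):
            return False
        # Stop once the request budget is spent.
        if self._granted >= self.max_requests:
            return False
        self._granted += 1
        return True
```

A crawler would call `policy.allow(url)` before enqueueing or fetching each URL and skip anything that returns `False`. Rejected URLs do not consume the request budget.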
Running the setup commands will install third-party dependencies and, for Playwright, browser binaries on the user's machine.
The skill provides user-directed setup commands that install packages and browser dependencies from external package ecosystems. This is expected for a Crawlee guide, but its safety depends on those package sources being trustworthy.
npx crawlee create my-crawler ... npm install crawlee playwright ... pip install 'crawlee[playwright]' ... playwright install
Install from trusted registries, review generated projects before running them, and consider pinning package versions in production projects.
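One way to act on the pinning recommendation is to check a requirements list for entries that lack an exact `==` version. The sketch below assumes a pip-style requirements format. The function name `unpinned_requirements` is hypothetical, and the check is deliberately simple: it ignores comments and option lines, and treats wildcard versions such as `==1.*` as unpinned.

```python
import re

# "name==version" with optional extras and an optional environment marker.
# Wildcards such as "==1.*" are deliberately not accepted as pins.
_PINNED = re.compile(
    r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9._,\s-]+\])?==[^\s;*]+\s*(;.*)?$"
)


def unpinned_requirements(lines):
    """Return the requirement lines that are not pinned to an exact version."""
    loose = []
    for raw in lines:
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        if not _PINNED.match(line):
            loose.append(line)
    return loose
```

Running this over the lines of a requirements file in CI, and failing the build when the result is non-empty, would keep production installs reproducible. Any version numbers passed in are whatever the project already uses; none are suggested here.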
Crawler output, cookies, and session state may remain on disk and could expose sensitive scraped content or authenticated session details if users crawl logged-in pages.
The Crawlee reference documents persistent crawler state, storage, and cookie/session handling. This is purpose-aligned for crawling, but it can retain scraped data or session information across requests or runs.
persistCookiesPerSession: true ... storageDir: './storage' ... persistStateIntervalMillis: 60_000
Store crawler data in a controlled directory, avoid saving unnecessary cookies or sensitive page data, and purge datasets/session stores when no longer needed.
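The purge step can be sketched as a retention sweep over the crawler's storage directory. This is a minimal standard-library example, not a Crawlee feature: the function name `purge_stale_files` and the age-based retention rule are assumptions, and the directory passed in would be whatever controlled location the crawler writes to (for instance the `./storage` path quoted in the excerpt above).

```python
import time
from pathlib import Path


def purge_stale_files(storage_dir, max_age_seconds, now=None):
    """Delete regular files under storage_dir older than max_age_seconds.

    Returns the list of removed paths. Directories and symlinks are left alone.
    """
    root = Path(storage_dir)
    if not root.is_dir():
        return []
    # Files last modified before this timestamp are considered stale.
    cutoff = (time.time() if now is None else now) - max_age_seconds
    removed = []
    for path in sorted(root.rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

For example, `purge_stale_files("./storage", 7 * 24 * 3600)` would remove files untouched for a week. Session or cookie stores that hold authenticated state may warrant a much shorter window, or deletion at the end of each run.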
