Crawlee

Security checks across static analysis, malware telemetry, and agentic risk

Overview

This is a benign instruction-only Crawlee guide, but it documents scraping automation, package installation, and persistent crawler storage that users should configure responsibly.

Install or use this skill if you intend to build Crawlee-based scrapers. Before running generated crawler code, confirm you are authorized to scrape the target site, set clear crawl limits, use trusted and preferably pinned dependencies, and protect any stored datasets, proxy credentials, cookies, or session state.

Static analysis

No static analysis findings were reported for this release.

VirusTotal

VirusTotal findings are pending for this skill version.

Risk analysis

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

The skill may help build crawlers that visit many pages or work around blocking mechanisms, so misuse could create legal, terms-of-service, or operational issues for website owners.

Why it was flagged

The skill explicitly covers scraping automation, proxy/session handling, and anti-bot topics. These capabilities are central to Crawlee guidance, but they can be misused against websites if not scoped and authorized.

Skill content

automate browser navigation, handle anti-bot blocking, manage proxies or sessions for scraping ... "bypass bot detection"

Recommendation

Use it only for authorized scraping, respect robots.txt and site terms where applicable, and set limits on domains, request counts, concurrency, and data collection.
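
The domain and request-count limits recommended above can be sketched as a small guard that runs before each request is queued. This is a minimal illustration, not part of the skill or of Crawlee: `CrawlScope`, `ScopeGuard`, and the example hostnames are hypothetical names chosen for this sketch.

```typescript
// Hypothetical scope guard: the names and limits here are illustrative assumptions.
interface CrawlScope {
  allowedHosts: string[]; // hostnames you are authorized to crawl
  maxRequests: number;    // hard cap on accepted requests per run
}

class ScopeGuard {
  private readonly scope: CrawlScope;
  private accepted = 0;

  constructor(scope: CrawlScope) {
    this.scope = scope;
  }

  // Returns true (and counts the request) only when the URL parses,
  // uses http(s), targets an allowed host, and the budget is not spent.
  allow(rawUrl: string): boolean {
    let url: URL;
    try {
      url = new URL(rawUrl);
    } catch {
      return false; // unparseable input is rejected
    }
    if (url.protocol !== 'http:' && url.protocol !== 'https:') return false;
    if (!this.scope.allowedHosts.includes(url.hostname)) return false;
    if (this.accepted >= this.scope.maxRequests) return false;
    this.accepted += 1;
    return true;
  }

  // Number of requests accepted so far.
  get used(): number {
    return this.accepted;
  }
}
```

In a real project, Crawlee's own crawler options such as `maxRequestsPerCrawl` and `maxConcurrency` are the usual place for request and concurrency caps; a guard like this only adds an explicit host allow-list on top.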

What this means

Running the setup commands will install third-party dependencies and, for Playwright, browser binaries on the user's machine.

Why it was flagged

The skill provides user-directed setup commands that install packages and browser dependencies from external package ecosystems. This is expected for a Crawlee guide, but it still depends on trusted package sources.

Skill content

npx crawlee create my-crawler ... npm install crawlee playwright ... pip install 'crawlee[playwright]' ... playwright install

Recommendation

Install from trusted registries, review generated projects before running them, and consider pinning package versions in production projects.
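
One way to act on the pinning advice is to check a project's dependency map for specifiers that are not exact versions. The sketch below is an assumption-laden illustration: `findUnpinned` is a hypothetical helper, the regular expression only recognizes plain `major.minor.patch` versions with an optional prerelease suffix, and the version numbers in the usage line are placeholders rather than recommended releases.

```typescript
// Matches an exact version such as "1.2.3" or "1.2.3-beta.1".
// Ranges ("^1.2.3", "~1.2.3", ">=1"), tags ("latest"), and URLs do not match.
const EXACT_VERSION = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/;

// Hypothetical helper: returns the names of dependencies whose
// specifier is not an exact version, sorted alphabetically.
function findUnpinned(dependencies: Record<string, string>): string[] {
  return Object.keys(dependencies)
    .filter((name) => !EXACT_VERSION.test(dependencies[name]))
    .sort();
}

// Placeholder versions, for illustration only.
const unpinned = findUnpinned({ crawlee: '^3.0.0', playwright: '1.40.0' });
console.log(unpinned);
```

A lockfile committed alongside the project serves the same goal for transitive dependencies, which a check like this does not cover.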

What this means

Crawler output, cookies, and session state may remain on disk and could expose sensitive scraped content or authenticated session details if users crawl logged-in pages.

Why it was flagged

The Crawlee reference documents persistent crawler state, storage, and cookie/session handling. This is purpose-aligned for crawling, but it can retain scraped data or session information across requests or runs.

Skill content

persistCookiesPerSession: true ... storageDir: './storage' ... persistStateIntervalMillis: 60_000

Recommendation

Store crawler data in a controlled directory, avoid saving unnecessary cookies or sensitive page data, and purge datasets/session stores when no longer needed.
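
To avoid saving unnecessary cookies, one option is to filter session cookies against an allow-list before anything is written to disk. The following is a minimal sketch under that assumption; `CookieLike` and `keepAllowedCookies` are hypothetical names and do not correspond to a Crawlee API.

```typescript
// Hypothetical minimal cookie shape for this sketch.
interface CookieLike {
  name: string;
  value: string;
}

// Keeps only cookies whose name is on the allow-list,
// so everything else is dropped before persistence.
function keepAllowedCookies(
  cookies: CookieLike[],
  allowedNames: string[],
): CookieLike[] {
  const allowed = new Set(allowedNames);
  return cookies.filter((cookie) => allowed.has(cookie.name));
}
```

An allow-list is used rather than a deny-list so that cookies the crawler does not know about are discarded by default.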