Web Scraper

Pass. Audited by VirusTotal on May 8, 2026.

Overview

Type: OpenClaw Skill
Name: python-web-scraper
Version: 1.0.1

The skill bundle is a standard web scraping toolkit providing scripts for basic, paginated, and Selenium-based data extraction. The code in scripts/scrape-basic.py, scripts/scrape-pagination.py, and scripts/scrape-with-selenium.py follows best practices for scraping, including rate limiting and user-agent rotation, and lacks any indicators of data exfiltration, unauthorized execution, or malicious prompt injection.

Findings (0)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

Using these instructions could help an agent evade website bot protections, violate terms of service, trigger IP/account blocks, or create legal and operational risk.

Why it was flagged

This provides explicit anti-detection guidance for hiding Selenium automation, going beyond ordinary rate limiting or polite scraping.

Skill content

## Chrome DevTools Protocol (CDP) Tricks (Selenium)
# Bypass webdriver detection
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", { ... Object.defineProperty(navigator, 'webdriver', { get: () => undefined }) ... })

Recommendation

Use the scraper only on sites you own or are authorized to test, and remove or avoid anti-detection, CAPTCHA-solving, and proxy-evasion workflows unless there is explicit permission.
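
One way to enforce the "authorized sites only" part of this recommendation is a host allowlist checked before any request is made. The following is a minimal sketch, not part of the skill: the `AUTHORIZED_HOSTS` set, its example hosts, and the `is_authorized` helper are all hypothetical names introduced for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: hosts the operator owns or has written permission to scrape.
AUTHORIZED_HOSTS = {"example.com", "staging.example.com"}


def is_authorized(url: str, allowed_hosts: set[str] = AUTHORIZED_HOSTS) -> bool:
    """Return True only for http(s) URLs whose host is exactly on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    # urlsplit lowercases the hostname and strips any port or userinfo.
    host = (parts.hostname or "").lower()
    # Exact match only, so "example.com.evil.test" is rejected.
    return host in allowed_hosts
```

A scraper entry point could call `is_authorized(url)` and refuse to proceed when it returns False. Exact matching is a deliberate choice here: suffix or substring matching would accept look-alike hosts.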

What this means

An agent could be guided to use private login sessions or account credentials to access content that the user did not intend to expose or automate.

Why it was flagged

The skill gives instructions for using account credentials and browser/session cookies for scraping, but does not clearly scope which accounts, permissions, storage, or outputs are safe.

Skill content

### Handle login-protected pages
# Option 1: Export cookies from browser
# In browser console: document.cookie ...
s.post('https://example.com/login', data={'user': '...', 'pass': '...'})
with open('cookies.txt', 'w') as f: f.write(str(s.cookies.get_dict()))

Recommendation

Do not provide browser cookies or credentials unless the target site and account use are explicitly authorized; avoid storing session cookies in plaintext files.
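
Where authenticated access is explicitly authorized, credentials can be supplied at run time rather than written into scripts or files. The sketch below assumes two hypothetical environment variables, `SCRAPER_USER` and `SCRAPER_PASS`, and a hypothetical `load_credentials` helper; none of these appear in the skill.

```python
import os
from typing import Mapping


def load_credentials(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Read login details from the environment and fail loudly if they are absent.

    SCRAPER_USER and SCRAPER_PASS are hypothetical variable names.
    """
    try:
        return env["SCRAPER_USER"], env["SCRAPER_PASS"]
    except KeyError as missing:
        # Report which variable is missing without echoing any secret value.
        raise RuntimeError(f"credential variable {missing} is not set") from None
```

With this approach the session object can hold its cookies in memory for the life of the process, so nothing like cookies.txt needs to be written to disk.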

What this means

The contradictory framing may make users or agents underestimate the risk of using authenticated sessions for scraping.

Why it was flagged

The skill provides instructions for handling login-protected pages, then later says never to scrape login-protected content, which creates conflicting guidance about safe use.

Skill content

### Handle login-protected pages ... Export cookies from browser ...
...
## Ethics & Legal
- Never scrape login-protected content, personal data, or copyrighted material

Recommendation

Clarify that authenticated scraping should only occur with explicit authorization, and remove examples that conflict with the stated ethical boundary.

What this means

Running the Selenium script may fetch third-party browser-driver components in addition to the Python packages.

Why it was flagged

The Selenium helper downloads or locates a ChromeDriver through webdriver-manager at runtime; this is expected for the feature but introduces dependency/provenance risk.

Skill content

from webdriver_manager.chrome import ChromeDriverManager
...
service = Service(ChromeDriverManager().install())

Recommendation

Pin package versions where possible, review webdriver-manager behavior, and use trusted package repositories or preinstalled drivers in controlled environments.
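
As a rough aid for the version-pinning advice, a requirements list can be checked for entries that lack an exact `==` pin. This is a simplified sketch under stated assumptions: the `unpinned_requirements` helper is hypothetical, and it deliberately ignores environment markers, hashes, and URL-based requirements, so it is not a full parser of pip's requirements format.

```python
import re

# Matches "name==version", optionally with extras, e.g. "requests[socks]==1.2.3".
_PINNED = re.compile(r"^[A-Za-z0-9_.\-]+(\[[^\]]+\])?==[^=\s]+$")


def unpinned_requirements(lines):
    """Return requirement lines that lack an exact '==' version pin."""
    loose = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or line.startswith("-"):  # skip blank lines and pip options
            continue
        if not _PINNED.match(line):
            loose.append(line)
    return loose
```

Running this over the skill's dependency list before installation would surface any package, webdriver-manager included, that can float to a new release without review.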