Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Data Scraper

v1.0.0

Extract data from websites and APIs for analysis. Use when user needs to collect product prices from e-commerce sites, gather news articles, extract structur...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for dinghaibin/scraper-pro.

Prompt preview: Install & Setup
Install the skill "Data Scraper" (dinghaibin/scraper-pro) from ClawHub.
Skill page: https://clawhub.ai/dinghaibin/scraper-pro
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install scraper-pro

ClawHub CLI


npx clawhub@latest install scraper-pro
Security Scan

VirusTotal: Suspicious (View report →)
OpenClaw: Suspicious (high confidence)
Purpose & Capability
The name and description promise features (CSS/XPath selectors, pagination types including click, and authentication/login support) that the included script does not implement. The SKILL.md documents a --login option and complex YAML pagination examples, but scripts/scrape.py accepts no --login argument, implements no click-based pagination, and has no XPath or robust selector parsing. This mismatch may mislead users about the skill's actual capabilities; a quick way to check is sketched below.
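One way to verify the gap is to list the command-line flags the script actually registers. This is a minimal sketch using only the standard library, and it assumes the script builds its CLI with argparse:

# List the --flags that scripts/scrape.py actually defines.
import pathlib
import re

src = pathlib.Path("scripts/scrape.py").read_text()
flags = sorted(set(re.findall(r"add_argument\(\s*['\"](--[\w-]+)", src)))
print(flags)  # if "--login" is absent here, the docs overstate the CLI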
Instruction Scope
Runtime instructions tell the agent to run the bundled Python script with user-supplied URLs and output paths. The script performs network fetches and writes files; it does not read other system files or environment variables. However, the code disables TLS certificate validation (ssl.CERT_NONE and check_hostname=False) when fetching pages, which weakens transport security and can enable man-in-the-middle (MITM) attacks. The SKILL.md and its Best Practices section recommend checking robots.txt and respecting rate limits but never disclose the TLS bypass, and the documentation references a command-line option (--login) that is absent from the code. The bypass pattern is sketched below.
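For reference, the bypass the scan describes typically looks like the following; this is a sketch of the general pattern, not the skill's exact code:

import ssl

# Insecure: accepts any certificate from any host, enabling MITM attacks.
ctx = ssl.create_default_context()
ctx.check_hostname = False        # skip hostname verification
ctx.verify_mode = ssl.CERT_NONE   # skip certificate validation entirely

# Secure default for comparison: verification stays enabled.
safe_ctx = ssl.create_default_context()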
Install Mechanism
No install spec; instruction-only with a small included Python script. Nothing is downloaded or written by an installer, which limits the installation attack surface.
Credentials
The skill requests no environment variables, credentials, or config paths. That is proportionate for a generic scraper. There is no evidence the code attempts to access other secrets or unrelated system configuration.
Persistence & Privilege
The skill is not always-enabled and does not request persistent system-wide privileges or modify other skills. It only runs when invoked and writes output files specified by the user.
What to consider before installing
This skill contains an executable Python scraper, but the documentation overstates its capabilities (login, click-style pagination, XPath), none of which the script implements; treat those docs as inaccurate. Before using:

  1. Inspect or run the script in a safe sandbox.
  2. Do not pass credentials or sensitive file paths to the tool; it writes files to any path you specify.
  3. Fix or remove the TLS verification bypass in fetch_page (re-enable certificate checks) unless you understand and accept the risk; see the sketch below.
  4. Test scraping on non-sensitive, permitted sites and confirm legal and robots.txt compliance.
  5. If you need authentication, pagination, or XPath support, extend the script yourself or obtain a tool that explicitly implements and documents those features.
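If you choose to patch the script, a corrected fetch might look like this minimal sketch. It assumes the script fetches with urllib; only the function name fetch_page comes from the scan text, everything else is illustrative:

import ssl
import urllib.request

def fetch_page(url: str, timeout: float = 10.0) -> str:
    """Fetch a page with TLS certificate and hostname checks enabled."""
    # ssl.create_default_context() verifies the certificate chain and hostname.
    ctx = ssl.create_default_context()
    with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset)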

Like a lobster shell, security has layers — review code before you run it.

latest: vk97byshetkc1e3ssz321byxwdx85nsx0
32 downloads · 0 stars · 1 version
Updated 11h ago · v1.0.0 · MIT-0

Data Scraper

Extract structured data from websites and APIs.

Quick Start

# Basic page scrape
python scripts/scrape.py --url https://example.com --output data.json

Core Features

  • CSS/XPath selectors: Target specific elements
  • Multiple output formats: JSON, CSV, Markdown
  • Pagination support: Scrape multiple pages
  • Rate limiting: Respect server limits
  • Authentication: Handle login/sessions

Usage

python scripts/scrape.py [OPTIONS]

Options:
  --url TEXT          URL to scrape (required)
  --selector TEXT     CSS selector for data extraction
  --output PATH       Output file path
  --format FORMAT     Output format: json, csv, markdown
  --limit NUM         Maximum items to scrape
  --wait SECS         Wait between requests
  --login URL         Login URL for authenticated scraping

Examples

Product Price Collection

python scripts/scrape.py \
  --url "https://example.com/products" \
  --selector ".product" \
  --output prices.json \
  --format json

News Article Aggregation

python scripts/scrape.py \
  --url "https://news.example.com/latest" \
  --selector "article" \
  --output news.md \
  --format markdown

Configuration File

Create scrape.yaml for complex scraping (a sketch for loading it follows the example):

url: https://example.com/products
selectors:
  items: ".product-card"
  title: ".product-title"
  price: ".price::text"
  image: "img::attr(src)"
  link: "a::attr(href)"

pagination:
  type: click
  button: ".next-page"
  max_pages: 10

output:
  format: json
  file: products.json
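To read this file from your own code, here is a minimal loading sketch, assuming PyYAML is installed:

import yaml  # pip install pyyaml

with open("scrape.yaml") as f:
    config = yaml.safe_load(f)

print(config["selectors"]["items"])       # ".product-card"
print(config["pagination"]["max_pages"])  # 10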

Best Practices

  1. Check robots.txt before scraping (see the sketch after this list)
  2. Add delays between requests
  3. Cache responses for development
  4. Handle errors gracefully
  5. Store raw HTML for debugging
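
A minimal sketch of practices 1 and 2 using only the standard library; the URLs are placeholders:

import time
import urllib.robotparser

# Consult robots.txt once, then pace requests against the same host.
rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

for page in range(1, 4):
    url = f"https://example.com/products?page={page}"
    if not rp.can_fetch("*", url):
        print("Disallowed by robots.txt:", url)
        continue
    # ... fetch and parse url here ...
    time.sleep(2)  # delay between requests to respect server limits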

Legal Note

Ensure you have permission to scrape target websites. Check Terms of Service and robots.txt.
