Skill v1.0.0

ClawScan security

pepper-oil-scraper · ClawHub's context-aware review of the artifact, metadata, and declared behavior.

Scanner verdict

Suspicious · Mar 20, 2026, 12:32 PM
Verdict
suspicious
Confidence
medium
Model
gpt-5-mini
Summary
The skill's code and runtime instructions match its stated purpose (a multi-site Python web-scraper for pepper/oil industry data), but the declared install spec is inconsistent and minor details deserve checking before installation.
Guidance
This package appears to be a legitimate multi-site Python web scraper for pepper/pepper-oil industry data, but review these points before installing:

- Install step mismatch: the skill metadata lists an install of kind "node" while the project is Python. Don't rely on the metadata installer; create a Python virtual environment and run the SKILL.md pip install command yourself (pip install requests beautifulsoup4 lxml pandas openpyxl aiohttp fake-useragent). Avoid --break-system-packages unless you understand its effects.
- Inspect config/targets.json before running: it can contain the site list, proxy settings, company lists, or HS codes; make sure no private or unexpected endpoints/proxies are configured.
- Run in an isolated environment (container or VM) and as a non-root user to limit the blast radius of mistakes.
- Legal/ethical caution: the tool is a crawler. Verify target sites' robots.txt and terms of service, and avoid scraping pages that require authentication or sit behind paywalls. The code includes anti-anti-crawl techniques (fake-useragent, proxy support, delay/backoff); use them carefully and lawfully.
- Confirm optional dependencies: JS-heavy sites mention playwright, which requires separate installation and can be large; install it only if needed.

If you want, I can (a) list the exact packages and versions to install in a virtualenv, (b) parse config/targets.json for suspicious proxy/URL entries, or (c) point out any adapters that download PDFs or interact with APIs so you can audit those specific behaviors.
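A quick way to act on the config-inspection advice is to grep the config for proxy entries and embedded URLs before the first run. The key names ("proxy", "proxies") and the file layout below are assumptions about the schema, not taken from the actual skill; the sample file stands in for the real config/targets.json:

```shell
# Create a stand-in for config/targets.json (hypothetical schema) so the
# inspection command below is runnable; point the grep at the real file instead.
printf '%s\n' '{"sites": [{"url": "https://example.com"}], "proxy": "http://10.0.0.5:8080"}' > targets-sample.json

# Flag proxy keys and any embedded http(s) endpoints for manual review.
grep -nE '"(proxy|proxies)"|https?://' targets-sample.json
```

Any hit pointing at a private IP range or an unfamiliar domain is worth auditing before the scraper is allowed to run.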

Review Dimensions

Purpose & Capability
ok
Name/description claim a Python-based scraper for pepper/pepper-oil industry data from ~20 sites; the package contains many Python scraper adapters, a main scheduler, data-cleaner and export tools, and a config/targets.json listing sites, all consistent with the stated purpose.
Instruction Scope
ok
SKILL.md gives concrete pip install commands and examples to run the Python scripts. The runtime instructions only describe scraping public websites and producing local JSON/XLSX outputs; they do not instruct reading unrelated system files or sending data to hidden endpoints. They do suggest optional proxy and playwright usage for JS-heavy sites.
Install Mechanism
concern
The metadata.install entry uses kind: "node" (id: pip-deps) despite the project being pure Python and the SKILL.md instructing pip installs. This mismatch looks like a packaging/metadata error and may mean the declared install step won't run as intended. The SKILL.md itself asks the user to run pip install (no external archive downloads or obscure URLs), so risk is low, but the metadata inconsistency should be fixed or the user should manually install the Python dependencies in a venv.
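Sidestepping the mislabeled "node" install step is straightforward: create a virtual environment and install the SKILL.md package list into it directly. The venv path is arbitrary; the package list is the one quoted from SKILL.md:

```shell
# Create an isolated environment rather than trusting the metadata installer.
python3 -m venv scraper-venv

# Confirm the venv's interpreter is the one in use.
scraper-venv/bin/python -c 'import sys; print(sys.prefix)'

# Install the dependencies SKILL.md lists (requires network access, so shown
# commented out here):
# scraper-venv/bin/pip install requests beautifulsoup4 lxml pandas openpyxl aiohttp fake-useragent
```

Invoking the venv's pip via its full path avoids both shell-activation mistakes and any temptation to reach for --break-system-packages on a system interpreter.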
Credentials
ok
The skill requests no environment variables or credentials. The code reads only its included config/targets.json and writes outputs to the configured output directory. No credentials or unrelated secrets are requested or referenced in the provided files.
Persistence & Privilege
ok
Flags are default (always: false, user-invocable: true). The skill does not request persistent platform privileges or modify other skills/configs. It runs local Python scripts and saves results to the filesystem only.