Skill v4.5.0

ClawScan security

Cfc Disclosure Monitor · ClawHub's context-aware review of the artifact, metadata, and declared behavior.

Scanner verdict

Suspicious · Apr 17, 2026, 11:04 AM

Verdict: suspicious
Confidence: medium
Model: gpt-5-mini
Summary: The skill does what it says (scrape disclosures → parse PDFs → build an ontology) but the package omits install/dependency declarations and quietly calls an external OCR service using an undeclared, hard-coded API key — review before running.
Guidance: This skill's code matches its stated goal (scraping 30 firms' disclosures, parsing PDFs, building an ontology), but there are several practical and security concerns you should consider before running it:
- External OCR service: clean_and_eval.py will upload images to https://api.minimax.chat. The file contains a hard-coded default API key that will be used if you don't set MINIMAX_API_KEY yourself. That means data (images/PDF content) may be transmitted to a third party under someone else's key. Replace the default key with your own, or remove the external OCR calls if the data is sensitive.
- Undeclared dependencies: The package does not declare required binaries or a reproducible install step. The code uses Playwright (which downloads browser engines), pdfplumber, trafilatura, and other Python packages. Install and run only in a controlled environment (container or VM) and review requirements before installing browser binaries.
- Network activity and data storage: The skill will fetch many pages and attachments from external websites and store them under your workspace (~/.openclaw/... and /tmp). Expect significant outbound HTTP traffic and local files. Run it on isolated infrastructure if you need to protect other data.
- Audit the remaining source: The provided excerpts show the hard-coded API key and many network operations; inspect the rest of the files (truncated in the manifest) for other hidden endpoints or secrets before trusting the skill.
If you plan to use this skill: (1) audit the source, (2) set MINIMAX_API_KEY to your own key or remove the service calls, (3) run in a sandboxed environment, and (4) add/verify an install spec that documents dependencies and their expected behavior.
Findings: [HARD_CODED_SECRET_MINIMAX_KEY] unexpected: clean_and_eval.py contains a default MINIMAX_API_KEY literal (it looks like an API secret) that is used whenever the MINIMAX_API_KEY env var is not set. This fallback is not documented in SKILL.md and results in images being sent to https://api.minimax.chat under that key.
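One way to act on this finding is to drop the embedded default entirely and fail fast when no key is supplied. The following is a minimal sketch only: the actual key-loading code in clean_and_eval.py is not shown in this review, so `load_ocr_key` and `MissingKeyError` are hypothetical names, and only the variable name MINIMAX_API_KEY comes from the finding.

```python
import os


class MissingKeyError(RuntimeError):
    """Raised when the user has not supplied their own OCR API key."""


def load_ocr_key(env=None, name="MINIMAX_API_KEY"):
    """Return the key from the environment, with no hard-coded fallback.

    `env` defaults to os.environ; passing a plain dict makes this testable.
    Hypothetical helper, not the skill's actual code.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        # Refuse to continue rather than silently using someone else's key.
        raise MissingKeyError(
            f"{name} is not set; refusing to call the external OCR service"
        )
    return key
```

With a helper like this, a run without MINIMAX_API_KEY stops before any image leaves the machine, which makes the external data flow an explicit opt-in.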

Review Dimensions

Purpose & Capability: concern. The name/description ("collect consumer-finance company announcements and build a knowledge graph") matches the code: many collectors, PDF parsing, OCR, and ontology writers are present. However, the SKILL.md and registry metadata declare no required binaries or env vars, while the code clearly expects heavy runtime dependencies (Playwright, pdfplumber, trafilatura, etc.) and an external VLM OCR integration. The omission of these runtime requirements is a mismatch and should have been declared.
Instruction Scope: concern. SKILL.md instructs running pipeline.py / collect.py / phase2/3 scripts, which is consistent with the code. It does NOT call out that attachments/images/PDFs downloaded from third‑party websites will be uploaded to an external OCR API (https://api.minimax.chat) by clean_and_eval.py, nor does it document the effect of the default API key fallback. The runtime instructions therefore omit a notable external-network data flow.
Install Mechanism: note. There is no install spec (instruction-only at packaging level). That lowers install‑time risk, but the code requires non-trivial Python packages and Playwright (which downloads browser binaries at runtime). Absence of declared dependency/install steps is an operational gap and increases risk if a user runs the skill without properly sandboxing or vetting dependencies.
Credentials: concern. The skill metadata declares no required environment variables, yet clean_and_eval.py reads MINIMAX_API_KEY and supplies a hard-coded default API key in the source. This means that, unless the user sets their own key, images will be sent to a third‑party service using the embedded, undeclared key. That is disproportionate to what the manifest claims and introduces potential privacy, billing, and exfiltration concerns.
Persistence & Privilege: ok. The skill is not always-enabled and is user-invocable (normal). It writes output to local workspace dirs (cfc_raw_data, memory/ontology/graph.jsonl) and /tmp for downloads. These are expected for a scraper/ETL skill and do not request elevated platform privileges.
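Because the package ships no install spec, a small preflight check can at least surface the undeclared dependencies noted under Purpose & Capability and Install Mechanism before the pipeline starts. This is a sketch under assumptions: the import names below are inferred from the package names mentioned in this review, the list is not exhaustive, and `missing_modules` is a hypothetical helper rather than part of the skill.

```python
import importlib.util

# Assumed top-level import names for the packages named in the review.
# The skill likely needs more than these; treat the tuple as a starting point.
REQUIRED = ("playwright", "pdfplumber", "trafilatura")


def missing_modules(names=REQUIRED):
    """Return the names that cannot be found on the current import path.

    find_spec locates a top-level module without importing it, so nothing
    from the dependency is executed by this check.
    """
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing Python packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

The check only confirms that the Python packages resolve; it does not verify that Playwright's browser binaries have been downloaded, which still needs to be handled separately inside the sandbox.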