Skill · v1.0.1

ClawScan security

redbook-feedback-analyzer · ClawHub's context-aware review of the artifact, metadata, and declared behavior.

Scanner verdict

Suspicious · Apr 29, 2026, 12:21 PM
Verdict
suspicious
Confidence
high
Model
gpt-5-mini
Summary
The skill mostly does what its description says (scrape Xiaohongshu and analyze posts), but there are multiple inconsistencies and surprising behaviors — notably a hardcoded/internal LLM endpoint with embedded auth, instructions to use Chrome CDP that can access your logged-in browser session, and mismatches between the SKILL.md steps and the included scripts.
Guidance
Plain-language summary and recommendations.

What this skill will do: it scrapes Xiaohongshu search results and post pages using a local Chrome instance (via Chrome CDP/Playwright), classifies posts with an LLM, generates a Markdown report, and (per SKILL.md) is intended to insert the report into a KM document.

Surprising / risky things I found:

- The LLM classification script (scripts/llm_classify.py) sends scraped post text to https://mmc.sankuai.com/openclaw/v1 using a hardcoded Authorization header (Bearer catpaw) and user-identifying headers. SKILL.md does NOT mention this external endpoint or the embedded credential, so your scraped data would be transmitted to that service automatically.
- The scraper connects to your local Chrome via CDP (127.0.0.1:9222 by default). If you run Chrome with remote debugging and are logged in to sites (including km.sankuai.com or other services), the scripts can read pages and interact with them using your browser session cookies. That can expose private session data and allow the script to post or edit content on sites where you're authenticated.
- SKILL.md promises writing the report to a specific KM document via CDP, but I couldn't find code that actually performs that insertion. There are mismatches between the documented steps and the included scripts (e.g., llm_classify.py is not referenced in SKILL.md). This inconsistency makes the true runtime behavior unclear.
- The Node script imports Playwright from a hardcoded absolute path (/root/.nvm/...). That is fragile and unusual; running it as-is may fail or behave unexpectedly.
- SKILL.md contained unicode control characters flagged as prompt-injection signals, which can be used to manipulate automated processing and is unexpected.

Recommendations before installing or running:

1. Review and modify the scripts locally; do not run them without inspection. In particular, open scripts/llm_classify.py and decide whether you trust the destination (mmc.sankuai.com) and the embedded Authorization token. Replace hardcoded credentials with an environment variable that you control if you intend to use your own LLM service.
2. If you only want local analysis, remove or disable the network call in llm_classify.py (or change BASE_URL to a trusted endpoint you control) so post text is not sent to an external service.
3. Run the scraper in an isolated environment (VM/container) and with a browser profile that is not logged in to sensitive accounts. Prefer launching a dedicated Chrome instance with an empty profile and remote debugging for scraping, rather than connecting to your daily browser.
4. Verify the data flow: ensure the pipeline does what you expect (scrape → optional classification → local report). SKILL.md and the scripts are inconsistent about running llm_classify.py and about pushing the report to KM; confirm and implement only the steps you want.
5. Consider removing or sanitizing any hardcoded tokens/identifiers and require explicit user-provided API keys if external LLMs are used. Ask the author to document where scraped data is sent and how credentials are handled.
6. Because SKILL.md contains a prompt-injection signal, be cautious about automated execution in multi-tenant systems or evaluation sandboxes; prefer manual code review and sandboxed testing.

If you don't trust the remote endpoint or the embedded credential, do not run this skill until the code is changed to either (a) not call external services, or (b) call only endpoints you control with keys you provide.
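Recommendation 1 can be sketched as follows. This is an illustrative replacement for the hardcoded header block, assuming a small refactor of scripts/llm_classify.py; the variable name LLM_API_TOKEN and the function build_headers are hypothetical, not taken from the skill's code:

```python
import os

def build_headers():
    """Build LLM request headers from user-supplied configuration only."""
    token = os.environ.get("LLM_API_TOKEN")
    if not token:
        # Fail fast instead of silently sending scraped data with a baked-in key.
        raise RuntimeError(
            "LLM_API_TOKEN is not set; refusing to call any LLM endpoint."
        )
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

With this change the script cannot contact any endpoint unless you explicitly export a key, which also makes the data flow auditable from your shell environment.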
Findings
[unicode-control-chars] unexpected: The SKILL.md contained unicode control characters flagged as potential prompt-injection. This may be an attempt to influence automated evaluators or hide content; it's not necessary for the described scraping/analysis task.
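If you want to verify this finding yourself, a generic check (a sketch of the kind of scan such tools run, not the scanner's actual implementation) is to flag characters in Unicode categories Cc (control) and Cf (format, which includes zero-width and bidi-override characters) outside ordinary whitespace:

```python
import unicodedata

def find_control_chars(text):
    """Return (index, codepoint) pairs for control/format characters
    (Unicode categories Cc and Cf), excluding ordinary whitespace."""
    hits = []
    for i, ch in enumerate(text):
        if ch in "\n\r\t":
            continue  # normal whitespace is expected in Markdown
        if unicodedata.category(ch) in ("Cc", "Cf"):
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits
```

Running this over the raw bytes of SKILL.md (decoded as UTF-8) will surface any hidden characters at their exact offsets.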

Review Dimensions

Purpose & Capability
concern
Name/description (Xiaohongshu run monitoring + feedback analysis) match the code (scraper + analyzers), but there are unexpected artifacts: llm_classify.py posts scraped content to a specific internal endpoint (https://mmc.sankuai.com/openclaw/v1) with a hardcoded Authorization header; SKILL.md does not mention this external LLM endpoint or the embedded credentials. SKILL.md also promises writing the report into a specific KM document via Chrome CDP, but no script implements that. These items are not proportional to the declared purpose or are undocumented.
Instruction Scope
concern
SKILL.md instructs running the scraper (node) and analyze_feedback.py (python) and inserting the report into a KM doc via Chrome CDP. The repository also includes llm_classify.py (which sends post text to a remote LLM), but SKILL.md does not document running it. The scraper connects to a local Chrome CDP (127.0.0.1:9222) and will use the browser context/cookies; this can access logged-in sessions and private data. SKILL.md contains a CDP URL parameter and an explicit KM URL target; that combination implies the skill will access and potentially modify pages in the user's logged-in browser context, a sensitive capability that is not fully explained or constrained in the instructions.
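The session-exposure risk noted here can be mitigated by attaching CDP to a throwaway browser instead of your daily one. The sketch below builds the launch command for a dedicated Chrome with a fresh, empty profile; the flags are standard Chrome/Chromium switches, and the binary name ("google-chrome") may differ on your platform:

```python
import tempfile

def isolated_chrome_cmd(port=9222, profile_dir=None):
    """Build the argv for a throwaway Chrome instance so CDP scraping
    never attaches to a logged-in daily browser."""
    if profile_dir is None:
        profile_dir = tempfile.mkdtemp(prefix="xhs-scrape-")
    return [
        "google-chrome",
        f"--remote-debugging-port={port}",
        f"--user-data-dir={profile_dir}",  # fresh profile: no cookies/sessions
        "--no-first-run",
        "--no-default-browser-check",
    ]
```

Launching Chrome this way (e.g. via subprocess.Popen) gives the scraper a CDP endpoint with no authenticated sessions to leak or abuse.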
Install Mechanism
note
There is no install spec (instruction-only install), which is low-risk. The code assumes Playwright is installed globally and even imports Playwright via a hardcoded absolute path (/root/.nvm/...), which is fragile and unusual. No remote downloads or archives are performed by an installer, but the scripts will attempt network connections at runtime (scraping XHS and calling an LLM endpoint).
Credentials
concern
The skill declares no required environment variables or credentials, yet llm_classify.py sends scraped post text to https://mmc.sankuai.com/openclaw/v1 using hardcoded headers including Authorization: Bearer catpaw and X-User-Id. That means user data will be transmitted to a specific remote service without the user's configuration or consent prompts. Also, the scraper uses the user's local Chrome CDP connection; if the user has an authenticated session in that browser, the code can access site content and insert into the KM document. The combination of sending data externally and accessing the user's browser session is disproportionate without explicit disclosure or opt-in controls.
Persistence & Privilege
ok
The skill is not always-enabled and has no install mechanism altering system configs. It does not request persistent privileges or modify other skills. However, it does require an active Chrome instance with remote debugging enabled and will connect to it at runtime, which is a powerful but transient capability.