WeChat Official Account Author Article Crawler

Security checks across malware telemetry and agentic risk

Overview

This is a genuine local WeChat article crawler, but it needs review because it relies on persistent account login state and leaves its network, cleanup, and dependency behavior under-scoped.

Install only on a trusted local machine, and only if you are comfortable using a WeChat public-platform account with local browser automation. Before running, point scripts/config.json at your intended output directory and confirm the article target and batch size. Do not share .playwright-profile, login_artifacts, cookies, tokens, or QR codes outside a trusted channel. If you need reproducible or hardened execution, review the dependency and install behavior first.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain

Findings (12)

Tainted flow: 'article_url' from input (line 376, user input) → httpx.get (network output)

Medium

Category: Data Flow
Content: def fetch_seed_article_info(article_url: str) -> dict[str, Any]: response = httpx.get( article_url, headers={ "User-Agent": USER_AGENT,
Confidence: 97%
Finding: response = httpx.get( article_url, headers={ "User-Agent": USER_AGENT, "Referer": "https://mp.weixin.qq.com/", }, follow_redirects=True,
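One way to narrow this flow is to check the URL before it reaches the network call. The following is a minimal sketch, not the skill's actual code: `ALLOWED_HOSTS`, `is_allowed_article_url`, and `require_allowed_article_url` are hypothetical names, and the assumption that only `mp.weixin.qq.com` needs to be reachable is taken from the `Referer` header in the finding rather than confirmed by the source.

```python
from urllib.parse import urlparse

# Assumption: the crawler only needs to reach article pages on this host.
ALLOWED_HOSTS = {"mp.weixin.qq.com"}


def is_allowed_article_url(article_url: str) -> bool:
    """Return True only for https URLs whose host is in ALLOWED_HOSTS."""
    parsed = urlparse(article_url)
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_HOSTS


def require_allowed_article_url(article_url: str) -> str:
    """Raise ValueError instead of fetching when the URL is out of scope."""
    if not is_allowed_article_url(article_url):
        raise ValueError(f"refusing to fetch out-of-scope URL: {article_url!r}")
    return article_url
```

A caller would pass `require_allowed_article_url(article_url)` to the fetch function. Because the flagged request sets `follow_redirects=True`, this check covers only the initial URL; a redirect could still leave the allowed host unless redirects are disabled or each hop is validated the same way.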

Vague Triggers

High

Confidence: 95%
Finding: The trigger guidance is overly broad and says to prioritize this skill even when the user only vaguely asks to 'get articles' or describes a downstream writing task. That creates a prompt-routing vulnerability where benign analysis or summarization requests may be escalated into shell execution, login flows, scraping, and local writes without sufficiently explicit user intent.

Missing User Warnings

Medium

Confidence: 88%
Finding: The skill operationalizes bulk scraping and local storage of third-party content but does not prominently warn about legal, policy, storage, or operational consequences before execution. In context, this is risky because the workflow includes batch downloading, persistent login artifacts, and disk writes, which can expose the user to compliance issues, excessive data retention, or unintended system impact.

Missing User Warnings

Medium

Confidence: 91%
Finding: The documentation explicitly says the login QR code can be generated and sent over IM for scanning, but it does not warn that whoever scans that code may authenticate the session or gain access tied to the user's WeChat/public account workflow. In an agent-oriented setup, encouraging transfer of the QR artifact increases the chance of exposing a live authentication token or enabling unintended third-party login.

Missing User Warnings

Medium

Confidence: 94%
Finding: The document states that login state, status files, and Playwright profile data are stored locally, but it does not identify them as sensitive session artifacts. Those files may contain reusable cookies, session state, account metadata, or login workflow details that could let another local user or process hijack the authenticated session or learn private account information.
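Until the skill labels these files itself, a user can at least limit who on the machine can read them. The sketch below assumes a POSIX system and uses a hypothetical helper, `restrict_to_owner`, to strip group and other permissions from a directory such as `.playwright-profile`; it does not protect against processes running as the same user.

```python
import os
import stat
from pathlib import Path


def restrict_to_owner(root: Path) -> list[Path]:
    """Remove group/other permission bits from root and everything beneath it.

    Returns the paths whose mode was changed. POSIX only: on Windows,
    os.chmod cannot express these bits, so this would not be sufficient.
    """
    if not root.exists():
        return []
    changed = []
    paths = [root, *root.rglob("*")] if root.is_dir() else [root]
    for path in paths:
        if path.is_symlink():
            continue  # do not follow links out of the profile directory
        mode = stat.S_IMODE(path.stat().st_mode)
        tightened = mode & ~(stat.S_IRWXG | stat.S_IRWXO)
        if tightened != mode:
            os.chmod(path, tightened)
            changed.append(path)
    return changed
```

For example, `restrict_to_owner(Path(".playwright-profile"))` would leave a file that was `0o644` at `0o600`. The same call could be applied to the login_artifacts directory after each login.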

Unpinned Dependencies

Low

Category: Supply Chain
Content: beautifulsoup4>=4.12.3 httpx>=0.27.0 markdownify>=0.13.1 pillow>=11.1.0
Confidence: 94%
Finding: beautifulsoup4>=4.12.3

Unpinned Dependencies

Low

Category: Supply Chain
Content: beautifulsoup4>=4.12.3 httpx>=0.27.0 markdownify>=0.13.1 pillow>=11.1.0 playwright>=1.52.0
Confidence: 95%
Finding: httpx>=0.27.0

Unpinned Dependencies

Low

Category: Supply Chain
Content: beautifulsoup4>=4.12.3 httpx>=0.27.0 markdownify>=0.13.1 pillow>=11.1.0 playwright>=1.52.0
Confidence: 90%
Finding: markdownify>=0.13.1

Unpinned Dependencies

Low

Category: Supply Chain
Content: beautifulsoup4>=4.12.3 httpx>=0.27.0 markdownify>=0.13.1 pillow>=11.1.0 playwright>=1.52.0
Confidence: 97%
Finding: pillow>=11.1.0

Unpinned Dependencies

Low

Category: Supply Chain
Content: httpx>=0.27.0 markdownify>=0.13.1 pillow>=11.1.0 playwright>=1.52.0
Confidence: 91%
Finding: playwright>=1.52.0
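The five findings above share one cause: every requirement uses `>=`, so each install may resolve to a different version. The sketch below shows one way to detect that in a requirements list. It assumes exact `==` pins are the goal; `PINNED` and `unpinned_requirements` are hypothetical names, and the regular expression is a deliberate simplification that treats lines with environment markers or URL requirements as unpinned.

```python
import re

# Matches "name==version" with an optional extras block, e.g. "httpx[http2]==0.27.0".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+$")


def unpinned_requirements(lines: list[str]) -> list[str]:
    """Return requirement lines that are not pinned with '=='."""
    result = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # skip blanks, comments, and pip options such as -r
        if not PINNED.match(line):
            result.append(line)
    return result
```

Run against the five requirement lines quoted in the findings, this returns all five, since each uses `>=`. Pinning with `==`, ideally with hashes generated by a lock tool, would make the install reproducible.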

Known Vulnerable Dependency: markdownify — 1 advisory: CVE-2025-46656 (markdownify allows large headline prefixes such as <h9999999>, which causes memo)

Low

Category: Supply Chain
Confidence: 78%
Finding: markdownify

Known Vulnerable Dependency: pillow — 10 advisories: CVE-2016-2533 (Pillow buffer overflow in ImagingPcdDecode); CVE-2023-50447 (Arbitrary Code Execution in Pillow); CVE-2021-27922 (Pillow Uncontrolled Resource Consumption) +7 more

Critical

Category: Supply Chain
Confidence: 96%
Finding: pillow

VirusTotal

64/64 vendors flagged this skill as clean.
