WeChat Official Account Author Article Crawler

Security checks across malware telemetry and agentic risk

Overview

This is a genuine local WeChat article crawler, but it needs Review because it keeps persistent account login state and its network, cleanup, and dependency behavior are under-scoped.

Install only on a trusted local machine, and only if you are comfortable using a WeChat public-platform account with local browser automation. Before running, set your intended output directory in scripts/config.json and confirm the article target and batch size. Do not share .playwright-profile, login_artifacts, cookies, tokens, or QR codes outside a trusted channel. If you need reproducible or hardened execution, review the dependency and install behavior first.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
Findings (12)

Tainted flow: 'article_url' from input (line 376, user input) → httpx.get (network output)

Medium
Category
Data Flow
Content
def fetch_seed_article_info(article_url: str) -> dict[str, Any]:
    response = httpx.get(
        article_url,
        headers={
            "User-Agent": USER_AGENT,
Confidence
97% confidence
Finding
response = httpx.get( article_url, headers={ "User-Agent": USER_AGENT, "Referer": "https://mp.weixin.qq.com/", }, follow_redirects=True,
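One way to narrow this flow is to validate article_url before it reaches httpx.get. The sketch below is not part of the scanned skill: the function name is_allowed_article_url and the ALLOWED_HOSTS set are hypothetical, and the assumption that mp.weixin.qq.com is the only legitimate article host is inferred from the Referer header in the finding.

```python
from urllib.parse import urlsplit

# Assumption: articles are only ever served from this host.
ALLOWED_HOSTS = {"mp.weixin.qq.com"}


def is_allowed_article_url(article_url: str) -> bool:
    """Return True only for https URLs whose host is on the allowlist."""
    try:
        parts = urlsplit(article_url)
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) are rejected.
        return False
    # .hostname is lower-cased and excludes any port or userinfo,
    # so "https://mp.weixin.qq.com@evil.example/" resolves to evil.example.
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

A caller would run this check and refuse to fetch when it returns False. It covers only the initial URL; because the request in the finding sets follow_redirects=True, a redirect could still leave the allowlist unless redirects are checked or disabled as well.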

Vague Triggers

High
Confidence
95% confidence
Finding
The trigger guidance is overly broad and says to prioritize this skill even when the user only vaguely asks to 'get articles' or describes a downstream writing task. That creates a prompt-routing vulnerability where benign analysis or summarization requests may be escalated into shell execution, login flows, scraping, and local writes without sufficiently explicit user intent.

Missing User Warnings

Medium
Confidence
88% confidence
Finding
The skill operationalizes bulk scraping and local storage of third-party content but does not prominently warn about legal, policy, storage, or operational consequences before execution. In context, this is risky because the workflow includes batch downloading, persistent login artifacts, and disk writes, which can expose the user to compliance issues, excessive data retention, or unintended system impact.

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The documentation explicitly says the login QR code can be generated and sent over IM for scanning, but it does not warn that whoever scans that code may authenticate the session or gain access tied to the user's WeChat/public account workflow. In an agent-oriented setup, encouraging transfer of the QR artifact increases the chance of exposing a live authentication token or enabling unintended third-party login.

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The document states that login state, status files, and Playwright profile data are stored locally, but it does not identify them as sensitive session artifacts. Those files may contain reusable cookies, session state, account metadata, or login workflow details that could let another local user or process hijack the authenticated session or learn private account information.
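A minimal sketch of one mitigation for the local-user part of this risk: restricting the session directories to the owning account. The helper restrict_to_owner is hypothetical and not part of the skill, it assumes a POSIX filesystem, and it does nothing against processes running as the same user.

```python
import os
import stat
from pathlib import Path


def restrict_to_owner(root: Path) -> None:
    """Remove group and other access from root and everything beneath it."""
    for dirpath, _dirnames, filenames in os.walk(root):
        # Directories: owner may read, write, and traverse (0o700).
        os.chmod(dirpath, stat.S_IRWXU)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # Skip symlinks so targets outside the tree are not modified.
                continue
            # Files: owner may read and write (0o600).
            os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

It could be applied to .playwright-profile and login_artifacts, the two directories named in the overview; where those directories live relative to the skill is an assumption to confirm before use.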

Unpinned Dependencies

Low
Category
Supply Chain
Content
beautifulsoup4>=4.12.3
httpx>=0.27.0
markdownify>=0.13.1
pillow>=11.1.0
Confidence
94% confidence
Finding
beautifulsoup4>=4.12.3

Unpinned Dependencies

Low
Category
Supply Chain
Content
beautifulsoup4>=4.12.3
httpx>=0.27.0
markdownify>=0.13.1
pillow>=11.1.0
playwright>=1.52.0
Confidence
95% confidence
Finding
httpx>=0.27.0

Unpinned Dependencies

Low
Category
Supply Chain
Content
beautifulsoup4>=4.12.3
httpx>=0.27.0
markdownify>=0.13.1
pillow>=11.1.0
playwright>=1.52.0
Confidence
90% confidence
Finding
markdownify>=0.13.1

Unpinned Dependencies

Low
Category
Supply Chain
Content
beautifulsoup4>=4.12.3
httpx>=0.27.0
markdownify>=0.13.1
pillow>=11.1.0
playwright>=1.52.0
Confidence
97% confidence
Finding
pillow>=11.1.0

Unpinned Dependencies

Low
Category
Supply Chain
Content
httpx>=0.27.0
markdownify>=0.13.1
pillow>=11.1.0
playwright>=1.52.0
Confidence
91% confidence
Finding
playwright>=1.52.0
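The five findings above share one cause: every requirement uses a ">=" lower bound, so each install may resolve to a different version. As an illustration only, the sketch below flags requirement lines that are not pinned with an exact "==" version. The function find_unpinned and its regular expression are hypothetical and handle only simple requirement lines, not the full requirement-specifier grammar.

```python
import re

# A line counts as pinned if it is "name==version", optionally with extras
# and an environment marker. Wildcards such as "==1.2.*" are not pins.
PINNED = re.compile(
    r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[^\s;*,]+\s*(;.*)?$"
)


def find_unpinned(requirements_text: str) -> list[str]:
    """Return the requirement lines that lack an exact version pin."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned
```

Run against the dependency list quoted in the findings, it reports all five entries. Choosing the exact versions to pin is left to whoever maintains the skill; this check only shows where pins are missing.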

Known Vulnerable Dependency: markdownify (1 advisory): CVE-2025-46656 (markdownify allows large headline prefixes such as <h9999999>, which causes memo…)

Low
Category
Supply Chain
Confidence
78% confidence
Finding
markdownify

Known Vulnerable Dependency: pillow (10 advisories): CVE-2016-2533 (Pillow buffer overflow in ImagingPcdDecode); CVE-2023-50447 (Arbitrary Code Execution in Pillow); CVE-2021-27922 (Pillow Uncontrolled Resource Consumption); +7 more

Critical
Category
Supply Chain
Confidence
96% confidence
Finding
pillow

VirusTotal

64/64 vendors flagged this skill as clean.
