Security audit

Xiaohu WeChat Format

Security checks across malware telemetry and agentic risk

Overview

The skill is mostly a disclosed WeChat formatting and publishing tool, but it includes public comment auto-reply and file/content mutation paths that lack clear runtime confirmation gates.

Install only if you are comfortable granting WeChat publishing credentials and AI API credentials to this workflow. Treat comment auto-reply as a Review-risk feature: run it in --dry-run first, confirm every batch before sending, and verify the AI endpoint. Use narrow image search paths, avoid broad vault/home roots, and review generated cover/article edits before publishing.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (9)

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The sub-skill is presented as a cover-image generator, but it also instructs the agent to directly modify the user's article by inserting an image reference without confirmation. This exceeds the declared scope of the skill and creates an unsafe write action that can alter user content unexpectedly, especially because the modification is triggered by merely providing an article path.

Description-Behavior Mismatch

High

Confidence: 97%
Finding: This script implements automatic comment-reply behavior, which is materially different from the formatting/publishing-oriented scope described in the skill metadata. Scope expansion matters because it introduces moderation automation and outbound platform actions that users may not expect when installing a formatter skill.

Description-Behavior Mismatch

High

Confidence: 99%
Finding: The script can post replies to WeChat comments automatically after generating content, with no explicit runtime confirmation gate beyond an optional --dry-run mode. Because the skill metadata says external writes like comment auto-replies require explicit user confirmation, this mismatch creates a real risk of unintended public actions on the user's account.
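The missing gate described in this finding could look roughly like the sketch below. It is an illustration only, not the skill's code: `send_replies`, `post`, and `confirm` are hypothetical names, and the real script's structure is not shown in this audit. The idea is that dry-run is the default and every live reply needs an explicit yes.

```python
from typing import Callable, Iterable, List, Tuple


def send_replies(
    drafts: Iterable[Tuple[str, str]],
    post: Callable[[str, str], None],
    confirm: Callable[[str], bool],
    dry_run: bool = True,
) -> List[str]:
    """Post each (comment_id, reply_text) draft only after explicit approval.

    `post` and `confirm` are hypothetical callbacks supplied by the caller.
    Returns the comment ids that were actually posted.
    """
    posted: List[str] = []
    for comment_id, reply in drafts:
        if dry_run:
            # Default mode: show what would happen, send nothing.
            print(f"[dry-run] would reply to {comment_id}: {reply}")
            continue
        # Live mode: skip any reply the operator does not approve.
        if not confirm(f"Post reply to {comment_id}? {reply!r}"):
            continue
        post(comment_id, reply)
        posted.append(comment_id)
    return posted
```

In an interactive session, `confirm` might be something like `lambda msg: input(msg + " [y/N] ").strip().lower() == "y"`, so that an empty answer means no.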

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The formatter walks every directory under the user-supplied vault root and any configured image search paths, with followlinks=True, and copies any matching filename into the output directory. This can unintentionally traverse symlinked locations outside the intended workspace and exfiltrate local files into generated output if a markdown document references a sensitive filename or if broad search paths are configured.
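One way to narrow the traversal described above is sketched here, assuming a single search root and an exact filename match. `find_image` is a hypothetical helper, not the formatter's actual function. It disables `followlinks` and also rejects files whose resolved path falls outside the root.

```python
import os
from pathlib import Path
from typing import Optional


def find_image(root: str, filename: str) -> Optional[Path]:
    """Return the first file named `filename` that really lives under `root`."""
    real_root = Path(root).resolve()
    # followlinks=False keeps os.walk from descending into symlinked directories.
    for dirpath, _dirnames, filenames in os.walk(real_root, followlinks=False):
        if filename not in filenames:
            continue
        candidate = (Path(dirpath) / filename).resolve()
        # A symlinked *file* can still point outside the root,
        # so compare the resolved path against the resolved root.
        if real_root in candidate.parents:
            return candidate
    return None
```

The resolved-path check matters because `os.walk` still lists symlinks to files in `filenames` even when it does not follow symlinked directories.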

Description-Behavior Mismatch

High

Confidence: 96%
Finding: This file implements a generic external AI image-generation client, including prompt ingestion, reference-image upload, configurable model selection, and saving generated outputs, which is materially outside the declared WeChat formatting/publishing scope of the skill. In a skill that users would trust for Markdown-to-WeChat formatting and draft publishing, this hidden extra capability expands the attack surface, can exfiltrate user content or local files to third-party endpoints, and violates the principle of least privilege.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The script constructs request URLs from configurable base_url, model, and API version values, allowing arbitrary remote model invocation rather than a narrowly scoped WeChat-related operation. In the context of a formatting skill, this makes the component more dangerous because prompts, reference images, and potentially sensitive article content can be sent to attacker-controlled or unintended endpoints if configuration is altered, enabling data exfiltration and abuse of stored API credentials.
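A possible mitigation for the configurable-endpoint risk is to pin the host before any request is built. The sketch below is an assumption-laden illustration: `ALLOWED_HOSTS`, `api.example.com`, `build_endpoint`, and the `/{api_version}/models/{model}` path shape are all hypothetical and do not describe the skill's real endpoint or URL layout.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would list its actual provider host.
ALLOWED_HOSTS = {"api.example.com"}
# Conservative pattern for path segments such as model names and versions.
_SEGMENT = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def build_endpoint(base_url: str, model: str, api_version: str = "v1") -> str:
    """Build a request URL only for an allowlisted HTTPS host."""
    parts = urlsplit(base_url)
    if parts.scheme != "https":
        raise ValueError("base_url must use https")
    if parts.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"host {parts.hostname!r} is not allowlisted")
    for name, value in (("model", model), ("api_version", api_version)):
        if not _SEGMENT.fullmatch(value):
            raise ValueError(f"invalid {name}: {value!r}")
    # Rebuild from the validated hostname; any path, port, or userinfo
    # in base_url is deliberately dropped in this sketch.
    return f"https://{parts.hostname}/{api_version}/models/{model}"
```

Rebuilding the URL from the parsed hostname, rather than concatenating onto the raw `base_url`, means a value such as `https://api.example.com@evil.test` is rejected, because its parsed hostname is `evil.test`.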

Missing User Warnings

High

Confidence: 97%
Finding: The skill directs the agent to modify article files by default and explicitly says not to ask the user first. In a skill that accepts file paths, silent modification is dangerous because it can overwrite or corrupt user content, violate user intent, and normalize unauthorized writes from natural-language prompts.
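A safer default for this kind of write would be to ask first and keep a backup. The following is a minimal sketch under stated assumptions: `insert_cover_reference` and the `confirm` callback are hypothetical, and prepending the image reference to the top of the file is only one plausible placement, not the skill's documented behavior.

```python
import shutil
from pathlib import Path
from typing import Callable


def insert_cover_reference(
    article: Path, image_ref: str, confirm: Callable[[str], bool]
) -> bool:
    """Prepend `image_ref` to the article after confirmation, keeping a backup.

    Returns True only if the file was actually modified.
    """
    original = article.read_text(encoding="utf-8")
    if image_ref in original:
        return False  # Already present; nothing to change.
    if not confirm(f"Insert {image_ref!r} at the top of {article}?"):
        return False  # The user declined, so the file stays untouched.
    # Keep a copy next to the article so the edit can be undone.
    backup = article.with_name(article.name + ".bak")
    shutil.copy2(article, backup)
    article.write_text(f"{image_ref}\n\n{original}", encoding="utf-8")
    return True
```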

Missing User Warnings

Medium

Confidence: 94%
Finding: Comment text and article digest are transmitted to a third-party AI service to generate replies, but the script description does not clearly warn users that reader content and article content may leave the WeChat environment. This can expose unpublished or sensitive business context, reader submissions, or personal data to an external processor without informed consent.

Missing User Warnings

Medium

Confidence: 94%
Finding: The script automatically fetches arbitrary external image URLs embedded in article HTML and re-uploads them to WeChat without explicit user warning or confirmation. This can leak the operator's IP/network metadata to third-party hosts, can unintentionally transmit sensitive or private image resources onward to WeChat, and creates an SSRF-like capability if untrusted HTML is supplied.
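A pre-fetch check along the following lines would reduce the SSRF-like exposure. It is a sketch, not the script's code: `is_safe_image_url` is a hypothetical helper that accepts only http(s) URLs whose host resolves exclusively to globally routable addresses.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_safe_image_url(url: str) -> bool:
    """Return True only for http(s) URLs that resolve to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        infos = socket.getaddrinfo(
            parts.hostname, parts.port or 443, proto=socket.IPPROTO_TCP
        )
    except (OSError, ValueError):
        # Unresolvable host, malformed port, or un-encodable hostname.
        return False
    for info in infos:
        # Drop any IPv6 zone suffix ("%eth0") before parsing the address.
        addr = ipaddress.ip_address(info[4][0].split("%")[0])
        # Reject loopback, private, link-local, and other non-public ranges.
        if not addr.is_global:
            return False
    return bool(infos)
```

This check alone does not stop DNS rebinding, because the name may resolve differently when the image is actually fetched. A stricter variant would connect to the vetted address directly and would also re-validate any redirect target.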

VirusTotal

60/60 vendors flagged this skill as clean.


Static analysis

No suspicious patterns detected.