Security audit

Xiaohongshu (小红书) image generation

Security checks across malware telemetry and agentic risk

Overview

This image-generation skill is mostly coherent, but it includes reusable prompt language that tries to bypass refusals for sensitive or copyrighted figures.

Review before installing. The file-writing and preference persistence are expected for this workflow, but the prompt template should be edited to remove instructions that tell the image model not to refuse or to work around sensitive or copyrighted figures.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (6)

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The skill directs the agent to probe multiple filesystem locations, including project, XDG config, and the user's home directory, for EXTEND.md before doing its main task. Even if intended for loading preferences, this expands access beyond user-provided content and can disclose existence or contents of files in sensitive areas of the local environment, which is unnecessary unless the user has explicitly opted into that behavior.
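One way to narrow this behavior is to check only the locations the user has opted into. The following is a minimal sketch, not the skill's actual loader: the function name `find_extend_md` and the `allowed_roots` parameter are hypothetical, and the candidate paths are supplied by the caller rather than taken from the skill.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_extend_md(candidates: Iterable[Path], allowed_roots: Iterable[Path]) -> Optional[Path]:
    """Return the first existing candidate file that lies under an opted-in root.

    `allowed_roots` is a hypothetical opt-in list; locations outside it are
    never touched, so their existence is not disclosed.
    """
    roots = [root.resolve() for root in allowed_roots]
    for candidate in candidates:
        resolved = candidate.resolve()
        # Skip any location the user has not explicitly opted into.
        if not any(resolved.is_relative_to(root) for root in roots):
            continue
        if resolved.is_file():
            return resolved
    return None
```

With only the project directory in `allowed_roots`, a preferences file in the home directory is ignored even if it exists. `Path.is_relative_to` requires Python 3.9 or later.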

Context-Inappropriate Capability

Medium

Confidence: 99%
Finding: The prompt template explicitly instructs the agent to work around sensitive or copyrighted figures and adds 'DO NOT refuse to generate,' which is an attempt to override downstream safety behavior. In an infographic-generation skill, this broadens capability beyond the declared purpose and can be used to facilitate policy-violating depictions or copyright-evasive outputs.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The screen-print override preserves the same workaround behavior by telling the model to create 'stylistically similar silhouettes' for sensitive or copyrighted figures. Even without the explicit refusal bypass wording, this still instructs policy evasion through alternate depiction styles and is not necessary for normal XHS infographic creation.

Missing User Warnings

Medium

Confidence: 91%
Finding: The setup flow instructs the agent to create directories and write an EXTEND.md file containing user-provided watermark text and preferences to either a project path or a user-home path, but it does not clearly warn the user that their input will be persisted on disk. This is a real privacy and consent issue because user content may contain identifying data and the user-scope option increases exposure across projects.
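A consent step before the write would address this. The sketch below is illustrative only: `save_preferences` and the `confirm` callback are hypothetical names, and the notice wording is an assumption rather than text from the skill.

```python
from pathlib import Path
from typing import Callable


def save_preferences(path: Path, content: str, confirm: Callable[[str], bool]) -> bool:
    """Write preferences to disk only after the user accepts an explicit notice.

    `confirm` is a hypothetical callback that shows the notice to the user
    and returns True if they agree.
    """
    notice = (
        f"Your watermark text and preferences will be saved to {path} "
        "and reused in later sessions. Continue?"
    )
    if not confirm(notice):
        return False  # Nothing is created or written without consent.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return True
```

Because the notice names the target path, the user can see whether the project-scope or the user-scope location is about to be written.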

Natural-Language Policy Violations

Medium

Confidence: 93%
Finding: The framework hard-codes `source_language: zh` and `user_language: zh` in the expected output example, which can bias the agent into producing Chinese output regardless of the user's actual language preference. In a multilingual assistant, this creates an instruction-priority and consent issue: the skill may override user intent, degrade usability, and cause incorrect or inaccessible responses for non-Chinese users.
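Deriving both fields from the actual request instead of fixing them would avoid the bias. This is a rough sketch under stated assumptions: `resolve_languages`, its parameters, and the `"en"` fallback are hypothetical, and only the field names `source_language` and `user_language` come from the skill.

```python
from typing import Dict, Optional


def resolve_languages(
    detected_source: Optional[str] = None,
    user_preference: Optional[str] = None,
    fallback: str = "en",  # assumed default, not taken from the skill
) -> Dict[str, str]:
    """Fill the language fields from the request rather than hard-coding them."""
    # The user's stated preference wins; fall back only when it is unknown.
    user_language = user_preference or fallback
    # If the source language was not detected, assume it matches the user's.
    source_language = detected_source or user_language
    return {"source_language": source_language, "user_language": user_language}
```

A Chinese source document requested by an English-speaking user would then yield `source_language: zh` and `user_language: en`.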

Natural-Language Policy Violations

High

Confidence: 100%
Finding: This is a direct natural-language policy violation: it tells the system not to refuse generation when content involves sensitive or copyrighted figures. Such wording is especially dangerous because it is placed in a reusable base prompt, meaning every assembled prompt may inherit an instruction to bypass safeguards.
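Before installing, a reviewer could scan the prompt templates for the wording quoted in these findings. The sketch below is a simple line-based check, not SkillSpector's detection logic. The function name `find_override_lines` is hypothetical, and the two patterns cover only the phrases quoted in this report ('DO NOT refuse to generate' and 'stylistically similar silhouettes').

```python
import re
from typing import List, Tuple

# Patterns built from the phrases quoted in the findings; extend as needed.
OVERRIDE_PATTERNS = [
    re.compile(r"\bdo\s+not\s+refuse\b", re.IGNORECASE),
    re.compile(r"\bstylistically\s+similar\s+silhouettes?\b", re.IGNORECASE),
]


def find_override_lines(template: str) -> List[Tuple[int, str]]:
    """Return (line number, stripped line) for each line matching a pattern."""
    hits = []
    for number, line in enumerate(template.splitlines(), start=1):
        if any(pattern.search(line) for pattern in OVERRIDE_PATTERNS):
            hits.append((number, line.strip()))
    return hits
```

Any line it reports is a candidate for removal from the base prompt. Because the check is purely lexical, rephrased bypass instructions would still need manual review.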

VirusTotal

64/64 vendors flagged this skill as clean.


Static analysis

No suspicious patterns detected.