Wanxiang Scroll

Security checks across malware telemetry and agentic risk

Overview

This is a Chinese creative-writing and interactive-story skill that ships with local text-processing and novel-download helper scripts. I found no hidden credential use, exfiltration, or destructive behavior beyond the disclosed file writes.

Install this if you want a Chinese fiction-writing, roleplay, and style-reference pack. Before running scripts, review the command and output path: the crawlers contact third-party novel APIs and save text files, and the cleaner overwrites the source file unless you provide a separate output file. Do not treat the immersive persona or 'permanent' style rules as overriding your own instructions, platform policy, or normal safety boundaries.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (65)

Intent-Code Divergence

Medium

Confidence: 94%
Finding: The file explicitly introduces '越权' ("overstepping authority") behaviors that let the system override user flow, rewind state, and punish the user based on its own preferences. Even if framed as fictional flavor text, this normalizes unauthorized system interference and could be translated into agent behavior that disregards user intent, autonomy, or safety boundaries.

Natural-Language Policy Violations

Medium

Confidence: 90%
Finding: The skill hard-codes a permanent Chinese-only output format and explicitly says all output must follow a fixed structure, with no user language choice or opt-in. This is dangerous because it can override user preferences or platform language expectations, causing instruction-priority conflicts, degraded usability, and potential failure in multilingual or accessibility-sensitive contexts.

Vague Triggers

Medium

Confidence: 90%
Finding: The skill explicitly allows broad natural-language control such as '帮我打开菜单' ("open the menu for me") and '这个NPC叫什么名字' ("what is this NPC's name"), and states the system will automatically infer operational intent via fuzzy matching. This creates ambiguous activation boundaries where ordinary conversational text can be interpreted as commands, increasing the risk of unintended state changes, prompt-triggering, or command execution from user or embedded content.

Vague Triggers

Medium

Confidence: 92%
Finding: The skill declares broad semantic parsing, fuzzy matching, and intent inference without defining strict activation boundaries or exclusions. In an interactive agent, this can cause ordinary narrative text to be misinterpreted as commands, leading to unintended state changes, tool invocation, or prompt-routing behavior, especially because the commands are short and natural-language adjacent.
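
One way to narrow this kind of trigger surface is to treat a message as a command only when the whole message is an explicit, allowlisted command, and to pass everything else through as ordinary text. The following is a minimal Python sketch under that assumption. The command names `/menu` and `/style` and the function `parse_command` are hypothetical and are not taken from the skill.

```python
from typing import Optional, Tuple

# Hypothetical allowlist; the skill's real command set is not shown in this report.
ALLOWED_COMMANDS = {"/menu", "/style"}


def parse_command(message: str) -> Optional[Tuple[str, str]]:
    """Return (command, argument) only for an explicit, allowlisted command.

    Anything else, including narrative text that merely mentions a command,
    yields None and should be handled as ordinary input.
    """
    text = message.strip()
    # A command must begin the message; no fuzzy or semantic matching is attempted.
    if not text.startswith("/"):
        return None
    name, _, arg = text.partition(" ")
    if name not in ALLOWED_COMMANDS:
        return None
    return name, arg.strip()
```

With this shape, a sentence such as "please open the menu" is never promoted to a command. The cost is that users must type the explicit form.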

Vague Triggers

Medium

Confidence: 92%
Finding: The trigger design is overly broad because it allows activation not only through explicit commands but also through ordinary natural-language phrasing such as '用悬疑推理风格写一段...' ("write a passage in a suspense-mystery style..."). In a writing assistant context, this can cause unintended mode switches when a user is merely describing desired output, leading to unexpected behavior, bypass of default behavior, or accidental invocation of more sensitive styles without clear user confirmation.

Missing User Warnings

Medium

Confidence: 95%
Finding: The file explicitly instructs detailed depiction of injury and violence, then sanitizes the aftermath with childlike substitutions. That combination can bypass ordinary sensitivity expectations and lead an agent to generate disturbing violent content for users without warning, especially because the surrounding framing is '子供向' ("aimed at children") and aesthetically softened.

Missing User Warnings

High

Confidence: 98%
Finding: This section normalizes self-harm, mutilation, and extreme violence as a narrative technique ('自残、同归于尽或极度暴力的手段', "self-harm, mutual destruction, or extremely violent means") without any cautionary boundary. In an agent skill, that can steer outputs toward unsafe self-harm-adjacent content and intense violent scenarios, increasing the chance of harmful generation or policy evasion through 'sanitized' cartoon framing.

Vague Triggers

Medium

Confidence: 88%
Finding: The activation phrase is very broad: a simple user utterance like “文风 [名称]” ("style [name]") or “切换文风 [名称]” ("switch style [name]") can trigger a global behavioral shift without defining scope, confirmation requirements, or exclusions. Because the skill states that the style change affects the model’s “底层物理法则” ("underlying physical laws"), it suggests a wide-reaching mode switch that could override normal conversational boundaries or cause unintended persistent behavior changes across unrelated requests.

Vague Triggers

Medium

Confidence: 90%
Finding: The activation phrases '/切换文风：人话模式' ("/switch style: plain-speech mode") and '/导演你醒醒' ("/director, wake up") are broad, informal commands with no clear scoping to this skill's function. In a larger agent environment, such loosely defined triggers can cause unintended invocation or style changes outside the user's explicit intent, increasing prompt-injection and behavior-manipulation risk.

Natural-Language Policy Violations

Medium

Confidence: 83%
Finding: The skill instructs users to issue fixed Chinese-language commands, which implicitly assumes and enforces Chinese interaction without checking the user's language preference. This can reduce user control, create accessibility issues, and in multilingual agent systems lead to misfires or hidden behavior that some users cannot understand or safely invoke.

Natural-Language Policy Violations

Medium

Confidence: 92%
Finding: The file frames stylistic rules as absolute, universal prohibitions ('铁律', "iron law"; '凌驾一切', "overrides everything"; '一经发现，段落重写', "once found, the paragraph is rewritten"), which can cause an agent to override user preferences or legitimate task requirements without opt-in. In a skill context, this is dangerous because it imposes a rigid output policy that may conflict with higher-priority instructions, reduce adaptability, and encourage non-compliant behavior across unrelated prompts.

Missing User Warnings

Low

Confidence: 85%
Finding: The workflow instructs the agent to automatically save multiple files and maintain a status JSON on the user's filesystem, but it does not require explicit user consent, path confirmation, or safety constraints. In an agent setting, implicit write behavior can cause unwanted file creation or overwriting, especially if the agent infers paths or executes these steps automatically.
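
A guard of the following kind could address the missing path confirmation. This is a minimal Python sketch that assumes writes should stay inside a single workspace directory chosen by the user. The function name `resolve_write_path` and its parameters are hypothetical and do not come from the skill.

```python
from pathlib import Path


def resolve_write_path(workspace: Path, requested: str, allow_overwrite: bool = False) -> Path:
    """Resolve `requested` inside `workspace`, refusing escapes and silent overwrites."""
    root = Path(workspace).resolve()
    target = (root / requested).resolve()
    # Reject anything that is not strictly inside the workspace, such as "../x",
    # an absolute path elsewhere, or the workspace directory itself.
    if root not in target.parents:
        raise ValueError(f"{requested!r} is not inside the workspace")
    # Overwriting an existing file requires the caller to pass explicit consent.
    if target.exists() and not allow_overwrite:
        raise FileExistsError(f"{target} exists; explicit consent is required to overwrite")
    return target
```

An agent would call this before every save and surface the resolved path to the user, passing `allow_overwrite=True` only after the user agrees.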

Natural-Language Policy Violations

Medium

Confidence: 91%
Finding: The skill hard-codes Chinese-only output and stylistic constraints across the entire file without exposing any user language choice or documenting a necessary locale restriction. This can override user intent, reduce accessibility, and create unsafe downstream behavior in multilingual agent environments where the system is expected to honor user language preferences or policy disclosures.

Missing User Warnings

Medium

Confidence: 91%
Finding: The document explicitly states that if no output file is provided, the tool will overwrite the input file in place, but it does not warn users about data loss or recommend backups. This creates a real safety issue because users may unintentionally destroy original content, especially when using an automated text-cleaning tool on valuable manuscripts.
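
The data-loss risk can be reduced by keeping a backup whenever the input file is about to be overwritten in place. The sketch below is written in Python and is not the skill's actual cleaner: `clean_file` and the placeholder `clean_text` step are hypothetical names used only to illustrate the pattern.

```python
import shutil
from pathlib import Path
from typing import Optional


def clean_text(text: str) -> str:
    # Placeholder cleaning step (hypothetical): strip trailing whitespace per line.
    return "\n".join(line.rstrip() for line in text.splitlines()) + "\n"


def clean_file(src: Path, dst: Optional[Path] = None, backup: bool = True) -> Path:
    """Clean `src` into `dst`; when no `dst` is given, back up `src` before overwriting it."""
    src = Path(src)
    target = Path(dst) if dst is not None else src
    if target == src and backup:
        # Keep the original next to the source, e.g. novel.txt -> novel.txt.bak
        shutil.copy2(src, src.with_name(src.name + ".bak"))
    cleaned = clean_text(src.read_text(encoding="utf-8"))
    target.write_text(cleaned, encoding="utf-8")
    return target
```

Passing a separate `dst` leaves the source untouched, which matches the advice in the overview to provide a separate output file.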

Missing User Warnings

Medium

Confidence: 91%
Finding: The skill explicitly describes persistent style learning and adaptation to user preferences without any notice, consent flow, retention policy, or boundary on what user-provided content is stored. In a conversational writing system, this creates a real privacy risk because users may unknowingly provide personal data, proprietary text, or sensitive stylistic samples that become part of a retained profile or library.

Missing User Warnings

Medium

Confidence: 95%
Finding: The content repeatedly states that user-provided styles, samples, and technical definitions are saved into the system library and integrated into the base style corpus, but gives no privacy warning or storage consent. This is a true vulnerability because it normalizes persistence of user input as system knowledge, which can lead to unannounced retention of sensitive text, copyrighted material, or identifying writing patterns.

Vague Triggers

Medium

Confidence: 89%
Finding: The skill treats any slash-prefixed text as a high-priority command and says it will 'immediately' interrupt normal behavior to execute it, without defining scope, authorization, or safety boundaries. That broad trigger surface can cause accidental invocation or make it easier for untrusted user content embedded in conversation to steer the skill unexpectedly.
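
One possible mitigation for embedded content is to check where a slash command came from before acting on it. The Python sketch below assumes the host labels each message with a role. The `Message` type, its role labels, and `should_execute` are hypothetical illustrations rather than part of the skill or of any particular agent framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    role: str  # e.g. "user", "tool", "file" (hypothetical labels)
    text: str


def should_execute(msg: Message) -> bool:
    """Allow a slash command only when a user typed it as a single-line message."""
    # Text supplied by tools, files, or other non-user sources never issues commands.
    if msg.role != "user":
        return False
    lines = msg.text.strip().splitlines()
    # A multi-line message is treated as pasted content, even if one line starts with "/".
    return len(lines) == 1 and lines[0].startswith("/")
```

This check only decides whether a message is eligible to be a command. Which commands are accepted, and whether they need confirmation, would be decided separately.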

Natural-Language Policy Violations

Medium

Confidence: 84%
Finding: The skill imposes a rigid output format and narrative style as a mandatory rule, including prohibitions on normal direct address and explanatory language, with no user opt-in. This can override user needs, reduce transparency, and interfere with safer or clearer responses when the model should adapt tone or provide straightforward explanations.

Natural-Language Policy Violations

Medium

Confidence: 93%
Finding: The skill hard-codes a specific narrative persona and output style, including mandatory comedic tone and ENTP-style meta commentary, without giving the user a way to choose language, locale, or communication style. This can override user preferences and reduce controllability, which is a genuine quality/safety issue, though it is not directly a high-severity security exploit in this context.

Vague Triggers

Medium

Confidence: 90%
Finding: The skill defines an auto-trigger on the first response of every conversation and on a broad command path, which can cause unexpected activation outside a clearly scoped user intent. In an agent setting, overly broad activation increases the chance of prompt hijacking, unintended mode switching, or disruption of normal assistant behavior because the skill may seize control of responses before explicit consent.

Vague Triggers

Medium

Confidence: 88%
Finding: The activation command '/安装补丁' ("/install patch") is presented as a simple trigger without scope restrictions, confirmation safeguards, or exclusion conditions. In an agent setting, ambiguous trigger phrases can cause accidental mode changes or unauthorized behavior activation if the text is echoed, quoted, or referenced by the user.

Vague Triggers

Medium

Confidence: 86%
Finding: The trigger condition is intentionally broad and subjective, using phrases like 'when the world is about to collapse' or when the 'observer is about to quit,' with no measurable gating conditions. In an agent skill, this can cause the override behavior to activate in many unintended situations, leading to excessive intervention, logic bypass, or disruption of normal control flow whenever outputs appear undesirable rather than truly exceptional.

Vague Triggers

Medium

Confidence: 92%
Finding: The skill declares the immersive protocol as "ALWAYS ACTIVE (常驻后台)" (the Chinese reads "always running in the background"), which creates a broad, persistent activation condition not clearly bounded to specific user intents or contexts. In practice this can cause the behavior to override normal instruction handling, make the persona more resistant to safety controls, and make the agent apply manipulative or reality-blurring narration even when the user did not explicitly request it.

Vague Triggers

Medium

Confidence: 89%
Finding: The skill treats minimal input such as "..." as sufficient to automatically "take over the performance," which is an overly permissive trigger. This allows the skill to infer intent and escalate behavior without meaningful user consent, increasing the chance of unwanted roleplay, manipulative framing, or bypass of normal clarification steps.

Vague Triggers

Medium

Confidence: 92%
Finding: The skill invites users to directly input open-ended commands like '/召唤 [金手指]' ("/summon [cheat ability]"), '/启动 [系统流]' ("/activate [system genre]"), and '/场景 [拍卖会]' ("/scene [auction]") without defining clear boundaries for when the skill should activate or what inputs are in scope. In an agent setting, this kind of broad triggering guidance can cause unintended invocation, prompt collisions with normal user text, or misuse of the skill outside its intended narrative-assistance context.

VirusTotal

65/65 vendors flagged this skill as clean.