Security audit

Web Novel Master

Security checks across malware telemetry and agentic risk

Overview

This is a coherent Chinese web-novel writing skill that writes local draft/project files and stores local preferences, with no evidence of exfiltration, credential access, destructive behavior, or hidden installation code.

Install if you want a Chinese web-novel assistant that can create local project folders, draft many chapter files, update planning JSON, and remember preferences locally. Review the generated plan before allowing long or parallel drafting, and delete or reset user-preferences.json if you do not want creative preferences or creation history retained. Treat coercive-romance and threat-heavy templates as optional genre references, not safe default guidance.
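The reset step mentioned above can be sketched as a small helper. This is a minimal sketch, not part of the skill: the function name `reset_preferences` is hypothetical, the location of `user-preferences.json` is left to the caller because the audit does not state it, and whether the skill accepts an empty JSON object as a valid preferences file is an assumption.

```python
import json
from pathlib import Path


def reset_preferences(path: Path, keep_file: bool = False) -> bool:
    """Delete the preferences file, or overwrite it with an empty JSON object.

    Returns True if a file existed at `path` before the call.
    """
    existed = path.is_file()
    if keep_file:
        # Assumption: the skill tolerates an empty object as "no preferences".
        path.write_text(json.dumps({}), encoding="utf-8")
    elif existed:
        path.unlink()
    return existed
```

Calling `reset_preferences(Path("user-preferences.json"))` from the directory that holds the file removes it; passing `keep_file=True` leaves an empty file in place instead.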

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (17)

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The file defines persistent storage of user preferences in `user-preferences.json` and instructs the system to silently sync phase results into that file. This creates unnecessary retention of user data beyond the immediate writing task, increasing privacy risk and making accidental over-collection likely if users share sensitive tastes, history, or creative plans.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The preference system stores profiling-style data such as favorite genres, style references, dislikes, and `creationHistory`, which can accumulate a behavioral record of the user. For a novel-writing assistant, some customization is reasonable, but this breadth of history tracking is not clearly necessary and becomes a privacy risk when persisted automatically.

Vague Triggers

Medium

Confidence: 86%
Finding: The activation text uses broad everyday writing prompts such as '写小说' ("write a novel") and '创作故事' ("create a story"), which can cause the skill to trigger in situations where the user did not intend long-form web-novel workflow behavior. Over-broad triggering can hijack normal conversations, override more appropriate tools, and unexpectedly pull the interaction into a multi-phase process.
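One way to narrow such a trigger is to require a web-novel-specific marker alongside the generic phrase. The sketch below only illustrates that idea: the two generic phrases come from the finding above, while `SCOPED_MARKERS` and the function `should_activate` are hypothetical and do not reflect the skill's actual trigger metadata, which this audit does not reproduce.

```python
# Generic phrases quoted in the finding.
GENERIC_TRIGGERS = ("写小说", "创作故事")
# Hypothetical scope markers ("web novel", "online novel", "serialization").
SCOPED_MARKERS = ("网文", "网络小说", "连载")


def should_activate(message: str) -> bool:
    """Activate only when a generic trigger co-occurs with a scope marker."""
    has_generic = any(trigger in message for trigger in GENERIC_TRIGGERS)
    has_scope = any(marker in message for marker in SCOPED_MARKERS)
    return has_generic and has_scope
```

Under this rule a bare request to write a novel no longer activates the skill; the user has to signal web-novel intent as well.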

Vague Triggers

Medium

Confidence: 88%
Finding: The metadata trigger list is ambiguous and lacks boundaries, so the platform may activate this skill for loosely related writing tasks. Ambiguous triggers increase the chance of unintended invocation, which can degrade user control and route content into an unsuitable workflow with persistence/continuation assumptions.

Natural-Language Policy Violations

Medium

Confidence: 92%
Finding: The file content is entirely in Chinese and provides no indication that users can request or receive output in another language. This can cause the skill to ignore user language preferences, reduce transparency, and create accessibility and usability issues, especially in multilingual environments. In this context it is not a direct security exploit, but it is still a policy/quality vulnerability because the skill behavior may be improperly constrained without user consent.

Missing User Warnings

Medium

Confidence: 94%
Finding: This section contains romanticized coercive relationship dynamics such as possessive language, threats, and forceful physical actions (for example, control-oriented declarations and intimidation) presented as reusable writing templates without any safety framing. In a creative-writing skill, that can normalize abusive behavior and encourage the generation of harmful relationship content, especially when users ask for romance or teen-oriented scenes.

Missing User Warnings

Medium

Confidence: 96%
Finding: These templates include explicit threats of killing, severe retaliation, and violent domination as punchline-style 'cool lines' for users to directly reuse. Because the skill is designed to help generate compelling fiction scenes, presenting such threats without warning or moderation guidance increases the likelihood that the assistant will produce glamorized violent or revenge-oriented content on demand.

Natural-Language Policy Violations

Medium

Confidence: 94%
Finding: This reference file contains exemplar lines and trope guidance that normalize coercive, possessive, and explicitly non-consensual romantic behavior, including phrases like '你只能是我的' ("you can only be mine") and '霸王硬上弓' (an idiom for forcing oneself on someone). In a writing-assistant skill, such content can directly steer generated output toward sexual coercion or abusive relationship framing without consent safeguards, making policy-violating generation more likely.

Missing User Warnings

Low

Confidence: 91%
Finding: The flow instructs the agent to create a local project directory and write a JSON planning file without any explicit user notice, confirmation step, or constraint on when filesystem changes are allowed. This creates an unauthorized side-effect risk: an agent following the skill literally could write to disk during what appears to be a clarification phase, violating user expectations and potentially causing unwanted persistence.

Missing User Warnings

Medium

Confidence: 95%
Finding: The workflow explicitly directs the agent to proceed without user confirmation while reading and modifying project files, including chapter outputs and the writing-plan JSON. This creates an autonomy/consent problem: the agent can make persistent file changes the user did not explicitly approve in the moment, increasing the risk of unexpected overwrites or large-scale edits.

Missing User Warnings

Medium

Confidence: 96%
Finding: The sub-agent and team-member prompts suppress AskUserQuestion while those agents are authorized to write files and update task/state data autonomously. In a multi-agent setting this is more dangerous because multiple workers can make persistent changes concurrently, reducing visibility and making unintended modifications harder to catch before they spread.

Vague Triggers

Medium

Confidence: 91%
Finding: The activation condition only says the selector is used after 'Phase 0 初始化完成后，用户选择创作模式' ("after Phase 0 initialization completes, the user selects a creation mode"), but it does not define strong input gating or clear criteria for when this flow should be invoked. In an agent setting, ambiguous triggers can cause the wrong workflow to activate on loosely related user prompts, leading to unintended behavior, mode confusion, or bypass of safer/scoped flows.

Natural-Language Policy Violations

Medium

Confidence: 82%
Finding: The file presents the mode selector entirely in Chinese and does not offer any language negotiation or user-language fallback. This can cause misunderstanding of workflow choices, incorrect user selections, and reduced transparency for non-Chinese users, which is a genuine safety and usability issue, though not a high-severity security flaw in this writing-assistant context.

Missing User Warnings

Medium

Confidence: 95%
Finding: The workflow explicitly instructs the agent to create chapter files and update a JSON tracking file as part of normal execution, but provides no requirement to notify the user or obtain explicit consent before making local file changes. In an agent environment, silent repeated writes can surprise users, overwrite prior work, or modify unintended workspace content, especially when combined with automatic continuation across many chapters.
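The missing safeguard described above can be illustrated with a write wrapper that asks before every create or overwrite and refuses paths outside the project folder. This is a sketch under stated assumptions, not the skill's code: `guarded_write` and the `confirm` callback are hypothetical names, and how the agent would actually surface the prompt to the user is left open. `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path
from typing import Callable


def guarded_write(project_root: Path, relative_path: str, text: str,
                  confirm: Callable[[str], bool]) -> bool:
    """Write `text` under `project_root` only after `confirm` approves.

    Returns True if the file was written, False if the user declined.
    Raises ValueError if the target would escape `project_root`.
    """
    root = project_root.resolve()
    target = (root / relative_path).resolve()
    if not target.is_relative_to(root):
        raise ValueError(f"refusing to write outside project: {target}")
    # Tell the user whether this is a new file or an overwrite.
    action = "overwrite" if target.exists() else "create"
    if not confirm(f"{action} {target}?"):
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return True
```

In an interactive session `confirm` could be as simple as `lambda prompt: input(prompt + " [y/N] ").lower() == "y"`.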

Missing User Warnings

Medium

Confidence: 97%
Finding: The instruction '全程无需向用户确认，逐章创作直到完成' ("no user confirmation is needed at any point; write chapter by chapter until finished") removes a key safety boundary by telling the agent to continue autonomous multi-step work without additional confirmation. In context, this is dangerous because the same flow also includes file creation and status updates, enabling repeated modifications without renewed consent or opportunities for the user to interrupt unintended behavior.
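A checkpointed loop is one way to restore the interruption point this finding says is missing. The sketch below is illustrative only: `draft_with_checkpoints`, `write_chapter`, `confirm`, and the default `batch_size` of 5 are all hypothetical and are not taken from the skill.

```python
from typing import Callable, Iterable, List


def draft_with_checkpoints(chapters: Iterable[int],
                           write_chapter: Callable[[int], None],
                           confirm: Callable[[str], bool],
                           batch_size: int = 5) -> List[int]:
    """Draft chapters in order, pausing for approval before each batch.

    Returns the list of chapter numbers actually written.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    written: List[int] = []
    for index, chapter in enumerate(chapters):
        # Ask before the first chapter of every batch, including the first.
        if index % batch_size == 0 and not confirm(
                f"Continue drafting from chapter {chapter}?"):
            break
        write_chapter(chapter)
        written.append(chapter)
    return written
```

A smaller `batch_size` gives the user more chances to stop a long run at the cost of more prompts.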

Missing User Warnings

Medium

Confidence: 96%
Finding: The instruction to 'silently' synchronize user preferences after each phase means data is persisted without a prominent notice or explicit consent at the time of collection. Silent persistence is dangerous because users may reveal sensitive personal interests or writing intentions assuming they are only part of an ephemeral conversation.
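The alternative to silent synchronization is to gate the write on an explicit opt-in. The following is a minimal sketch assuming the preferences file is a flat JSON object; the function `sync_preferences` and its `consent` flag are hypothetical, and the real schema of `user-preferences.json` is not shown in this audit.

```python
import json
from pathlib import Path
from typing import Any, Dict


def sync_preferences(path: Path, updates: Dict[str, Any],
                     consent: bool) -> Dict[str, Any]:
    """Merge `updates` into the JSON file at `path` only when `consent` is True.

    Returns the merged preferences for use in the current session either way.
    """
    current: Dict[str, Any] = {}
    if path.is_file():
        current = json.loads(path.read_text(encoding="utf-8"))
    merged = {**current, **updates}
    if consent:
        # Persist only after the user has opted in.
        path.write_text(json.dumps(merged, ensure_ascii=False, indent=2),
                        encoding="utf-8")
    return merged
```

Without consent the preferences still apply within the session but nothing reaches the disk.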

Ssd 3

Medium

Confidence: 97%
Finding: The natural-language instructions explicitly require ongoing retention of user preferences and appending finished works to `creationHistory`, creating a durable record of user behavior. In the context of a creative-writing skill, this is more dangerous because users may disclose highly personal themes, fantasies, or autobiographical material that they do not expect to be stored long term.

VirusTotal

64/64 vendors flagged this skill as clean.


Static analysis

No suspicious patterns detected.