Security audit

Story Setup

Security checks across malware telemetry and agentic risk

Overview

This is a mostly coherent writing-project setup skill, but it installs persistent hooks and browser/session automation pathways that deserve careful review before use.

Install only in a writing project where persistent hooks, project-file modifications, git-hook warnings, and optional browser-based research are acceptable. Review the generated deployment plan, avoid enabling the browser workflows that use CDP (Chrome DevTools Protocol) unless you are comfortable reusing an authenticated browser session, and disable the update check with `STORY_NO_UPDATE_CHECK=1` if automatic GitHub requests are not acceptable.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration

Findings (29)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 97%
Finding: The skill orchestrates file reads, shell commands, and broad project modifications but does not declare permissions up front. That makes the trust boundary unclear for users and increases the chance they invoke a powerful deployment skill without understanding it will inspect the workspace, execute CLI commands, and write config/hooks into multiple locations.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 99%
Finding: The skill is described as setup/deployment only, but its own spec deploys enforcement hooks that block writes, scan prose contents, inspect git changes, inject session context, and even perform outbound version checks. This mismatch is dangerous because a user may consent to 'infrastructure setup' without realizing they are also installing persistent monitoring/enforcement behavior that affects future sessions and commits.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The agent is instructed to run local commands such as `git rev-parse`, Python-based file length checks, and optional Node-based scans. For a prose-writing agent, this expands its capability from content editing into local environment interaction, which increases the attack surface: a prompt-injected workflow could steer it into touching unintended files, invoking local interpreters, or relying on project scripts that may themselves be unsafe.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The instructions require the agent to create or update `追踪/上下文.md` (literally "tracking/context.md") and related tracking state outside the core output file. This gives a writing agent authority to modify project metadata and persistent state, which can be abused to overwrite user-maintained files, poison downstream context, or create hidden prompt-persistence channels for later agent runs.

Context-Inappropriate Capability

Medium

Confidence: 87%
Finding: The skill instructs the subagent to run Python or shell commands for word-count verification, which goes beyond pure narrative editing and introduces executable-behavior expectations. Even though the commands shown are benign, normalizing interpreter/shell use in a content-writing subagent increases attack surface and can lead to unsafe command adaptation or misuse when file paths or follow-on instructions are attacker-controlled.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The skill silently expands from narrative writing into project-state management by requiring creation and modification of `追踪/上下文.md` and possibly the `追踪/` directory. This is a capability expansion not reflected in the higher-level description, so users may authorize a writing helper without realizing it will persistently alter tracking metadata and project structure.

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The document explicitly limits behavior to creating new files and forbids modifying existing files, but the manifest grants broad edit capability. That mismatch is dangerous because the runtime permission model would still allow overwrites or edits in the user project if the agent is prompted incorrectly, compromised, or simply mis-implements its own instructions.
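
To illustrate how a create-only rule could be enforced in code rather than in prose alone, the sketch below refuses to touch a path that already exists. It is a minimal sketch under stated assumptions, not the skill's implementation; `create_new_file` is a hypothetical helper name.

```python
from pathlib import Path


def create_new_file(path: Path, content: str) -> bool:
    """Write content only if path does not exist yet; never overwrite.

    Hypothetical helper: returns True when the file was created and
    False when an existing file was left untouched.
    """
    try:
        # Mode "x" raises FileExistsError instead of truncating an existing file.
        with open(path, "x", encoding="utf-8") as handle:
            handle.write(content)
    except FileExistsError:
        return False
    return True
```

Because the existence check and the creation happen in a single `open` call, a second write to the same path fails instead of silently replacing the user's file.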

Description-Behavior Mismatch

High

Confidence: 95%
Finding: This file introduces a browser automation capability via CDP, including data scraping and browser control, which is materially unrelated to the stated purpose of the parent skill: deploying writing-project infrastructure. That mismatch expands the skill's effective authority and can mislead users or downstream agents into invoking powerful browser actions under the umbrella of a benign setup tool.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: The command explicitly advertises controlling Chrome through CDP and reusing an existing logged-in session, which can expose authenticated data and enable actions on behalf of the user. In the context of a writing-environment setup skill, these are unjustified privileged capabilities and create a serious risk of account misuse, data exfiltration, or unauthorized web actions.

Description-Behavior Mismatch

High

Confidence: 95%
Finding: This command file invokes a different skill ('story-long-scan') than the declared purpose of the enclosing skill ('story-setup'), creating a capability mismatch. In an agent environment, this can cause unintended execution paths, user confusion, and routing to higher-risk or unrelated functionality without clear consent.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The agent is declared read-only and only granted Read/Glob/Grep, yet the instructions require executing `git rev-parse --show-toplevel` to determine the project root. This creates a capability/instruction mismatch: an orchestrator or future implementation may silently add shell execution to satisfy the prompt, expanding the agent's effective privileges beyond its declared boundary.
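
A read-only agent may not need a shell for this step at all. The sketch below approximates the project-root lookup by walking up the directory tree until it finds a `.git` entry. `find_project_root` is a hypothetical name, and this is a substitute technique rather than what the skill does: unlike `git rev-parse --show-toplevel`, it does not honor settings such as the `GIT_DIR` environment variable.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path) -> Optional[Path]:
    """Return the nearest ancestor of start (inclusive) containing a .git entry.

    Hypothetical helper: uses only filesystem reads, no subprocess.
    """
    current = start.resolve()
    for candidate in (current, *current.parents):
        # .git is a directory in a normal clone and a file in worktrees/submodules.
        if (candidate / ".git").exists():
            return candidate
    return None
```

The function returns `None` when no ancestor contains `.git`, which lets the caller report "not a git project" instead of escalating to shell access.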

Intent-Code Divergence

High

Confidence: 97%
Finding: The file explicitly states the agent must not use Bash and is strictly read-only, but later instructs it to execute a shell command. Contradictory safety constraints are dangerous because they normalize policy bypass: implementations may ignore tool restrictions, auto-escalate privileges, or route execution through less-audited paths to satisfy the instruction.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: This research-focused agent is granted Bash even though its primary stated role is searching, extracting, and writing reference material. Bash materially expands capability to execute arbitrary shell commands against the local environment, which becomes risky because the agent also consumes untrusted web content and user-provided parameters such as project_dir and cdp_port; prompt injection or unsafe command construction could turn a research task into local command execution.
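
The unsafe-command-construction risk can be narrowed by validating `project_dir` and `cdp_port` before they reach any command. The following is a minimal sketch, assuming the port arrives as a string and that the host defines one directory projects must live under; the function names and the `allowed_root` parameter are hypothetical and not part of the skill.

```python
from pathlib import Path


def validate_cdp_port(raw: str) -> int:
    """Accept only a plain ASCII decimal port in the unprivileged range."""
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"cdp_port must be digits only: {raw!r}")
    port = int(raw)
    if not 1024 <= port <= 65535:
        raise ValueError(f"cdp_port out of range: {port}")
    return port


def validate_project_dir(raw: str, allowed_root: Path) -> Path:
    """Resolve raw and require the result to sit inside allowed_root."""
    resolved = Path(raw).resolve()
    root = allowed_root.resolve()
    # Resolving first collapses ".." segments and symlinks before the check.
    if resolved != root and root not in resolved.parents:
        raise ValueError(f"project_dir escapes {root}: {resolved}")
    return resolved
```

Rejecting anything that is not a bare number or a path under the expected root means a value such as `9222; echo pwned` fails validation rather than being spliced into a shell string.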

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The hook's stated purpose is to display project status and writing context, but it also performs an outbound GitHub API request for version checking during session start. Even though the request is read-only and rate-limited, it creates undeclared network behavior, leaks client metadata typical of HTTP requests, and expands the trust boundary of a startup hook that users may expect to be local-only.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: This code initiates unsolicited external network access to GitHub from a hook whose functional role is unrelated to remote communication. In a hook that runs automatically at session start, even passive telemetry-like behavior is risky because it can surprise users, fail in restricted environments, and disclose environment metadata without a clear prompt.
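
The overview mentions `STORY_NO_UPDATE_CHECK=1` as the opt-out for this request. The sketch below shows one way a session-start hook could gate the network call on that variable. How the real hook parses the variable is not shown in this report, so the exact comparison is an assumption, and `update_check_enabled`, `latest_version_or_none`, and the `fetch_latest_version` callable are hypothetical names.

```python
import os
from typing import Callable, Mapping, Optional


def update_check_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return False when STORY_NO_UPDATE_CHECK is set to "1" (assumed semantics)."""
    return env.get("STORY_NO_UPDATE_CHECK") != "1"


def latest_version_or_none(
    fetch_latest_version: Callable[[], str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Call the fetcher only when the opt-out variable is not set."""
    if not update_check_enabled(env):
        # Opted out: return before any network code runs.
        return None
    return fetch_latest_version()
```

Passing the fetcher in as a parameter keeps the gate testable without network access and makes it easy to confirm that no request is issued when the user has opted out.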

Vague Triggers

Medium

Confidence: 91%
Finding: The trigger phrases include broad natural-language requests like '帮我搭一下环境' ("help me set up the environment") and '配置写作项目' ("configure the writing project"), which can accidentally activate a skill that performs extensive filesystem changes. Because this skill deploys hooks, agents, and config across multiple toolchains, unintended invocation can lead to surprising persistent modifications in the user's repository.

Missing User Warnings

Medium

Confidence: 98%
Finding: The opening description does not prominently warn that the skill will create and modify many files across the project, including hooks, agent definitions, config files, and git hooks. For a deployment skill with persistent side effects, the lack of an upfront modification warning undermines informed consent and increases the risk of unsafe installation into sensitive repositories.

Natural-Language Policy Violations

Medium

Confidence: 90%
Finding: The file is entirely written in Chinese and does not offer any language-selection mechanism or fallback, which can force a locale on users who operate in other languages. In this skill context, that can reduce usability, increase misunderstanding of instructions, and make the deployed writing infrastructure less accessible, though it does not create a direct code-execution or data-exfiltration risk.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill mandates direct Write/Edit operations whenever a file path is present and instructs the agent not to return content, which removes a natural review checkpoint before modification. This can cause unintended file changes, overwrite user content, or hide broad edits behind a short summary, especially when invoked transitively by another agent.
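
One way to restore the review checkpoint described here is to show a diff of the proposed change before anything is written. The following is a minimal sketch using the standard library's `difflib`; `preview_edit` is a hypothetical helper, not something the skill provides.

```python
import difflib


def preview_edit(path: str, old_text: str, new_text: str) -> str:
    """Return a unified diff of the proposed change without touching the file.

    Hypothetical helper: an empty string means the edit changes nothing.
    """
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"{path} (current)",
        tofile=f"{path} (proposed)",
    )
    return "".join(diff)
```

A caller could present this output to the user, or to the invoking agent, and perform the write only after explicit approval.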

Missing User Warnings

Medium

Confidence: 96%
Finding: The automatic update of `追踪/上下文.md` and creation of the `追踪/` directory are undisclosed side effects that persist beyond the immediate writing task. Hidden persistence is risky because it can clutter repositories, overwrite existing tracking practices, and create trust issues when users believe they requested only content generation.

Missing User Warnings

Low

Confidence: 73%
Finding: The skill is designed to write a research file into the user's project directory, but the description does not clearly warn users that files will be created and may collide with existing paths or filenames. In context this is less severe than arbitrary code execution, but it still creates a safety and integrity risk because users may not realize the agent performs persistent filesystem changes.

Vague Triggers

Medium

Confidence: 79%
Finding: The invocation text is generic and does not constrain when or why the command should run, which increases the chance of accidental or inappropriate triggering in conversational workflows. In a multi-skill agent, broad trigger language can cause unintended skill activation and execution of unrelated actions.

Vague Triggers

Medium

Confidence: 93%
Finding: The command text is a generic instruction to use the skill for long-form writing help, which can overlap with ordinary user requests about writing. In agent systems that auto-route based on trigger phrasing or command references, this broad invocation increases the chance of unintended skill activation and unexpected behavior in unrelated writing contexts.

Natural-Language Policy Violations

Medium

Confidence: 88%
Finding: The description and invocation are written only in Chinese and imply Chinese-language usage without any user-choice mechanism. This can cause user intent mismatch, unexpected language switching, or degraded safety/usability if the surrounding system or user expects another language.

Vague Triggers

Medium

Confidence: 87%
Finding: The command text simply says to use the story-setup skill to deploy infrastructure, but it does not define any trigger constraints, scope limits, or confirmation requirements before making project-wide changes. In a setup skill that installs hooks, rules, agents, and command files into a user project, this ambiguity increases the chance of unintended invocation or overbroad execution that modifies files the user did not explicitly approve.

VirusTotal

65/65 vendors flagged this skill as clean.

Static analysis

Detected: suspicious.dangerous_exec

Shell command execution detected (child_process).

Critical

Code: suspicious.dangerous_exec
Location: references/opencode/plugin.ts:16