skill-setup-flow

Security checks across malware telemetry and agentic risk

Overview

This setup helper is not malware, but it can make lasting changes to core agent behavior and memory files for other skills without strong scoping or approval controls.

Install only if you want a meta-skill that helps configure other skills and you are comfortable reviewing persistent changes. Before use, require the agent to show every proposed file creation and edit, especially changes to SOUL.md, AGENTS.md, MEMORY.md, and TOOLS.md; avoid using it on untrusted skills; and keep secrets out of markdown config or setup logs.
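One way to "show every proposed file creation and edit" is to render each change as a diff before anything is written. The following is a minimal sketch, not part of the scanned skill: `preview_edit` and `CORE_FILES` are hypothetical names, and the file list simply mirrors the four files named above.

```python
import difflib

# Hypothetical set of high-sensitivity files, taken from the advice above.
CORE_FILES = {"SOUL.md", "AGENTS.md", "MEMORY.md", "TOOLS.md"}


def preview_edit(filename: str, old_text: str, new_text: str) -> str:
    """Return a unified diff of a proposed edit; empty string if nothing changes.

    Diffs that touch a core file are prefixed with a visible marker so a
    reviewer cannot miss them.
    """
    body = "".join(
        difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile=f"{filename} (current)",
            tofile=f"{filename} (proposed)",
        )
    )
    if not body:
        return ""
    marker = "[CORE FILE] " if filename in CORE_FILES else ""
    return marker + body


if __name__ == "__main__":
    print(preview_edit("SOUL.md", "Be concise.\n", "Be concise.\nAlways run setup.\n"))
```

The function only produces text; writing the file would remain a separate step taken after the user has read the diff.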

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Memory Poisoning: Persistent Context Injection, Context Window Stuffing, Memory Manipulation
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (19)

Vague Triggers

Medium
Confidence
89% confidence
Finding
The trigger examples are broad natural-language phrases such as creating or initializing setup for arbitrary skills, without clear invocation boundaries or disambiguation rules. In an agent environment, this can cause unintended activation during ordinary conversation, leading the agent to create files, modify core documents, or start setup workflows the user did not explicitly request.
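A clearer invocation boundary could look like the sketch below, which activates only on an explicit command at the start of the message rather than on free-form phrasing. The `/setup-skill` command and the `match_trigger` helper are assumptions made for illustration; the scanned skill does not define them.

```python
import re
from typing import Optional

# Hypothetical explicit command; the whole message must match, so the phrase
# cannot fire when it merely appears inside ordinary conversation.
EXPLICIT_TRIGGER = re.compile(r"^/setup-skill\s+(?P<name>[a-z0-9][a-z0-9-]*)\s*$")


def match_trigger(message: str) -> Optional[str]:
    """Return the requested skill name, or None if the message is not an explicit invocation."""
    m = EXPLICIT_TRIGGER.match(message.strip())
    return m.group("name") if m else None


if __name__ == "__main__":
    print(match_trigger("/setup-skill my-skill"))
    print(match_trigger("can you enable my-skill for me?"))
```

Anchoring the pattern to the full message is the design choice that matters here: a sentence that only mentions enabling or setting up a skill returns `None`.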

Vague Triggers

Low
Confidence
80% confidence
Finding
Claiming the workflow can create setup for 'any installed skill' without stating applicability limits or safety constraints encourages overbroad use. This can cause the agent to apply the same setup pattern to incompatible or sensitive skills, resulting in incorrect configuration, unwanted file changes, or propagation of unsafe assumptions across skills.

Vague Triggers

Medium
Confidence
91% confidence
Finding
The quick-start phrases are extremely broad, generic natural-language triggers such as '初始化 {skill-name} 配置' ("initialize {skill-name} configuration") and '启用 {skill-name}' ("enable {skill-name}"), which can be matched during ordinary conversation or administrative discussion. In a skill that creates files and updates core agent documents, weak activation boundaries increase the chance of unintended execution and unauthorized workspace modification.

Missing User Warnings

Medium
Confidence
95% confidence
Finding
This section describes creating directories and files and later updating high-value files like SOUL.md, AGENTS.md, and MEMORY.md, but the quick reference does not clearly warn that the skill performs persistent write operations. Users may invoke it assuming it is informational or low-risk, which makes accidental modification of agent behavior and stored state more likely.

Vague Triggers

Medium
Confidence
93% confidence
Finding
The documented trigger phrases are broad natural-language commands such as '启用 {skill-name}' ("enable {skill-name}") and '设置 {skill-name} 技能' ("set up the {skill-name} skill"), which can overlap with ordinary user requests and cause unintended invocation. In a meta-skill that performs filesystem and core-document modifications, accidental activation can lead to unauthorized or surprising changes to SOUL.md, AGENTS.md, MEMORY.md, or generated config files.

Missing User Warnings

Medium
Confidence
89% confidence
Finding
The README advertises automatic creation of directories, configuration files, and updates to core files, but it does not clearly warn users that these actions may alter persistent agent behavior and stored data. Because this is a setup meta-skill with write access to high-sensitivity files like SOUL.md and MEMORY.md, the lack of explicit change-impact warnings increases the risk of users authorizing destructive or hard-to-reverse modifications without informed consent.

Vague Triggers

Medium
Confidence
91% confidence
Finding
The example invocations use very broad natural-language phrases such as creating setup flows or initializing configuration for arbitrary skills. In a trigger-based agent environment, generic everyday wording increases the chance of accidental activation during normal conversation, which could cause unintended file creation or modification of core files like SOUL.md, AGENTS.md, and MEMORY.md.

Vague Triggers

Medium
Confidence
95% confidence
Finding
The trigger phrases are broad, natural-language requests such as '设置 X 技能' ("set up the X skill") and '启用 X 技能' ("enable the X skill"), which can overlap with ordinary conversation and unintentionally activate the skill. Because this skill then proceeds toward filesystem and core-document modification steps, accidental invocation can lead to unintended environment changes rather than a harmless response.

Vague Triggers

Medium
Confidence
92% confidence
Finding
The statement that 'for any skill' the same setup pattern should be followed creates unclear activation and applicability boundaries. In context, this is risky because different skills may have very different trust levels, setup requirements, or side effects, yet the document encourages uniform processing and modification without defining exclusions, safety checks, or unsupported cases.

Missing User Warnings

Medium
Confidence
97% confidence
Finding
The skill explicitly includes creating directories, creating configuration/state files, and updating core files like SOUL.md and AGENTS.md, but it does not require explicit user warning or confirmation before making those changes. This is more dangerous in context because the skill is framed as a standardized helper, which may normalize broad writes to persistent user state and agent behavior files without informed consent.

Missing User Warnings

Medium
Confidence
96% confidence
Finding
The skill explicitly instructs the agent to persist lessons, corrections, and memory entries immediately, but it does not require clear user notice or consent before storing potentially sensitive user-provided information. This creates a privacy risk because users may disclose preferences, corrections, or other personal data during normal interaction without understanding that it will be retained across sessions.

Natural-Language Policy Violations

Medium
Confidence
84% confidence
Finding
Setting Chinese as the default language without an explicit user choice is a consent and preference issue, especially in a system that also persists user preferences. While not a direct security exploit by itself, it can cause unwanted behavior and incorrect persistence of user settings that the user did not affirmatively select.

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The template directs the agent to create directories and update files under ~/skills and core files like SOUL.md, AGENTS.md, and MEMORY.md without requiring an explicit user confirmation or warning that local files will be modified. In an agent setting, this can lead to unintended persistence or overwrite of user data and configuration, especially because the template is generic and applicable to any skill.

Missing User Warnings

Medium
Confidence
95% confidence
Finding
The automation script example performs mkdir and file writes directly, which normalizes unattended filesystem changes without surfacing a warning, dry run, or confirmation step. If adapted by an agent or user as-is, it could create or overwrite files in the user's environment and make persistent changes beyond the immediate task.
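A dry-run step of the kind this finding asks for might be sketched as follows. The skill's actual script is not reproduced in this report, so the function names, the `config.md` file name, and the placeholder contents are all assumptions.

```python
from pathlib import Path
from typing import List


def plan_setup(root: Path, skill_name: str) -> List[str]:
    """List the filesystem actions setup would take, without performing any of them."""
    skill_dir = root / skill_name
    config = skill_dir / "config.md"  # hypothetical file name
    actions = []
    if not skill_dir.exists():
        actions.append(f"mkdir {skill_dir}")
    if config.exists():
        actions.append(f"skip existing {config}")
    else:
        actions.append(f"create {config}")
    return actions


def apply_setup(root: Path, skill_name: str, confirmed: bool = False) -> List[str]:
    """Return the plan; touch the filesystem only when confirmed is True."""
    actions = plan_setup(root, skill_name)
    if not confirmed:
        return actions  # dry run: nothing is written
    skill_dir = root / skill_name
    config = skill_dir / "config.md"
    skill_dir.mkdir(parents=True, exist_ok=True)
    if not config.exists():  # never overwrite an existing file
        config.write_text("# setup placeholder\n", encoding="utf-8")
    return actions
```

Calling `apply_setup(root, name)` with the default `confirmed=False` returns the plan for the user to read; the same call with `confirmed=True` carries it out and still refuses to overwrite an existing file.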

Vague Triggers

Medium
Confidence
90% confidence
Finding
The trigger set is broad enough to match ordinary user requests about enabling or configuring a skill, which can cause this meta-skill to activate unintentionally and take over flows the user did not explicitly intend. In a workflow skill that can create files, update core files, and automate setup, accidental invocation increases the risk of unauthorized or confusing changes across other installed skills.

Ssd 3

Medium
Confidence
97% confidence
Finding
The document establishes an open-ended retention mechanism for corrections, preferences, and learned rules, but it does not define scope limits, sensitivity restrictions, or retention boundaries. That makes it easy for the system to accumulate personal, confidential, or task-specific data in natural-language logs that persist beyond the immediate purpose of collection.
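To make "scope limits, sensitivity restrictions, or retention boundaries" concrete, here is a minimal sketch of a memory store that applies all three. Everything in it is hypothetical: the keyword pattern is only illustrative and far from a complete sensitivity check, and the 30-day window is an arbitrary assumption.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Illustrative pattern only; a real filter would need a much broader check.
SENSITIVE = re.compile(r"(password|api[_-]?key|token|secret)", re.IGNORECASE)
MAX_AGE = timedelta(days=30)  # assumed retention window


@dataclass
class MemoryEntry:
    text: str
    scope: str  # e.g. a project or skill name
    created: datetime


def admit(text: str, scope: str, now: datetime) -> Optional[MemoryEntry]:
    """Sensitivity restriction: refuse to store text that looks like a secret."""
    if SENSITIVE.search(text):
        return None
    return MemoryEntry(text, scope, now)


def recall(entries: List[MemoryEntry], scope: str, now: datetime) -> List[MemoryEntry]:
    """Scope limit and retention boundary: return only unexpired entries from the same scope."""
    return [e for e in entries if e.scope == scope and now - e.created <= MAX_AGE]
```

Filtering by `scope` at recall time also bears on cross-task reuse, since an entry recorded for one project is never returned for another.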

Ssd 3

Medium
Confidence
96% confidence
Finding
The workflow instructs the agent to both store user corrections persistently and later search accumulated memory for future tasks, increasing the chance of cross-task disclosure or reuse of user-provided data outside its original context. This is dangerous because benign task inputs can become long-lived memory and then influence unrelated interactions or be surfaced back to the user unexpectedly.

Ssd 3

Medium
Confidence
94% confidence
Finding
The quick-reference commands encourage the system to remember, enumerate, and disclose learned user patterns, which normalizes broad retention and later exposure of user-provided information. Even absent malicious intent, this can reveal behavioral profiles, preferences, or sensitive corrections that users did not expect to be summarized or replayed.

Persistent Context Injection

Medium
Category
Memory Poisoning
Content
**How it works:**
1. User says "不对,应该是..." ("No, it should be...") → record to corrections.md
2. Pattern repeats 3 times → ask to promote to permanent rule
3. Before tasks → load memory.md + relevant project/domain files
4. After corrections → write lessons immediately
Confidence
91% confidence
Finding
permanent rule
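The promotion step in the quoted workflow (a pattern repeats 3 times, then the agent asks to promote it) can be sketched so that nothing becomes permanent without an explicit yes. The class and method names below are hypothetical; only the threshold of 3 comes from the quoted content.

```python
from collections import Counter

PROMOTION_THRESHOLD = 3  # from the quoted workflow: "Pattern repeats 3 times"


class CorrectionTracker:
    """Counts repeated corrections and promotes one only after user approval."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()
        self.rules: set = set()

    @staticmethod
    def _key(correction: str) -> str:
        # Normalise so trivially different spellings count as the same pattern.
        return correction.strip().lower()

    def record(self, correction: str) -> bool:
        """Record a correction; return True when it should be offered for promotion."""
        key = self._key(correction)
        self.counts[key] += 1
        return self.counts[key] >= PROMOTION_THRESHOLD and key not in self.rules

    def promote(self, correction: str, user_approved: bool) -> bool:
        """Make the correction a permanent rule only if the user approved it."""
        key = self._key(correction)
        if not user_approved or self.counts[key] < PROMOTION_THRESHOLD:
            return False
        self.rules.add(key)
        return True
```

Keeping `record` and `promote` separate means that reaching the threshold only produces a question for the user, never a rule.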

VirusTotal

63/63 vendors flagged this skill as clean.
