skill-setup-flow

Security checks across malware telemetry and agentic risk

Overview

This setup helper is not malware, but it can make lasting changes to core agent behavior and memory files for other skills without strong scoping or approval controls.

Install only if you want a meta-skill that helps configure other skills and you are comfortable reviewing persistent changes. Before use, require the agent to show every proposed file creation and edit, especially changes to SOUL.md, AGENTS.md, MEMORY.md, and TOOLS.md; avoid using it on untrusted skills; and keep secrets out of markdown config or setup logs.
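
One way to make "show every proposed file creation and edit" concrete is to render each change as a diff before anything is written. The following is a minimal sketch, not part of the skill: `preview_edit` and `CORE_FILES` are hypothetical names, and the set of protected files is simply the four file names listed above.

```python
import difflib

# Hypothetical list of protected files, taken from the names in this report.
CORE_FILES = {"SOUL.md", "AGENTS.md", "MEMORY.md", "TOOLS.md"}


def preview_edit(filename: str, old_text: str, new_text: str) -> str:
    """Return a unified diff of a proposed edit, flagged if it touches a core file."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"{filename} (current)",
        tofile=f"{filename} (proposed)",
    )
    header = "[CORE FILE] " if filename in CORE_FILES else ""
    return header + f"Proposed change to {filename}:\n" + "".join(diff)


if __name__ == "__main__":
    # Show the reviewer exactly what would be appended to SOUL.md.
    print(preview_edit("SOUL.md", "Be concise.\n", "Be concise.\nAlways reply in English.\n"))
```

The diff is only a preview; nothing here writes to disk, so the user can reject the change before it takes effect.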

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Memory Poisoning: Persistent Context Injection, Context Window Stuffing, Memory Manipulation
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (19)

Vague Triggers

Medium

Confidence: 89%
Finding: The trigger examples are broad natural-language phrases such as creating or initializing setup for arbitrary skills, without clear invocation boundaries or disambiguation rules. In an agent environment, this can cause unintended activation during ordinary conversation, leading the agent to create files, modify core documents, or start setup workflows the user did not explicitly request.
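
A common mitigation for broad triggers is to accept only an explicit command form instead of free-form phrases. The sketch below assumes a hypothetical `/skill-setup <name>` syntax; the skill itself does not define such a command, and `parse_invocation` is an illustrative name.

```python
import re

# Hypothetical explicit command syntax; the skill does not define one.
_COMMAND = re.compile(r"/skill-setup\s+([a-z0-9][a-z0-9-]*)")


def parse_invocation(message: str):
    """Return the target skill name only for an exact command, else None."""
    match = _COMMAND.fullmatch(message.strip())
    return match.group(1) if match else None


if __name__ == "__main__":
    print(parse_invocation("/skill-setup weather"))
    print(parse_invocation("can you enable the weather skill?"))
```

Because `fullmatch` must consume the whole message, ordinary conversation that merely mentions enabling or configuring a skill does not activate the workflow.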

Vague Triggers

Low

Confidence: 80%
Finding: Claiming the workflow can create setup for 'any installed skill' without stating applicability limits or safety constraints encourages overbroad use. This can cause the agent to apply the same setup pattern to incompatible or sensitive skills, resulting in incorrect configuration, unwanted file changes, or propagation of unsafe assumptions across skills.

Vague Triggers

Medium

Confidence: 91%
Finding: The quick-start phrases are broad, generic natural-language triggers such as '初始化 {skill-name} 配置' ("initialize {skill-name} configuration") and '启用 {skill-name}' ("enable {skill-name}"), which can be matched during ordinary conversation or administrative discussion. In a skill that creates files and updates core agent documents, weak activation boundaries increase the chance of unintended execution and unauthorized workspace modification.

Missing User Warnings

Medium

Confidence: 95%
Finding: This section describes creating directories and files and later updating high-value files like SOUL.md, AGENTS.md, and MEMORY.md, but the quick reference does not clearly warn that the skill performs persistent write operations. Users may invoke it assuming it is informational or low-risk, which makes accidental modification of agent behavior and stored state more likely.

Vague Triggers

Medium

Confidence: 93%
Finding: The documented trigger phrases are broad natural-language commands such as '启用 {skill-name}' ("enable {skill-name}") and '设置 {skill-name} 技能' ("set up the {skill-name} skill"), which can overlap with ordinary user requests and cause unintended invocation. In a meta-skill that performs filesystem and core-document modifications, accidental activation can lead to unauthorized or surprising changes to SOUL.md, AGENTS.md, MEMORY.md, or generated config files.

Missing User Warnings

Medium

Confidence: 89%
Finding: The README advertises automatic creation of directories, configuration files, and updates to core files, but it does not clearly warn users that these actions may alter persistent agent behavior and stored data. Because this is a setup meta-skill with write access to high-sensitivity files like SOUL.md and MEMORY.md, the lack of explicit change-impact warnings increases the risk of users authorizing destructive or hard-to-reverse modifications without informed consent.

Vague Triggers

Medium

Confidence: 91%
Finding: The example invocations use very broad natural-language phrases such as creating setup flows or initializing configuration for arbitrary skills. In a trigger-based agent environment, generic everyday wording increases the chance of accidental activation during normal conversation, which could cause unintended file creation or modification of core files like SOUL.md, AGENTS.md, and MEMORY.md.

Vague Triggers

Medium

Confidence: 95%
Finding: The trigger phrases are broad, natural-language requests such as '设置 X 技能' ("set up the X skill") and '启用 X 技能' ("enable the X skill"), which can overlap with ordinary conversation and unintentionally activate the skill. Because this skill then proceeds toward filesystem and core-document modification steps, accidental invocation can lead to unintended environment changes rather than a harmless response.

Vague Triggers

Medium

Confidence: 92%
Finding: The statement that 'for any skill' the same setup pattern should be followed creates unclear activation and applicability boundaries. In context, this is risky because different skills may have very different trust levels, setup requirements, or side effects, yet the document encourages uniform processing and modification without defining exclusions, safety checks, or unsupported cases.

Missing User Warnings

Medium

Confidence: 97%
Finding: The skill explicitly includes creating directories, creating configuration/state files, and updating core files like SOUL.md and AGENTS.md, but it does not require explicit user warning or confirmation before making those changes. This is more dangerous in context because the skill is framed as a standardized helper, which may normalize broad writes to persistent user state and agent behavior files without informed consent.

Missing User Warnings

Medium

Confidence: 96%
Finding: The skill explicitly instructs the agent to persist lessons, corrections, and memory entries immediately, but it does not require clear user notice or consent before storing potentially sensitive user-provided information. This creates a privacy risk because users may disclose preferences, corrections, or other personal data during normal interaction without understanding that it will be retained across sessions.

Natural-Language Policy Violations

Medium

Confidence: 84%
Finding: Setting Chinese as the default language without an explicit user choice is a consent and preference issue, especially in a system that also persists user preferences. While not a direct security exploit by itself, it can cause unwanted behavior and incorrect persistence of user settings that the user did not affirmatively select.

Missing User Warnings

Medium

Confidence: 92%
Finding: The template directs the agent to create directories and update files under ~/skills and core files like SOUL.md, AGENTS.md, and MEMORY.md without requiring an explicit user confirmation or warning that local files will be modified. In an agent setting, this can lead to unintended persistence or overwrite of user data and configuration, especially because the template is generic and applicable to any skill.

Missing User Warnings

Medium

Confidence: 95%
Finding: The automation script example performs mkdir and file writes directly, which normalizes unattended filesystem changes without surfacing a warning, dry run, or confirmation step. If adapted by an agent or user as-is, it could create or overwrite files in the user's environment and make persistent changes beyond the immediate task.
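
The missing dry-run and confirmation step can be sketched by separating planning from execution. This is an illustrative outline under stated assumptions, not the skill's actual script: `plan_setup`, `apply_plan`, and the `config.md` file name are hypothetical.

```python
from pathlib import Path


def plan_setup(root: Path, skill_name: str) -> list[tuple[str, Path]]:
    """List the filesystem actions a setup would take, without performing them."""
    skill_dir = root / skill_name
    # "config.md" is a placeholder name chosen for this sketch.
    return [("mkdir", skill_dir), ("write", skill_dir / "config.md")]


def apply_plan(plan, confirmed: bool, content: str = "") -> list[str]:
    """Execute the plan only when confirmed; otherwise report what would happen."""
    log = []
    for action, path in plan:
        if not confirmed:
            log.append(f"DRY RUN: would {action} {path}")
            continue
        if action == "mkdir":
            path.mkdir(parents=True, exist_ok=True)
        elif action == "write":
            # Refuse to overwrite: existing files need a separate, explicit decision.
            if path.exists():
                raise FileExistsError(f"refusing to overwrite {path}")
            path.write_text(content, encoding="utf-8")
        log.append(f"{action} {path}")
    return log
```

Calling `apply_plan(plan, confirmed=False)` first gives the user a list of intended changes; the same plan is then applied only after they approve it.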

Vague Triggers

Medium

Confidence: 90%
Finding: The trigger set is broad enough to match ordinary user requests about enabling or configuring a skill, which can cause this meta-skill to activate unintentionally and take over flows the user did not explicitly intend. In a workflow skill that can create files, update core files, and automate setup, accidental invocation increases the risk of unauthorized or confusing changes across other installed skills.

Ssd 3

Medium

Confidence: 97%
Finding: The document establishes an open-ended retention mechanism for corrections, preferences, and learned rules, but it does not define scope limits, sensitivity restrictions, or retention boundaries. That makes it easy for the system to accumulate personal, confidential, or task-specific data in natural-language logs that persist beyond the immediate purpose of collection.

Ssd 3

Medium

Confidence: 96%
Finding: The workflow instructs the agent to both store user corrections persistently and later search accumulated memory for future tasks, increasing the chance of cross-task disclosure or reuse of user-provided data outside its original context. This is dangerous because benign task inputs can become long-lived memory and then influence unrelated interactions or be surfaced back to the user unexpectedly.
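
The two retention findings above point to three missing controls: a sensitivity filter, a retention limit, and scoping at read time. A minimal sketch of how these might fit together follows; the names (`MemoryEntry`, `make_entry`, `visible_entries`), the 30-day default, and the regular expression are all assumptions made for illustration, and the pattern is far too narrow for real secret detection.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta

# Illustrative pattern only; real secret detection needs much broader coverage.
_SENSITIVE = re.compile(r"(?i)(password|api[_-]?key|secret|token)\s*[:=]")


@dataclass
class MemoryEntry:
    text: str
    scope: str            # the skill or project the entry belongs to
    expires_at: datetime  # entries are ignored after this time


def make_entry(text: str, scope: str, now: datetime, ttl_days: int = 30):
    """Create an entry with an expiry, or return None if it looks sensitive."""
    if _SENSITIVE.search(text):
        return None
    return MemoryEntry(text, scope, now + timedelta(days=ttl_days))


def visible_entries(entries, scope: str, now: datetime):
    """Return only unexpired entries recorded for the requesting scope."""
    return [e for e in entries if e.scope == scope and e.expires_at > now]
```

Filtering by scope at read time is what limits cross-task reuse: an entry recorded for one skill is never loaded while working on another.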

Ssd 3

Medium

Confidence: 94%
Finding: The quick-reference commands encourage the system to remember, enumerate, and disclose learned user patterns, which normalizes broad retention and later exposure of user-provided information. Even absent malicious intent, this can reveal behavioral profiles, preferences, or sensitive corrections that users did not expect to be summarized or replayed.

Persistent Context Injection

Medium

Category: Memory Poisoning
Content: **How it works:**
1. User says "不对，应该是..." ("No, it should be...") → record to corrections.md
2. Pattern repeats 3 times → ask to promote to permanent rule
3. Before tasks → load memory.md + relevant project/domain files
4. After corrections → write lessons immediately
Confidence: 91%
Finding: permanent rule
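
The quoted workflow promotes a repeated correction to a permanent rule after three occurrences. A guarded version of that step might look like the sketch below, which is an assumption-laden illustration and not the skill's code: `CorrectionLog` is a hypothetical class, corrections are matched by a simple lowercase comparison, and promotion always requires explicit user approval.

```python
from collections import Counter

PROMOTION_THRESHOLD = 3  # from the quoted "Pattern repeats 3 times"


class CorrectionLog:
    def __init__(self):
        self._counts = Counter()
        self.rules = []  # permanent rules, added only with user approval

    def record(self, correction: str) -> bool:
        """Count a correction; return True when the user should be asked to promote it."""
        key = correction.strip().lower()
        self._counts[key] += 1
        return self._counts[key] >= PROMOTION_THRESHOLD and key not in self.rules

    def promote(self, correction: str, user_approved: bool) -> bool:
        """Promote to a permanent rule only if approved and the threshold was reached."""
        key = correction.strip().lower()
        if not user_approved or self._counts[key] < PROMOTION_THRESHOLD:
            return False
        if key not in self.rules:
            self.rules.append(key)
        return True
```

Keeping the approval flag as a required argument means a repeated phrase alone, including one injected by untrusted content, cannot become a permanent rule.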

VirusTotal

63/63 vendors flagged this skill as clean.
