Skill Evolve Pro

Security checks across malware telemetry and agentic risk

Overview

This skill is mostly purpose-aligned, but it can persistently rewrite skill files and send local skill/session data to DeepSeek with weak user controls and an exposed bundled API key.

Review before installing. Use only in a workspace where it is acceptable for skill files, failure traces, and SESSION-STATE-derived data to be processed by DeepSeek. Remove the bundled API key, set your own trusted endpoint, keep backups of target skills, and require a manual diff review before allowing any SKILL.md write.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (18)

Tainted flow: 'req' from os.environ.get (line 161, credential/environment) → urllib.request.urlopen (network output)

Critical

Category: Data Flow
Content: }, method="POST", ) with urllib.request.urlopen(req, timeout=120) as resp: result = json.loads(resp.read().decode("utf-8")) return result["choices"][0]["message"]["content"] except urllib.error.HTTPError as e:
Confidence: 96%
Finding: with urllib.request.urlopen(req, timeout=120) as resp:
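One way to narrow this flow, in line with the advice to set your own trusted endpoint, is to validate the destination before any request is built. The following is a minimal sketch, not the skill's actual code: `check_endpoint` and the host `api.example.internal` are hypothetical names introduced only for illustration.

```python
from typing import AbstractSet
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the endpoint you actually trust.
ALLOWED_HOSTS = frozenset({"api.example.internal"})


def check_endpoint(url: str, allowed_hosts: AbstractSet[str] = ALLOWED_HOSTS) -> str:
    """Return the URL unchanged if it is HTTPS and its host is allowlisted."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError(f"refusing non-HTTPS endpoint: {url!r}")
    if parts.hostname not in allowed_hosts:
        raise ValueError(f"endpoint host is not allowlisted: {parts.hostname!r}")
    return url
```

Calling `check_endpoint(url)` immediately before constructing the request keeps credentials and prompt data from being sent to a host the user never approved.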

Intent-Code Divergence

Medium

Confidence: 87%
Finding: The skill requires user confirmation before writing changes, but it separately introduces automatic parsing of SESSION-STATE.md to harvest failure trajectories without an equivalent consent gate. That creates a data-flow where potentially sensitive conversational state can be collected and repurposed for optimization without explicit user approval, undermining the stated safety model.
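A consent gate equivalent to the write confirmation could look roughly like the sketch below. It assumes a caller-supplied `ask_consent` callback (hypothetical) that shows the prompt to the user and returns their answer; nothing is read from SESSION-STATE.md unless that callback returns `True`.

```python
from pathlib import Path
from typing import Callable, Optional


def read_session_state(path: Path, ask_consent: Callable[[str], bool]) -> Optional[str]:
    """Read the session state file only after explicit user approval."""
    if not path.exists():
        return None
    size = path.stat().st_size
    prompt = f"Allow reading {path.name} ({size} bytes) to collect failure trajectories?"
    if not ask_consent(prompt):
        # Declined: return nothing and leave the file unread.
        return None
    return path.read_text(encoding="utf-8")
```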

Intent-Code Divergence

Medium

Confidence: 82%
Finding: The document claims a protected slow-update region is exempt from Step ⑤ edits, but the marker syntax is malformed and duplicated, making the protected boundary ambiguous. In an automated editing pipeline, unclear protection delimiters can cause unintended modification of supposedly immutable guidance or allow edits to slip into reserved sections.
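To illustrate why unambiguous delimiters matter, here is a small sketch of a parser that refuses to proceed when markers are duplicated or unbalanced. The marker strings `<!-- PROTECTED:BEGIN -->` and `<!-- PROTECTED:END -->` are assumptions made for this example; the skill's real marker syntax is not reproduced here.

```python
from typing import List, Tuple

BEGIN = "<!-- PROTECTED:BEGIN -->"  # hypothetical marker
END = "<!-- PROTECTED:END -->"      # hypothetical marker


class MarkerError(ValueError):
    """Raised when protected-region markers are malformed."""


def protected_ranges(lines: List[str]) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs, or raise on bad markers."""
    ranges: List[Tuple[int, int]] = []
    start = None
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped == BEGIN:
            if start is not None:
                raise MarkerError(f"duplicated BEGIN marker at line {i + 1}")
            start = i
        elif stripped == END:
            if start is None:
                raise MarkerError(f"END marker without BEGIN at line {i + 1}")
            ranges.append((start, i))
            start = None
    if start is not None:
        raise MarkerError("BEGIN marker without matching END")
    return ranges


def edit_allowed(lines: List[str], line_index: int) -> bool:
    """An edit is allowed only if it falls outside every protected range."""
    return not any(lo <= line_index <= hi for lo, hi in protected_ranges(lines))
```

Because `protected_ranges` raises instead of guessing, a malformed document blocks the edit step rather than silently exposing the protected region.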

Intent-Code Divergence

High

Confidence: 99%
Finding: The file claims user configuration is not uploaded, yet it contains a distributable hard-coded API key fallback. Embedding a live secret in source code exposes the credential to anyone with repository or package access and can enable unauthorized API use, billing abuse, or downstream compromise of connected services.
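A safer pattern is to read the key from the environment and fail closed when it is absent, with no embedded default. This is a minimal sketch under stated assumptions: the variable name `DEEPSEEK_API_KEY` and the helper `load_api_key` are hypothetical and may not match the names used in the skill.

```python
import os
from typing import Mapping, Optional


class MissingApiKeyError(RuntimeError):
    """Raised when no user-supplied API key is configured."""


def load_api_key(env: Optional[Mapping[str, str]] = None,
                 var_name: str = "DEEPSEEK_API_KEY") -> str:
    # var_name is a hypothetical variable name chosen for this sketch.
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        # Fail closed: never fall back to a credential shipped in the source.
        raise MissingApiKeyError(f"Set {var_name} to your own API key before running.")
    return key
```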

Intent-Code Divergence

Medium

Confidence: 89%
Finding: The function documentation and surrounding logic claim the gate validates that edits actually improved the skill, but the implementation can still mark edits as accepted when the improvement check is skipped, unavailable, or returns no per-edit failure. That creates a trust-boundary problem: downstream automation may rely on 'passed' as evidence of verified improvement even when no such verification occurred.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The code records API failure as a concern and says it is using a conservative strategy with partial rejection, but it does not actually reject edits on improvement-check failure. This mismatch can let unverified edits pass the gate, undermining the safety assurances of the validation stage and enabling risky or ineffective changes to be promoted.
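A gate that matches its stated conservative strategy would treat a skipped, failed, or inconclusive improvement check as "not accepted". The sketch below shows one way to express that; `GateStatus`, `GateResult`, and `run_gate` are hypothetical names, and the improvement check is modeled as a plain callable rather than the skill's real API call.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class GateStatus(Enum):
    VERIFIED = "verified"      # check ran and confirmed improvement
    REJECTED = "rejected"      # check ran and found no improvement
    UNVERIFIED = "unverified"  # check skipped, unavailable, or inconclusive


@dataclass
class GateResult:
    status: GateStatus
    reason: str

    @property
    def accepted(self) -> bool:
        # Only a verified improvement counts as passing the gate.
        return self.status is GateStatus.VERIFIED


def run_gate(improvement_check: Optional[Callable[[], Optional[bool]]]) -> GateResult:
    if improvement_check is None:
        return GateResult(GateStatus.UNVERIFIED, "improvement check skipped")
    try:
        improved = improvement_check()
    except Exception as exc:  # e.g. an API failure
        return GateResult(GateStatus.UNVERIFIED, f"improvement check failed: {exc}")
    if improved is True:
        return GateResult(GateStatus.VERIFIED, "improvement confirmed")
    if improved is False:
        return GateResult(GateStatus.REJECTED, "no improvement measured")
    return GateResult(GateStatus.UNVERIFIED, "improvement check returned no verdict")
```

Keeping `UNVERIFIED` distinct from `REJECTED` lets downstream automation tell "the check never ran" apart from "the check ran and failed", while both remain unaccepted.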

Vague Triggers

Medium

Confidence: 85%
Finding: The activation phrases are very broad and overlap with normal user language such as 'optimize this skill' or 'let the skill evolve automatically'. In an agent environment, vague triggers can cause unintended invocation of a capability that edits skill files, increasing the chance of unauthorized or accidental self-modification.

Missing User Warnings

Medium

Confidence: 92%
Finding: The README states that the skill will automatically generate patches, apply edits, validate them, and output an evolved version, but it does not clearly warn users that local skill files may be modified. In a self-modifying tool, lack of explicit risk disclosure and consent is dangerous because users may trigger persistent changes without understanding rollback, review, or blast radius.

Vague Triggers

Medium

Confidence: 84%
Finding: The trigger phrase at this line is broad enough to overlap with ordinary user requests, increasing the chance that the self-modifying workflow activates when the user intended a generic improvement request. Because this skill can propose and eventually write changes to skill files, accidental activation raises the risk of unintended code or prompt modification flows.

Vague Triggers

Medium

Confidence: 84%
Finding: This trigger wording is ambiguous and maps to common workflow language, so unrelated user requests could be interpreted as authorization to start the evolution pipeline. In a self-editing skill, ambiguous invocation is dangerous because it can transition the agent into a higher-risk mode with access to internal files and modification logic.

Vague Triggers

Medium

Confidence: 80%
Finding: The phrase on this line lacks specificity and could match normal conversation about running a process, making reliable activation control difficult. Given the skill's purpose is to analyze failures and generate patches, misfires could expose internal state or initiate unintended document changes after minimal prompting.

Missing User Warnings

High

Confidence: 100%
Finding: Using os.environ.get with a hard-coded fallback secret means the application will silently operate with an embedded credential if no user-provided key is set. This is dangerous because it hides secret exposure, encourages unsafe distribution practices, and allows anyone obtaining the code to reuse the credential for unauthorized access and cost incurrence.

Missing User Warnings

Medium

Confidence: 94%
Finding: The code packages task descriptions, failure reasons, predicted answers, and reference text from failed trajectories and sends them to an external LLM API. If those trajectories contain proprietary, personal, or otherwise sensitive data, this creates an unintended data disclosure channel, especially because there is no consent, redaction, classification check, or user-visible warning in the workflow.

Missing User Warnings

Medium

Confidence: 93%
Finding: The code sends portions of skill content, post-edit content, and rollout/failure trajectory data to an external DeepSeek API. If these materials contain proprietary instructions, sensitive operational details, or embedded secrets, this creates a confidentiality risk because data leaves the local trust boundary without explicit notice, redaction, or consent handling.

Missing User Warnings

Medium

Confidence: 91%
Finding: The code constructs prompts that include the full skill document and rollout trajectory data and sends them to an external DeepSeek API. If those inputs contain secrets, proprietary content, personal data, or sensitive operational traces, this causes data exfiltration to a third-party service without any in-code minimization, redaction, or explicit disclosure boundary.
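In-code minimization and redaction could be approximated as in the sketch below, which masks secret-looking strings and truncates each field before a payload is assembled. The regular expressions are illustrative heuristics chosen for this example and will miss many secret formats; `redact` and `build_payload` are hypothetical helpers, not functions from the skill.

```python
import re
from typing import Dict, List

# Heuristic patterns only (assumptions for illustration).
_TOKEN = re.compile(r"sk-[A-Za-z0-9]{16,}")  # API-key-like tokens
_ASSIGNMENT = re.compile(
    r"(?i)\b(api[_-]?key|token|password|secret)\b(\s*[:=]\s*)(\S+)"
)


def redact(text: str) -> str:
    """Mask token-like strings and the values of secret-looking assignments."""
    text = _TOKEN.sub("[REDACTED]", text)
    return _ASSIGNMENT.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)


def build_payload(skill_text: str, trajectories: List[str],
                  max_chars: int = 4000) -> Dict[str, object]:
    """Redact and truncate every field before it leaves the local machine."""
    return {
        "skill": redact(skill_text)[:max_chars],
        "trajectories": [redact(t)[:max_chars] for t in trajectories],
    }
```

Redaction of this kind reduces exposure but does not replace an explicit user-visible disclosure of what is being sent.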

Missing User Warnings

Medium

Confidence: 87%
Finding: The script unconditionally overwrites SKILL.md with LLM-generated content, which can permanently alter a sensitive skill file without human review. Because the content comes from an external model and includes injected guidance, this creates integrity risk and can enable prompt-injection persistence or accidental corruption of the skill definition.
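A write path with human review might look like the following sketch: it renders a unified diff, asks a reviewer to approve it, and keeps a backup before replacing the file. The helper names and the `.bak` suffix are assumptions for illustration; `confirm` is a hypothetical callback that presents the diff and returns the reviewer's decision.

```python
import difflib
import shutil
from pathlib import Path
from typing import Callable


def render_diff(old: str, new: str, name: str = "SKILL.md") -> str:
    """Return a unified diff between the current and proposed text."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"{name} (current)",
        tofile=f"{name} (proposed)",
    )
    return "".join(lines)


def write_with_review(path: Path, new_text: str,
                      confirm: Callable[[str], bool]) -> bool:
    """Write new_text only if it differs and the reviewer approves the diff."""
    old_text = path.read_text(encoding="utf-8") if path.exists() else ""
    diff = render_diff(old_text, new_text, path.name)
    if not diff:
        return False  # nothing to change
    if not confirm(diff):
        return False  # reviewer declined; the file is left untouched
    if path.exists():
        # Keep a copy of the previous version next to the target file.
        shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(new_text, encoding="utf-8")
    return True
```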

Missing User Warnings

Medium

Confidence: 90%
Finding: The code transmits SKILL.md contents, prior guidance, trajectory stats, and edit history to an external API. If SKILL.md or state data contains secrets, proprietary logic, or untrusted prompt content, this causes data exfiltration to a third party and expands the attack surface to external prompt-manipulation effects.

Vague Triggers

Medium

Confidence: 93%
Finding: The trigger list contains broad phrases such as "evolve" and generic requests like "优化这个技能" ("optimize this skill"), which can cause accidental activation in unrelated conversations. In a skill that performs automated evolution of other skills, unintended invocation increases the chance of unauthorized modification, unsafe execution paths, or confusing the agent into self-modifying behavior without clear user consent.
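One possible mitigation is to replace free-text trigger phrases with a single explicit command that also names the target skill. The sketch below assumes a hypothetical `/skill-evolve <target>` command; it is not the skill's actual trigger syntax.

```python
import re
from typing import Optional

# Hypothetical explicit command; the skill's real trigger list is not reproduced here.
_COMMAND = re.compile(r"^/skill-evolve\s+(?P<target>[A-Za-z0-9_.-]+)\s*$")


def parse_trigger(message: str) -> Optional[str]:
    """Return the target skill name for an explicit command, else None."""
    match = _COMMAND.match(message.strip())
    return match.group("target") if match else None
```

With this shape, ordinary requests such as "please optimize this skill" return `None` and never enter the evolution pipeline.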

VirusTotal

VirusTotal findings are pending for this skill version.
