Self Improving.Bak

Security checks across malware telemetry and agentic risk

Overview

This skill is mostly coherent, but it creates persistent agent memory and steers the workspace, and its handling of consent, deletion, and provenance is ambiguous enough that users should review it before installing.

Install only if you want the agent to keep cross-session memory about your corrections, preferences, and workflows. Before setup, review the exact changes to AGENTS.md, SOUL.md, and HEARTBEAT.md, decide whether to skip the Proactivity companion, and confirm where exports are written. Avoid storing sensitive personal, credential, financial, health, location, or third-party details, and periodically inspect or delete ~/self-improving/.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (11)

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill instructs automatic persistence of user corrections and preferences into local files without requiring explicit, upfront user consent or a clear notice at the point of collection. Even though storage is local, these entries can accumulate sensitive behavioral, preference, or personal data over time and create an unexpected privacy risk for users who do not realize the data is being retained.

Vague Triggers

Medium

Confidence: 92%
Finding: The kill switch is triggered by a simple natural-language phrase, "forget everything," which can plausibly appear in ordinary conversation or be injected by untrusted content. In a self-improving agent with persistent memory, this creates a realistic risk of unintended destructive state reset and memory loss without clear user intent validation.

Missing User Warnings

Medium

Confidence: 96%
Finding: The documented wipe procedure performs export and deletion immediately after a trigger phrase, with no warning or confirmation checkpoint. That makes destructive memory erasure easier to trigger accidentally or through prompt injection, and the export-before-wipe step may also surface sensitive stored data at the moment of deletion.
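One possible mitigation for the two kill-switch findings above is a confirmation checkpoint between the trigger phrase and the export-and-delete step. The sketch below is illustrative only and is not taken from the skill: `WipeGuard`, its methods, and the `from_user` flag are hypothetical names, and it assumes the host agent can tell direct user messages apart from untrusted content such as file or web text.

```python
import secrets

WIPE_PHRASE = "forget everything"  # trigger phrase quoted in the finding


class WipeGuard:
    """Hypothetical two-step confirmation before a destructive memory wipe."""

    def __init__(self):
        self._pending_token = None

    def request(self, message, from_user):
        """Return a one-time token if a direct user message asks for a wipe, else None."""
        # Ignore the phrase when it arrives inside untrusted content.
        if not from_user:
            return None
        # Require the whole message to be the phrase, not a substring of ordinary talk.
        if message.strip().lower().rstrip(".!") != WIPE_PHRASE:
            return None
        self._pending_token = secrets.token_hex(3)
        return self._pending_token

    def confirm(self, reply):
        """Return True only if the reply repeats the pending token; the token is single-use."""
        token, self._pending_token = self._pending_token, None
        return token is not None and reply.strip() == token
```

In this design the agent would show the token to the user along with a warning about what will be exported and deleted, and proceed only when `confirm` returns True. A failed confirmation discards the token, so a new request is needed.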

Missing User Warnings

Medium

Confidence: 90%
Finding: The skill explicitly instructs the agent to log corrections immediately and elsewhere describes making preferences permanent, but it does not provide any explicit user warning, consent flow, or retention notice for persistent storage of those learned preferences. In a self-improving agent, silently persisting behavioral and project-specific data can create privacy and profiling risks, especially when the data may span sessions or contexts.

Missing User Warnings

Medium

Confidence: 94%
Finding: The archive flow says old patterns are kept as history and preserved but inactive, which means user preference data continues to be retained even after it is no longer actively used. Without a clear retention policy, deletion option, or user warning, this creates unnecessary long-term storage of behavioral data and increases exposure if the memory store is accessed improperly or reused in unexpected contexts.

Missing User Warnings

Low

Confidence: 96%
Finding: The template directs the agent to create directories and files under the user's home directory on first activation, which is a real side effect on the filesystem without any explicit user-facing consent or warning in the template itself. In the context of a self-improving/proactive agent, this is more concerning because the skill is specifically designed to persist state and may do so automatically, increasing the chance of unexpected writes and privacy or workspace hygiene issues.

Vague Triggers

Medium

Confidence: 85%
Finding: The 'On Correction Received' flow can cause the agent to persist user input based on vague triggers, without a clear boundary for what qualifies as a correction or sufficient consent to store it. In a self-improving memory skill, this ambiguity increases the chance of over-collection, accidental retention of sensitive user data, and persistence of adversarial prompt injections disguised as corrections.

Vague Triggers

Low

Confidence: 78%
Finding: The pattern-matching trigger is underspecified, so the agent may apply learned behaviors too broadly or based on weak matches, then reinforce them through repeated use. In this context, an attacker or accidental user phrasing could poison memory with misleading patterns that later influence unrelated tasks, making prompt-injection persistence and behavioral drift more likely.

Missing User Warnings

Medium

Confidence: 91%
Finding: The skill describes automatic reading, writing, retention, deletion, export, and tiered archival of memory, but does not clearly warn users that their corrections and preferences may be stored over time. That creates a transparency and consent problem, and in a memory-enabled agent it materially raises privacy risk because users may disclose information without realizing it will persist across sessions.

Missing User Warnings

Medium

Confidence: 89%
Finding: The setup instructs the agent to modify existing user configuration files such as AGENTS.md and HEARTBEAT.md, but it does not require checking for existing content, creating backups, showing diffs, or obtaining explicit confirmation before editing. In a skill that encourages self-modification and persistent behavior, silent or loosely constrained edits can unintentionally overwrite custom settings, alter agent behavior across future tasks, and create hard-to-detect persistence in the user's environment.
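The safeguards this finding lists as missing (check existing content, back up, show a diff, confirm) could look roughly like the following. This is a minimal sketch under stated assumptions, not the skill's setup code: `propose_edit`, `apply_edit`, and the `.bak` suffix are hypothetical, and the edit is assumed to be a simple append to a file such as AGENTS.md.

```python
import difflib
import shutil
from pathlib import Path


def _merged(path, addition):
    """Return (old, new) text for appending `addition` to `path` (hypothetical helper)."""
    old = path.read_text(encoding="utf-8") if path.exists() else ""
    if not addition.endswith("\n"):
        addition += "\n"
    # Keep the existing content intact and start the addition on its own line.
    sep = "" if old == "" or old.endswith("\n") else "\n"
    return old, old + sep + addition


def propose_edit(path, addition):
    """Return a unified diff of the proposed change without writing anything."""
    old, new = _merged(path, addition)
    diff = difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile=str(path), tofile=f"{path} (proposed)", lineterm="")
    return "\n".join(diff)


def apply_edit(path, addition, confirmed):
    """Append only after explicit confirmation, keeping a .bak copy of any existing file."""
    if not confirmed:
        return False
    _, new = _merged(path, addition)
    if path.exists():
        shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(new, encoding="utf-8")
    return True
```

A caller would print the output of `propose_edit` for the user, then pass the user's answer as `confirmed`. Nothing is written and no backup is created until that answer is affirmative.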

Ssd 3

Medium

Confidence: 97%
Finding: The automatic logging rules are broad enough to capture natural-language statements such as preferences, repeated corrections, and personal working style, which may include sensitive personal data despite the later boundary statement saying not to store certain categories. Because capture is triggered by linguistic patterns rather than strict data classification and user confirmation, sensitive information can be unintentionally written to persistent memory and later surfaced or exported.
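To illustrate the contrast the finding draws between linguistic triggers and "strict data classification and user confirmation", here is a rough sketch of a gate that could sit in front of the memory write. Everything in it is an assumption for illustration: the category names, the regular expressions, and the `classify` and `should_store` functions are hypothetical, and keyword patterns like these are far from a complete classifier.

```python
import re

# Illustrative patterns only; real classification needs much more care than this.
SENSITIVE_PATTERNS = {
    "credential": re.compile(r"(?i)\b(password|passwd|api[_ -]?key|secret|token)\b"),
    "financial": re.compile(r"\b(?:\d[ -]?){13,16}\b"),  # card-number-like digit runs
    "health": re.compile(r"(?i)\b(diagnos\w*|prescription|medication)\b"),
}


def classify(text):
    """Return the sorted names of sensitive categories whose pattern matches the text."""
    return sorted(name for name, pat in SENSITIVE_PATTERNS.items() if pat.search(text))


def should_store(text, user_confirmed):
    """Allow persistence only for unflagged text that the user has explicitly approved."""
    if classify(text):
        return False  # flagged text is never persisted, even with confirmation
    return user_confirmed
```

Under this sketch, a correction such as "I prefer tabs over spaces" would be stored only after the user approves it, while a message that mentions a password would be refused regardless of approval.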

VirusTotal

64/64 vendors flagged this skill as clean.
