Self Improving 1.2.10

Security checks across malware telemetry and agentic risk

Overview

This skill is a disclosed local memory system for agent self-improvement, with real privacy considerations but no evidence of hidden, destructive, or exfiltrating behavior.

Install only if you want the agent to keep local cross-session memory about corrections, preferences, and work patterns. Review any AGENTS.md, SOUL.md, or HEARTBEAT.md changes before applying them, avoid storing secrets or sensitive personal data, periodically inspect ~/self-improving/, and confirm whether you want an export before using any full-wipe command.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (11)

Description-Behavior Mismatch

Medium
Confidence
95% confidence
Finding
The documented commands implement broad persistent user-memory management, including search, display, export, and deletion of stored data, which materially exceeds the stated self-reflection purpose of the skill. This creates unnecessary data-retention and data-exposure capability, increasing privacy and misuse risk if the agent stores sensitive user content or if these commands are triggered unexpectedly.

Description-Behavior Mismatch

Medium
Confidence
91% confidence
Finding
The weekly maintenance workflow defines autonomous lifecycle management of persistent stored data across HOT/WARM/COLD tiers, archival, compaction, and digest generation. That is a real expansion from reflective assistance into long-term user data stewardship, which increases privacy, compliance, and unintended-retention risks even if the feature is not overtly malicious.

Vague Triggers

Medium
Confidence
89% confidence
Finding
The activation guidance is intentionally broad enough to run before starting work and after many completed tasks, which can cause the skill to engage during ordinary conversations and repeatedly read/write persistent memory. In a self-learning skill, over-triggering increases the chance of logging unnecessary or sensitive user statements, amplifying privacy and prompt-injection risk even if the stated purpose is benign.

Vague Triggers

Medium
Confidence
94% confidence
Finding
The trigger phrase "forget everything" is broad enough that it could appear in normal conversation, quoted text, or hypothetical discussion, causing unintended execution of a destructive memory-wipe workflow. In a self-improving agent that persists memory across sessions, accidental activation can erase state and disrupt behavior or user data handling unexpectedly.
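
One way to narrow this trigger, shown as a minimal sketch and not as the skill's actual behavior: treat the phrase as a command only when it is the entire message, and require an explicit confirmation before anything is erased. The names `is_wipe_command`, `handle_message`, and the `confirm` callback are hypothetical.

```python
import re

# Hypothetical rule: the phrase counts as a command only when it is the
# whole message, so quoted or embedded uses do not match.
_WIPE_COMMAND = re.compile(r"^\s*forget everything\s*[.!]?\s*$", re.IGNORECASE)


def is_wipe_command(message: str) -> bool:
    """Return True only for a standalone 'forget everything' message."""
    return _WIPE_COMMAND.match(message) is not None


def handle_message(message: str, confirm) -> str:
    """Gate the destructive path behind a user confirmation callback."""
    if not is_wipe_command(message):
        return "ignored"
    if not confirm("This will erase all stored memory. Continue?"):
        return "cancelled"
    return "wipe-approved"
```

With this rule, a sentence that merely quotes the phrase is ignored, and a standalone command still does nothing until the user confirms.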

Missing User Warnings

Medium
Confidence
97% confidence
Finding
The kill-switch workflow instructs the agent to export current memory to a file before deletion, which creates a new copy of sensitive accumulated data at the exact moment the user is requesting removal. Without explicit warning, consent, destination controls, retention limits, and access protections, this undermines the deletion request and increases the risk of privacy exposure or unauthorized recovery.
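
The concern above can be illustrated with a small sketch in which export is opt-in: no copy is made unless the user names a destination, and the copy is restricted to the owner. The function `wipe_memory` and its parameters are assumptions for illustration, not part of the skill.

```python
import shutil
from pathlib import Path
from typing import Optional


def wipe_memory(memory_dir: Path, export_to: Optional[Path] = None) -> bool:
    """Delete memory_dir. Copy it first only when a destination was given.

    Returns True if an export was written, False otherwise.
    """
    exported = False
    if export_to is not None:
        # Export happens only on explicit request; export_to must not exist yet.
        shutil.copytree(memory_dir, export_to)
        export_to.chmod(0o700)  # restrict the exported directory to the owner
        exported = True
    shutil.rmtree(memory_dir)
    return exported
```

Calling `wipe_memory(path)` with no destination removes the data without leaving a second copy behind, which matches what a deletion request usually intends.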

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The skill defines automatic learning behaviors such as logging corrections, promoting preferences, and writing to scoped namespaces, but it does not provide any user-facing notice or consent mechanism for storing behavioral data. In a self-improving agent, this increases privacy risk because users may unknowingly reveal persistent preferences, project details, or communication habits that are retained beyond the immediate interaction.

Missing User Warnings

Low
Confidence
88% confidence
Finding
The template instructs the agent to create directories and files in the user's home directory automatically on first activation, without any explicit confirmation, preview, or warning. Even though the commands are simple and not overtly destructive, silent filesystem modification is a risky behavior for an agent skill because it normalizes unprompted persistence and could create privacy, trust, or policy issues in constrained environments.
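
A first-run confirmation could look roughly like the following sketch. It assumes a hypothetical `confirm` callback that asks the user and returns a boolean; the skill itself is not reported to do this.

```python
from pathlib import Path


def ensure_memory_dir(root: Path, confirm) -> bool:
    """Create the memory directory only after the user agrees.

    Returns True if the directory exists when the function returns.
    """
    if root.exists():
        return True
    if not confirm(f"Create {root} to store persistent memory?"):
        return False  # nothing is written without consent
    root.mkdir(parents=True, mode=0o700)  # owner-only permissions
    return True
```

For example, `ensure_memory_dir(Path.home() / "self-improving", ask_user)` would leave the home directory untouched if the hypothetical `ask_user` callback returns False.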

Vague Triggers

Medium
Confidence
88% confidence
Finding
Several trigger phrases are plain-language commands such as "Forget X," "Export memory," and "What do you know about X?" that can plausibly overlap with ordinary conversation. If interpreted automatically, they could cause unintended disclosure, deletion, or retrieval of stored memory without a clear confirmation boundary.

Missing User Warnings

Medium
Confidence
93% confidence
Finding
The file describes persistent writes, exports, and full-memory deletion flows without clear user-facing warning or consent language about privacy consequences. Users may not understand that their content is being durably retained, reorganized, exported, or wiped, which is especially risky for a skill presented as self-improvement rather than memory administration.

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The setup instructs the agent to persistently write self-generated corrections, preferences, and inferred rules into files under `~/self-improving/`, including doing so before final responses. This creates a durable cross-task memory channel without requiring explicit user consent, review gates, or privacy/safety constraints, which can cause sensitive data retention, prompt-injection persistence, and propagation of incorrect or adversarial lessons across future tasks.

Ssd 3

Medium
Confidence
94% confidence
Finding
The documented commands allow displaying memory contents and exporting retained history in plain language, while example files show detailed preferences, project conventions, and correction history. Without stated access restrictions, redaction, or authorization checks, this creates a straightforward path to exposing sensitive stored user context.
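
As a rough illustration of the redaction this finding says is missing, the sketch below masks values that follow a few common secret labels before memory is displayed or exported. The pattern list and the `redact` helper are assumptions; a real filter would need a broader and better-tested set of rules.

```python
import re

# Hypothetical label list; matches "label: value" or "label=value".
_SECRET = re.compile(
    r"(?i)\b(api[_-]?key|token|password|secret)\b(\s*[:=]\s*)(\S+)"
)


def redact(text: str) -> str:
    """Replace the value after a known secret label with a placeholder."""
    return _SECRET.sub(r"\1\2[REDACTED]", text)
```

Ordinary preference entries pass through unchanged, while a line such as `password: hunter2` is shown as `password: [REDACTED]`.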

VirusTotal

All 64 of 64 vendors reported this skill as clean.
