Self Improving Agent

Security checks across malware telemetry and agentic risk

Overview

This skill appears purpose-aligned rather than malicious, but it automatically stores learning data and can execute workspace hook code with too little user control.

Install only if you intentionally want persistent self-improvement memory and trust the workspace hooks directory. Before enabling automatic mode, review all hook files, avoid using it on sensitive sessions unless you add redaction, keep auto-apply disabled where possible, and inspect or delete learning/export files before sharing or committing them.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (24)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 87%
Finding: The skill describes persistent learning files and export behavior, implying file read/write capabilities, but it does not declare permissions or clearly scope those capabilities. This creates a transparency and consent problem: users may install a skill that can retain and write conversation-derived data without an explicit permission boundary.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The code loads every Python file from a workspace-controlled hooks directory and executes it via importlib, which runs module top-level code immediately and later invokes exported functions. If an attacker can place or modify files in that directory, they gain arbitrary code execution in the agent's trust boundary with no signature checks, allowlist, sandboxing, or consent step. In a self-improving agent context, this is especially dangerous because loading untrusted "learning" hooks may appear expected and therefore evade scrutiny.
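
One way to narrow this exposure is to load only hook files that match a reviewed allowlist. The sketch below is illustrative and is not the skill's actual loader: the function name `load_approved_hooks` and the allowlist format (file name mapped to SHA-256 digest) are assumptions. It also swaps the importlib-based loading for `compile` and `exec`, so the bytes that were hashed are exactly the bytes that run. It is not a sandbox; approved hooks still execute with full privileges.

```python
import hashlib
import types
from pathlib import Path


def load_approved_hooks(hooks_dir: Path, approved: dict) -> dict:
    """Execute only hook files whose name and SHA-256 digest are allowlisted.

    `approved` maps file name -> expected hex digest (hypothetical format).
    """
    loaded = {}
    for path in sorted(hooks_dir.glob("*.py")):
        expected = approved.get(path.name)
        if expected is None:
            continue  # not on the allowlist: never read or executed
        source = path.read_bytes()
        if hashlib.sha256(source).hexdigest() != expected:
            continue  # contents changed since the file was approved
        module = types.ModuleType(path.stem)
        module.__file__ = str(path)
        # Run the same bytes that were hashed, rather than re-reading from disk.
        exec(compile(source, str(path), "exec"), module.__dict__)
        loaded[path.name] = module
    return loaded
```

A file that is missing from the allowlist, or whose contents no longer match the recorded digest, is skipped before any of its top-level code can run.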

Missing User Warnings

Medium

Confidence: 88%
Finding: The README explicitly advertises 'automatic improvement' and later shows configuration such as AUTO_APPLY=true without warning that learned changes or hooks may alter agent behavior, outputs, or workspace state. In a self-modifying agent skill, this can mislead users into enabling behavior-changing automation without understanding the safety, review, or rollback implications.

Missing User Warnings

Medium

Confidence: 91%
Finding: The README states that learning happens automatically after sessions, errors, and recoveries, but it does not clearly warn users that potentially sensitive interaction and error data will be persisted to local files. This creates a privacy and data-handling risk because users may unknowingly enable retention of prompts, preferences, stack traces, or other sensitive content.

Missing User Warnings

Medium

Confidence: 94%
Finding: The README explicitly says session learning derives conversation patterns and user preferences, but it does not present a user-facing warning or consent mechanism for collecting and storing that information. User preferences and conversational history can contain personal or sensitive information, so silently retaining them increases privacy and compliance risk.

Vague Triggers

Medium

Confidence: 91%
Finding: The installation and usage text says auto-learning is enabled by default and occurs after each session, but it does not define precise trigger conditions, scope, or exclusions. Broad automatic activation increases the chance of processing sensitive sessions unexpectedly and makes user consent ambiguous.
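
Precise trigger conditions could be expressed as an explicit policy rather than a default-on switch. The following is a minimal sketch under stated assumptions: `LearningPolicy`, its field names, the event name `session_end`, and the tag names are all hypothetical and do not come from the skill's configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LearningPolicy:
    """Hypothetical policy; field names are illustrative only."""
    enabled: bool = False  # opt-in instead of enabled by default
    allowed_events: frozenset = frozenset({"session_end"})
    excluded_tags: frozenset = frozenset({"sensitive", "credentials"})


def should_learn(policy: LearningPolicy, event: str, session_tags) -> bool:
    """Return True only when the event is in scope and no exclusion applies."""
    if not policy.enabled:
        return False
    if event not in policy.allowed_events:
        return False
    # Any overlap with an excluded tag blocks learning for this session.
    return not (set(session_tags) & policy.excluded_tags)
```

With a policy like this, learning stays off until the user enables it, and a session tagged as sensitive is excluded even when learning is on.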

Vague Triggers

Medium

Confidence: 90%
Finding: The 'Automatic Mode' section instructs users to operate normally while learning happens automatically, which is vague and functionally broad. In a skill that stores interaction-derived data, unclear invocation language can lead to silent collection and processing beyond what users reasonably expect.

Missing User Warnings

High

Confidence: 96%
Finding: The skill advertises automatic learning from conversations, errors, and recoveries, but does not prominently warn users that conversation patterns, preferences, and error context may be stored persistently. Because error traces and session history often contain sensitive or secret data, omission of this warning materially increases privacy and data-handling risk.

Missing User Warnings

Medium

Confidence: 94%
Finding: The guide promotes fully automatic learning, automatic hook application, and persistent storage of derived learnings without emphasizing human review, consent, or change controls. In a self-modifying agent context, this can normalize unsafe state changes and persistent retention of potentially sensitive or low-quality data, increasing the chance of unintended behavior or disclosure.

Missing User Warnings

Medium

Confidence: 97%
Finding: The team-sharing example encourages emailing exported learnings derived from prior sessions, but it provides no warning that those exports may contain sensitive user prompts, operational details, secrets, or internal context. This creates a realistic path for accidental exfiltration of conversation-derived data to unintended recipients.

Missing User Warnings

Medium

Confidence: 93%
Finding: The guide promotes continuous learning from interactions, storage of learnings, and export/sharing of outputs, but provides no warning that session logs may contain sensitive prompts, credentials, personal data, or proprietary information. In a self-improving agent context, this omission can lead users to persist and redistribute sensitive session-derived data without informed consent or safeguards.

Missing User Warnings

Medium

Confidence: 95%
Finding: The documentation encourages fully automated application of improvements and hooks without warning that learned behaviors or hooks can alter runtime behavior and execute code-like changes automatically. In a self-modifying or extensible agent system, auto-apply creates a clear integrity and safety risk because flawed, poisoned, or unsafe learnings may be applied without human review.
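
A human review step could sit between proposing a learning and applying it. The sketch below is an assumption about how such a gate might look, not a description of the skill's behavior with AUTO_APPLY; the `Learning` and `ReviewQueue` names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Learning:
    summary: str
    approved: bool = False


@dataclass
class ReviewQueue:
    """Holds proposed learnings until a person approves them."""
    pending: list = field(default_factory=list)
    applied: list = field(default_factory=list)

    def propose(self, summary: str) -> Learning:
        # Proposals are only queued; nothing changes agent behavior yet.
        item = Learning(summary)
        self.pending.append(item)
        return item

    def approve(self, item: Learning) -> None:
        # Raises ValueError if the item was never proposed or is already applied.
        self.pending.remove(item)
        item.approved = True
        self.applied.append(item)
```

Nothing reaches `applied` without an explicit `approve` call, which gives reviewers a point at which to reject flawed or poisoned learnings.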

Missing User Warnings

Medium

Confidence: 91%
Finding: The hook persists raw error messages and extracted context to local JSON files without any consent, redaction, retention controls, or visibility to the user. Error strings and tracebacks often contain secrets, prompts, filesystem paths, tokens, or sensitive business data, so writing them to disk can create a durable local disclosure surface that other processes or users may access.
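
Redacting error text before it is written would reduce what ends up on disk. The sketch below is illustrative only: the patterns are a small assumed sample rather than a complete secret detector, and `redact` and `error_record` are hypothetical names, not functions from the skill.

```python
import json
import re

# Illustrative patterns only; real deployments need a broader, tested set.
REDACTIONS = [
    (
        re.compile(r"\b(api[_-]?key|token|password|secret)(\s*[=:]\s*)\S+", re.IGNORECASE),
        r"\1\2[REDACTED]",
    ),
    (
        re.compile(r"\bbearer\s+[A-Za-z0-9._~+/=-]+", re.IGNORECASE),
        "Bearer [REDACTED]",
    ),
]


def redact(text: str) -> str:
    """Replace values that look like credentials with a placeholder."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text


def error_record(message: str) -> str:
    """Build the JSON string that would be persisted, after redaction."""
    return json.dumps({"error": redact(message)})
```

Pattern-based redaction misses secrets that do not follow a recognizable `key=value` or bearer-token shape, so it complements consent and retention controls rather than replacing them.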

Missing User Warnings

Medium

Confidence: 91%
Finding: The hook persists performance metrics and associated context to disk without any apparent minimization, consent, retention limit, or sensitivity filtering. The captured context includes fields like operation, timestamp, and duration, and the input metric is externally supplied, so this can create an unintended audit trail of potentially sensitive operational data that may later be exposed through logs, backups, or local compromise.
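
Minimization and a retention limit could both be applied before metrics are written. In the sketch below, the three field names come from the finding, while the function names, the ISO 8601 timestamp format, and the 30-day default are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

# Field names taken from the finding; anything else is dropped.
ALLOWED_FIELDS = ("operation", "timestamp", "duration")


def minimize(metric: dict) -> dict:
    """Keep only the allowlisted fields of an externally supplied metric."""
    return {key: metric[key] for key in ALLOWED_FIELDS if key in metric}


def prune(records: list, now: datetime, max_age: timedelta = timedelta(days=30)) -> list:
    """Drop records older than max_age.

    Assumes each record carries an ISO 8601 timestamp with a UTC offset
    and that `now` is timezone-aware.
    """
    cutoff = now - max_age
    return [r for r in records if datetime.fromisoformat(r["timestamp"]) >= cutoff]
```

Running `minimize` on write and `prune` on a schedule bounds both what is stored and how long it persists.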

Missing User Warnings

Medium

Confidence: 93%
Finding: The hook automatically writes session-derived metadata to disk with no consent flow, notice, retention control, or minimization. Even though the current fields are limited, they are still persistent behavioral metadata, and the design can easily expand to include more sensitive session content over time.

Missing User Warnings

High

Confidence: 96%
Finding: The module automatically imports hook files found in the workspace without any user-facing warning, confirmation, or opt-in, so code execution can occur merely by placing a .py file in the hooks directory. This removes an important friction/control point and makes accidental or malicious execution far more likely, especially in environments where the workspace may be synced, shared, or modified by other tools.
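
An explicit opt-in would restore the missing control point. The sketch below assumes a hypothetical environment variable, `SELF_IMPROVE_ENABLE_HOOKS`, which is not part of the skill; it only illustrates a default-off gate that a loader could check before importing anything.

```python
import os


def hooks_enabled(environ=None) -> bool:
    """Return True only when the user has explicitly opted in.

    SELF_IMPROVE_ENABLE_HOOKS is a hypothetical variable name.
    """
    env = os.environ if environ is None else environ
    value = env.get("SELF_IMPROVE_ENABLE_HOOKS", "")
    return value.strip().lower() in {"1", "true", "yes"}
```

With the variable unset, dropping a .py file into the hooks directory would have no effect until the user deliberately turns hook loading on.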

Ssd 3

Medium

Confidence: 88%
Finding: Advertising continuous learning from every interaction and automatic improvement implies broad capture and retention of user-provided content. In an agent skill context, this is more dangerous because conversations may include secrets, tokens, internal documents, or operational details that should not be retained by default.

Ssd 3

Medium

Confidence: 93%
Finding: The session learning section says the system learns from conversation patterns and user preferences and stores learnings in files, which indicates persistent retention of behavioral and possibly sensitive user data. Without clear minimization and consent boundaries, this can expose private information through local files, backups, logs, or later exfiltration by other components.

Ssd 3

Medium

Confidence: 94%
Finding: The feature list promotes continuous learning, memory storage, and auto-improvement from interactions and recoveries, which indicates automatic retention of user-derived information. In context, this is dangerous because retained data may include sensitive conversational content and operational artifacts, and the export feature further broadens exposure.

Ssd 3

Medium

Confidence: 97%
Finding: This section explicitly states that session learning stores conversation patterns and user preferences in persistent files. Persistent storage of behavior and preference data creates privacy risk, enables profiling, and may capture sensitive information if the session content is not filtered or minimized.

Ssd 3

Medium

Confidence: 93%
Finding: The export command writes all stored learnings to a markdown file, increasing the number of copies and the chance of accidental disclosure through filesystem access, sharing, or version control. Export amplifies the impact of any sensitive data already retained by making it easier to exfiltrate or mishandle.
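
Before an export is shared or committed, it could be scanned for lines that look like they contain credentials. This is a rough sketch, not a feature of the skill: `flag_lines` is a hypothetical helper, and the keyword list is a small assumed sample that will produce both false positives and false negatives.

```python
import re

# Small illustrative keyword set; tune it for the data actually stored.
SUSPICIOUS = re.compile(
    r"api[_-]?key|token|password|secret|BEGIN [A-Z ]*PRIVATE KEY",
    re.IGNORECASE,
)


def flag_lines(markdown_text: str) -> list:
    """Return 1-based line numbers in an export that merit manual review."""
    return [
        number
        for number, line in enumerate(markdown_text.splitlines(), start=1)
        if SUSPICIOUS.search(line)
    ]
```

An empty result does not prove the export is safe; it only means none of the sample keywords appeared, so manual inspection is still advisable.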

Ssd 3

Medium

Confidence: 95%
Finding: The documentation describes analyzing session logs, extracting learned content, storing it persistently, and exporting it to shareable files, which creates a clear data-flow from conversations into durable artifacts. If prompts or logs contain credentials, personal data, or proprietary information, the system can unintentionally preserve and later expose that material.

Ssd 3

Medium

Confidence: 98%
Finding: The example explicitly recommends emailing exported learnings to a team, turning stored conversation-derived memory into a transferable artifact. In the context of a self-improving agent that learns from interactions, this makes accidental leakage especially plausible because exports may aggregate sensitive insights from many sessions.

Ssd 3

Medium

Confidence: 90%
Finding: This skill is explicitly designed for continuous learning and persistently records session attributes such as identifiers, duration, interactions, and success patterns. In an agent context, session objects often contain user-derived or operationally sensitive metadata, so saving these fields in plaintext creates privacy and data exposure risk.

VirusTotal

66/66 vendors flagged this skill as clean.
