Self-Improving Agent (With Self-Reflection)

Security checks across malware telemetry and agentic risk

Overview

This skill is not malicious, but it creates persistent agent memory and steering that can affect future sessions more broadly than its controls clearly cover.

Install only if you want a persistent local memory system that can influence future agent behavior. Prefer Passive or Strict mode, review any AGENTS.md/SOUL.md/HEARTBEAT.md edits before applying them, inspect ~/self-improving/ regularly, and avoid storing secrets, health/financial data, private third-party details, or sensitive work context.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (11)

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The setup directs the agent to modify broad steering files like AGENTS.md, SOUL.md, and optionally HEARTBEAT.md, which extends the skill’s influence beyond local self-reflection storage into persistent global behavior. Even if framed as non-destructive and quality-improving, this creates a persistence mechanism that can reshape future agent actions, routing decisions, and memory handling across unrelated tasks without tight scoping or explicit per-change approval.

Vague Triggers

Medium

Confidence: 88%
Finding: The 'When to Use' conditions are broad and subjective ('significant work', 'could be better', 'knowledge should compound over time'), which can cause the skill to activate during ordinary interactions and persist user-related information without clear consent boundaries. In a self-learning skill that writes to disk, ambiguous activation materially increases the chance of over-collection and retention of sensitive preference or correction data.

Vague Triggers

Medium

Confidence: 94%
Finding: The phrase "forget everything" is an unscoped destructive trigger that can be invoked accidentally, quoted in discussion, or embedded in unrelated content, causing unintended data deletion. In a self-improving memory skill, that makes the issue more dangerous because the command directly affects persistent state across sessions.
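The report does not show how the skill matches this phrase, so the following is only a minimal sketch of the risk, assuming a naive substring check. The function names loose_match and strict_match are hypothetical and do not come from the skill.

```python
TRIGGER = "forget everything"


def loose_match(message: str) -> bool:
    """Assumed naive check: fires if the phrase appears anywhere."""
    return TRIGGER in message.lower()


def strict_match(message: str) -> bool:
    """Fires only when the whole message is the phrase, ignoring case,
    surrounding whitespace, and trailing '.' or '!'."""
    return message.strip().rstrip(".!").lower() == TRIGGER


quoted = 'The docs say typing "forget everything" wipes memory. Is that true?'
print(loose_match(quoted))                 # True: a quoted mention would trigger a wipe
print(strict_match(quoted))                # False
print(strict_match("Forget everything."))  # True
```

Whole-message matching reduces accidental hits from quoted or embedded text, though it does not replace an explicit confirmation step.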

Missing User Warnings

Medium

Confidence: 96%
Finding: The wipe procedure performs irreversible deletion immediately after export without requiring the user to acknowledge that the action will permanently remove learned data. This increases the risk of accidental or socially engineered loss of memory, especially in an agent designed to retain and use long-term context.
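One way to picture the missing acknowledgment is a deletion helper that refuses to run unless the user types an exact confirmation sentence. This is a sketch under stated assumptions, not the skill's wipe procedure: wipe_memory and the confirmation wording are hypothetical.

```python
import shutil
from pathlib import Path


def wipe_memory(memory_dir: Path, confirmation: str) -> bool:
    """Delete memory_dir only if the user typed the exact acknowledgment.

    Returns True if the directory was removed, False otherwise.
    The required sentence is a hypothetical choice for illustration.
    """
    required = f"permanently delete {memory_dir.name}"
    if confirmation.strip().lower() != required.lower():
        return False  # no acknowledgment, nothing is touched
    if memory_dir.is_dir():
        shutil.rmtree(memory_dir)  # irreversible from this point
        return True
    return False
```

A caller would export the data first, show the user what is about to be removed, and only then pass their typed response to a function like this.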

Missing User Warnings

Medium

Confidence: 92%
Finding: The skill explicitly describes preserving user preference history and archiving patterns long-term, but it does not warn about data retention, consent, minimization, or privacy boundaries. In a self-improving agent, this can lead to unintended accumulation of personal behavioral data across sessions or projects, increasing privacy, profiling, and cross-context leakage risks if the memory system is implemented broadly.

Missing User Warnings

Low

Confidence: 91%
Finding: The template instructs the agent to create directories and files under the user's home directory on first activation without any explicit user consent, warning, or confirmation step. While the actions are limited and not overtly destructive, silent filesystem modification violates least-astonishment and can create persistence or unwanted state on the host.
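A consent-gated variant of first-run setup could look roughly like the sketch below. The helper ensure_memory_dir and its user_consented flag are hypothetical names used for illustration; the skill's actual template is not reproduced in this report.

```python
from pathlib import Path


def ensure_memory_dir(base: Path, user_consented: bool) -> bool:
    """Create the memory directory only after explicit consent.

    Returns True if the directory exists when the call returns.
    """
    if base.exists():
        return True   # already set up, nothing new is written
    if not user_consented:
        return False  # do not touch the filesystem silently
    base.mkdir(parents=True)
    return True
```

In practice the agent would ask the user once, explain where the files will be stored, and pass the answer in as user_consented.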

Vague Triggers

Medium

Confidence: 91%
Finding: The trigger phrase "What do you know about X?" is broad natural language that can easily occur in normal conversation, making accidental invocation plausible. In a memory-management skill, unintended activation can cause the agent to access and summarize stored memory when the user may have meant a generic question, increasing privacy and scope-confusion risk.

Vague Triggers

Medium

Confidence: 89%
Finding: "Show my memory" is a short, conversational phrase that may be uttered casually and directly exposes persisted data. Because this skill stores historical preferences, patterns, and project information, accidental triggering could surface sensitive or unexpected retained content in the wrong context.

Missing User Warnings

Medium

Confidence: 95%
Finding: Automatically loading memory files at session start means the skill accesses persisted user data without an explicit just-in-time notice or consent step. In a self-improving agent context, that increases the chance of invisible background processing of personal or project-specific data beyond what the user expects for a given session.

Missing User Warnings

High

Confidence: 98%
Finding: The skill persistently records user corrections, contexts, timestamps, and inferred namespaces into files without any user-facing consent, retention notice, or minimization safeguards. This creates a meaningful privacy risk because ordinary conversation can become a long-lived behavioral profile, and the self-improving design makes that collection systematic rather than incidental.

Missing User Warnings

Medium

Confidence: 92%
Finding: Weekly background maintenance moves, archives, and compacts stored data across retention tiers without warning the user about ongoing processing and retention behavior. Even if intended for organization, silent lifecycle management of memory increases privacy risk and can preserve data longer than users realize.

VirusTotal

66/66 vendors flagged this skill as clean.