Self Improving Agent

Security checks across malware telemetry and agentic risk

Overview

This skill is not malicious, but it asks agents to create persistent memory, run optional always-on hooks, and share or promote broad conversation-derived context without enough safeguards.

Install only if you want an agent memory workflow. Before enabling it, decide where `.learnings/` should live, avoid global always-on hooks unless you really need them, review entries before promotion, and redact secrets, tokens, customer data, personal information, raw transcripts, and sensitive command output.
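The redaction step recommended above could be sketched as follows. This is a minimal illustration, not part of the skill: the function `redact` and the patterns in `REDACTION_PATTERNS` are hypothetical, and a real deployment would need patterns matched to the secret formats actually in use. Pattern matching of this kind does not replace a manual review before promotion.

```python
import re

# Hypothetical patterns for illustration only; extend them for the
# credential and personal-data formats that occur in your environment.
REDACTION_PATTERNS = [
    # key=value or key: value pairs whose key name suggests a credential
    (
        re.compile(
            r"(?i)([\w-]*(?:api[_-]?key|token|secret|password)[\w-]*)(\s*[:=]\s*)\S+"
        ),
        r"\1\2[REDACTED]",
    ),
    # email addresses, as a simple stand-in for personal information
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"), "[REDACTED_EMAIL]"),
]


def redact(text: str) -> str:
    """Apply every redaction pattern to a learning entry before it is stored."""
    for pattern, replacement in REDACTION_PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

For example, `redact("API_KEY=abc123 failed")` returns `API_KEY=[REDACTED] failed`.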

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Rogue Agent: Self-Modification, Session Persistence
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (16)

Tp4

High

Category: MCP Tool Poisoning
Confidence: 91%
Finding: The skill’s declared purpose is limited to logging learnings, but the document also instructs agents to install hooks, write to multiple persistent context files, use inter-session tooling, and extract entirely new skills. That scope expansion matters because users or operators may grant it more trust than warranted, leading to unexpected persistence, broader prompt influence, and filesystem modification beyond simple note-taking.

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: Creating new reusable skills from logged learnings goes beyond passive self-improvement and becomes capability amplification. If the source learning contains flawed, sensitive, or adversarially influenced content, the extraction flow can persist and operationalize it as a reusable instruction set, increasing blast radius.

Context-Inappropriate Capability

Medium

Confidence: 84%
Finding: Documenting inter-session messaging, transcript access, and sub-agent spawning under a self-improvement skill unnecessarily broadens authority and data reach. In context, these features can spread conversation-derived information across sessions or delegate tasks without clear necessity for simple learning capture.

Intent-Code Divergence

Medium

Confidence: 98%
Finding: The document's security section inaccurately assures readers that the scripts 'only output text' and 'don't modify files or run commands,' while the setup configuration explicitly invokes shell scripts via hook commands. This kind of misleading safety claim can cause users to underestimate the risk of installing automatic command execution hooks and makes unsafe deployment more likely.

Missing User Warnings

Medium

Confidence: 93%
Finding: The skill instructs agents to create and update local files as part of normal operation without emphasizing consent, scope limits, or potential impact on project/workspace data. In agent environments, silent writes to durable files can alter repositories, leak context into tracked files, or create persistence the user did not intend.
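One way to add the consent and scope limits this finding asks for is sketched below. It is an assumption-laden illustration, not the skill's own code: `resolve_learning_path`, `append_learning`, and the `user_confirmed` flag are hypothetical names, and the sketch assumes learnings live under a `.learnings/` directory inside a workspace the caller chooses.

```python
from pathlib import Path


def resolve_learning_path(workspace: str, relative_name: str) -> Path:
    """Return the target path only if it stays inside <workspace>/.learnings."""
    root = (Path(workspace) / ".learnings").resolve()
    target = (root / relative_name).resolve()
    # Reject names such as "../AGENTS.md" that escape the learnings directory.
    if not target.is_relative_to(root):  # requires Python 3.9+
        raise ValueError(f"refusing to write outside {root}")
    return target


def append_learning(
    workspace: str, relative_name: str, entry: str, user_confirmed: bool
) -> bool:
    """Append an entry only after explicit consent; return True if written."""
    if not user_confirmed:
        return False  # nothing is created or modified without consent
    target = resolve_learning_path(workspace, relative_name)
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("a", encoding="utf-8") as handle:
        handle.write(entry.rstrip("\n") + "\n")
    return True
```

With this shape, a call such as `append_learning(workspace, "FEATURE_REQUESTS.md", entry, user_confirmed=False)` leaves the workspace untouched, and a path that tries to leave `.learnings/` raises `ValueError` instead of writing.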

Vague Triggers

Medium

Confidence: 87%
Finding: Using an empty matcher causes the hook to trigger on every prompt, which broadens the activation surface and ensures unscoped automatic execution throughout the session. In this skill's context, that increases exposure to unintended data capture, prompt-context injection, and repeated execution of local scripts without task-specific need.

Vague Triggers

Medium

Confidence: 91%
Finding: The user-level configuration combines global installation with an empty matcher, causing the script to run across all sessions and projects. This expands persistence and blast radius beyond a single repository, making accidental leakage, unwanted prompt modification, or abuse of a compromised local script more impactful.

Vague Triggers

Medium

Confidence: 86%
Finding: Although labeled 'minimal,' this example still uses an empty matcher, so it remains always-on for every prompt. Reducing the number of hooks lowers overhead but does not address the core security concern of indiscriminate command execution on all submissions.

Vague Triggers

Medium

Confidence: 88%
Finding: The Codex example repeats the same broad activation pattern by using an empty matcher, normalizing unbounded hook execution across another agent platform. This increases the chance that users copy insecure defaults into multiple environments, multiplying exposure.
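The empty-matcher pattern described in the Vague Triggers findings can be checked for before a configuration is installed. The sketch below assumes the settings file has the shape quoted in this report (`hooks` keyed by event name, each entry carrying a `matcher` and a list of `hooks` with `"type": "command"`). The `command` key, the treatment of a missing matcher as always-on, and the function name `find_always_on_hooks` are assumptions made for illustration.

```python
import json


def find_always_on_hooks(settings_text: str) -> list[tuple[str, str]]:
    """Return (event, command) pairs whose matcher is empty or missing."""
    settings = json.loads(settings_text)
    findings = []
    for event, entries in settings.get("hooks", {}).items():
        for entry in entries:
            # Assumption: an empty or absent matcher means "run on every event".
            if entry.get("matcher", "") == "":
                for hook in entry.get("hooks", []):
                    if hook.get("type") == "command":
                        findings.append((event, hook.get("command", "")))
    return findings
```

Running it over the contents of a settings file before enabling the hooks gives a list of commands that would execute on every prompt; an empty list means no always-on command hook was found under these assumptions.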

Missing User Warnings

Medium

Confidence: 90%
Finding: The document explicitly instructs promoting learnings from `.learnings/` into persistent workspace files such as `SOUL.md`, `TOOLS.md`, and `AGENTS.md` without any guidance on sanitizing secrets, minimizing retention, or avoiding cross-session/private data. Because these files are injected into future sessions, sensitive data, user corrections, API errors, or internal context can become durable prompt context and spread beyond the original scope.

Ssd 3

Medium

Confidence: 94%
Finding: The skill encourages using transcript history and sending learnings across sessions without any minimization or sensitivity screening. That can propagate secrets, personal data, internal discussions, or other sensitive context into additional locations and sessions, making accidental disclosure and retention more likely.

Ssd 3

Medium

Confidence: 96%
Finding: The logging format explicitly asks for full context, inputs, parameters, environment details, and user context in durable markdown files. Those fields are highly likely to capture secrets, tokens, proprietary prompts, customer data, or other sensitive operational details that then persist on disk.

Ssd 3

Medium

Confidence: 90%
Finding: The guidance to 'promote aggressively' into persistent context files increases the chance that conversation-derived data or unsafe instructions get copied into long-lived agent memory files. Once promoted, the content can affect future sessions broadly and be much harder to notice or remove.

Session Persistence

Medium

Category: Rogue Agent
Content: └── FEATURE_REQUESTS.md ``` ### Create Learning Files ```bash mkdir -p ~/.openclaw/workspace/.learnings
Confidence: 82%
Finding: Create Learning Files ```bash mkdir -p ~/.openclaw

Session Persistence

Medium

Category: Rogue Agent
Content: ### Option 1: Project-Level Configuration Create `.claude/settings.json` in your project root: ```json {
Confidence: 80%
Finding: Create `.claude/settings.json` in your project root: ```json { "hooks": { "UserPromptSubmit": [ { "matcher": "", "hooks": [ { "type": "command",

Session Persistence

Medium

Category: Rogue Agent
Content: openclaw hooks enable self-improvement ``` ### 3. Create Learning Files Create the `.learnings/` directory in your workspace:
Confidence: 86%
Finding: Create Learning Files Create the `.learnings/` directory in your workspace: ```bash mkdir -p ~/.openclaw/workspace/.learnings ``` Or in the skill directory: ```bash mkdir -p ~/.openclaw

VirusTotal

66/66 vendors flagged this skill as clean.
