self_improving_agent

Security checks across malware telemetry and agentic risk

Overview

The skill is not malicious, but it creates broad persistent agent memory and optional always-on hooks without enough privacy and scoping guardrails.

Install only if you intentionally want persistent self-improvement notes and agent reminders. Before enabling hooks, prefer project-level setup over global user-level hooks, narrow matchers where possible, review the scripts, and avoid storing secrets, tokens, customer data, raw transcripts, private paths, or sensitive command output in learnings or promoted instruction files.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Rogue Agent: Self-Modification, Session Persistence
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (12)

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The document's security section materially understates risk by claiming the scripts only output text and do not run commands, while the same guide configures them as command hooks and separately instructs users to execute another shell script directly. This contradiction can mislead users into granting trust to automatically executed local scripts without understanding that shell commands are being invoked on hook events.

Missing User Warnings

Medium

Confidence: 91%
Finding: The skill explicitly encourages reading other sessions' transcripts and sending learnings across sessions without any privacy boundary, consent check, or minimization rule. In a multi-session agent environment, that can expose unrelated user data, confidential prompts, or sensitive operational context across otherwise separate conversations.

Vague Triggers

Medium

Confidence: 79%
Finding: The template instructs authors to write descriptions with trigger conditions, but it does not enforce concrete, bounded activation criteria. In an agent-skill ecosystem, vague triggers can cause a skill to activate in unintended contexts, which increases the chance of unsafe actions or overbroad guidance being applied without sufficient user intent.

Vague Triggers

Low

Confidence: 84%
Finding: The scripts-oriented template encourages inclusion of executable helpers but does not require warnings, trust boundaries, or explicit conditions for when scripts may be run. In practice, this can normalize skills that expose automation without documenting risks, preconditions, side effects, or safe verification steps, making misuse more likely.

Missing User Warnings

Medium

Confidence: 94%
Finding: The setup instructions tell users to register shell scripts as automatic hooks on prompt submission and post-tool events, but they do not prominently warn that this causes command execution every time those events fire. In a self-improvement skill, this is more sensitive because it creates persistent, automatic execution tied to routine agent activity, increasing the chance users enable it without reviewing the scripts' trust boundary.

Missing User Warnings

Medium

Confidence: 95%
Finding: The guide recommends user-level activation in ~/.claude/settings.json without a nearby warning that this enables the hook across all sessions and projects. That broad persistence meaningfully increases blast radius: any future session can trigger the script automatically, making accidental trust of a local script much more dangerous.
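
The hook-scoping concern can be checked before enabling anything. The following is a minimal Python sketch, not the skill's own tooling: it lists every command hook in a parsed settings file and flags matchers that are empty or "*". The settings layout (event name mapping to groups with a "matcher" and a list of "hooks"), the event names, and the script path ./scripts/example-hook.sh are assumptions made for illustration and should be compared against the actual settings file.

```python
import json

# Hypothetical settings content. The layout, the event names, and the
# script path are assumptions for illustration, not taken from the skill.
SAMPLE_SETTINGS = """
{
  "hooks": {
    "UserPromptSubmit": [
      {"matcher": "",
       "hooks": [{"type": "command", "command": "./scripts/example-hook.sh"}]}
    ],
    "PostToolUse": [
      {"matcher": "Bash",
       "hooks": [{"type": "command", "command": "./scripts/example-hook.sh"}]}
    ]
  }
}
"""


def list_command_hooks(settings):
    """Return one record per command hook found in a parsed settings dict."""
    records = []
    for event, groups in settings.get("hooks", {}).items():
        for group in groups:
            matcher = group.get("matcher", "")
            for hook in group.get("hooks", []):
                if hook.get("type") != "command":
                    continue
                records.append({
                    "event": event,
                    "matcher": matcher,
                    "command": hook.get("command", ""),
                    # An empty or "*" matcher is treated here as "fires on everything".
                    "broad": matcher in ("", "*"),
                })
    return records


if __name__ == "__main__":
    for record in list_command_hooks(json.loads(SAMPLE_SETTINGS)):
        flag = "BROAD" if record["broad"] else "scoped"
        print(f'{record["event"]:18} {flag:6} {record["command"]}')
```

Running the same function over both the project-level and the user-level settings file shows which commands would execute automatically, and in which scope, before any hook is trusted.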

Ssd 3

Medium

Confidence: 95%
Finding: The skill normalizes persistent retention of learnings, corrections, and context into markdown and memory/prompt files, but it provides no data-classification or redaction safeguards. This creates a durable leakage path where sensitive user content can be stored, resurfaced later, or propagated into agent instruction files.

Ssd 3

Medium

Confidence: 96%
Finding: Directing the agent to record user corrections and feature requests into persistent files can capture proprietary requirements, personal data, or security-sensitive details exactly as provided. Because the logging is framed as routine and immediate, the risk of over-collection and later disclosure is materially increased.

Ssd 3

Medium

Confidence: 94%
Finding: Cross-session transcript reading and message passing materially increase the chance of context mixing between users, tasks, or trust domains. Without clear authorization and sanitization requirements, the feature can disclose prior conversation contents to sessions that should not receive them.

Ssd 3

Medium

Confidence: 97%
Finding: The prescribed logging templates ask for full context, command inputs, environment details, error text, and user context—exactly the kinds of fields that commonly contain secrets, credentials, tokens, file paths, customer data, or internal system details. Persisting this material in markdown files creates a straightforward data-retention and accidental-disclosure risk.
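
One possible mitigation is to pass each log field through a redaction step before it is written to a learnings file. The sketch below is illustrative only: the skill ships no such safeguard, the function name redact and the pattern list are hypothetical, and the three patterns cover only a few obvious cases (credential-like key/value pairs, bearer tokens, and home-directory paths).

```python
import re

# Illustrative patterns only (an assumption, not part of the skill).
# A real deployment would need a reviewed and much broader list.
REDACTION_PATTERNS = [
    # key=value or key: value pairs whose key name suggests a credential
    (re.compile(r"(?i)\b(api[_-]?key|token|secret|password)\b(\s*[=:]\s*)\S+"),
     r"\1\2[REDACTED]"),
    # Bearer tokens as they appear in HTTP-style headers
    (re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/-]+=*"),
     "Bearer [REDACTED]"),
    # Home-directory paths that reveal a local user name
    (re.compile(r"/(home|Users)/[^/\s]+"),
     r"/\1/[USER]"),
]


def redact(text):
    """Apply every redaction pattern in order and return the scrubbed text."""
    for pattern, replacement in REDACTION_PATTERNS:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(redact("API_KEY=abc123 failed in /Users/alice/project"))
    print(redact("Authorization: Bearer abc.def-123"))
```

Pattern-based redaction misses secrets that do not follow a recognizable shape, so it complements rather than replaces the advice to keep raw command output and transcripts out of persistent files.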

Ssd 3

Medium

Confidence: 93%
Finding: Automatically logging user-provided knowledge the agent lacked creates a standing retention policy for arbitrary user-supplied information. Because there is no sensitivity gate, the agent may preserve confidential or personal facts simply because they were novel, then reuse or promote them later.

Session Persistence

Medium

Category: Rogue Agent
Content: └── FEATURE_REQUESTS.md, followed by the "Create Learning Files" step: mkdir -p ~/.openclaw/workspace/.learnings
Confidence: 88%
Finding: Create Learning Files: mkdir -p ~/.openclaw

VirusTotal

66/66 vendors flagged this skill as clean.
