Self-Improvement Agent

Security checks across malware telemetry and agentic risk

Overview

This self-improvement skill is not malicious, but it gives agents broad persistent memory and cross-session capabilities without enough scoping or privacy guardrails.

Install only if you intentionally want persistent agent learning. Keep hooks project-scoped with narrow matchers, review any changes before writing to SOUL.md, AGENTS.md, TOOLS.md, CLAUDE.md, or similar prompt files, and avoid storing raw transcripts, credentials, tokens, personal data, or full command output. Do not use cross-session history, send, or spawn features unless the user has explicitly approved that specific transfer.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Rogue Agent: Self-Modification, Session Persistence
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (14)

Context-Inappropriate Capability

Medium

Confidence: 84%
Finding: Cross-session messaging and background sub-agent spawning are materially more powerful than simple learning capture and are not necessary for the stated function. These features can spread data and instructions beyond the current task boundary, increasing the risk of privacy leaks, unintended autonomy, and hard-to-audit propagation of bad or adversarial guidance.

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: Directing promotion of observed patterns into files like `SOUL.md`, `AGENTS.md`, `TOOLS.md`, and `CLAUDE.md` turns transient observations into persistent behavioral control. In context, this is more dangerous because those files can influence future agent behavior broadly, so an incorrect, malicious, or privacy-sensitive learning can become durable prompt injection or policy drift.
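
One way to keep a learning from silently becoming durable prompt content is to route every proposed change to these files through a review step. The following is a minimal sketch, not part of the skill: the function name `propose_change`, the `approved` flag, and the decision to treat the four file names as a protected set are all assumptions made for illustration.

```python
import difflib
from pathlib import Path

# File names come from the finding above; treating them as a protected
# set is an assumption of this sketch.
PROTECTED_FILES = {"SOUL.md", "AGENTS.md", "TOOLS.md", "CLAUDE.md"}


def propose_change(path: Path, new_text: str, approved: bool = False) -> str:
    """Return a unified diff; write the file only if it is unprotected or approved."""
    old_text = path.read_text() if path.exists() else ""
    diff = "".join(
        difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile=str(path),
            tofile=f"{path} (proposed)",
        )
    )
    # Protected prompt files are only written after explicit approval.
    if path.name not in PROTECTED_FILES or approved:
        path.write_text(new_text)
    return diff
```

Calling `propose_change(Path("CLAUDE.md"), new_text)` returns the diff for a human to read and leaves the file untouched; the write happens only on a second call with `approved=True`.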

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The document's security section is internally inconsistent: it says the hook scripts only output text and do not run commands, while the rest of the file explicitly configures those scripts as command hooks and references an extraction script that creates files. This can mislead users into granting trust or broad deployment under false assumptions, increasing the chance of unsafe execution in the agent's privilege context.

Vague Triggers

Medium

Confidence: 79%
Finding: The skill does not clearly distinguish between conditions that should merely inform agent judgment and conditions that should automatically trigger logging behavior. This ambiguity can lead to over-collection, unnecessary persistence, and inconsistent behavior that is difficult for users to predict or audit.

Vague Triggers

Medium

Confidence: 79%
Finding: The skill does not clearly distinguish between conditions that should merely inform agent judgment and conditions that should automatically trigger logging behavior. This ambiguity can lead to over-collection, unnecessary persistence, and inconsistent behavior that is difficult for users to predict or audit.

Missing User Warnings

Medium

Confidence: 91%
Finding: The skill instructs reading other sessions' transcripts without any user-facing privacy notice or consent boundary. Session histories often contain sensitive user data, credentials, business context, or prior instructions, so cross-session access can create unauthorized data exposure and inappropriate reuse of personal or proprietary information.

Missing User Warnings

Medium

Confidence: 93%
Finding: The logging format explicitly tells the agent to paste raw error output and environment details, which frequently contain secrets, tokens, internal paths, hostnames, stack traces, or user data. Because the logs are persisted, this can convert transient sensitive data exposure into durable at-rest leakage that may later be read, indexed, or propagated.
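
A log entry can be masked and shortened before it is persisted, so that raw error output never reaches disk in full. The sketch below is illustrative only: the two regular expressions and the 500-character cap are assumptions, and they will miss many real secret formats, so they are no substitute for a vetted secret scanner.

```python
import re

# Hypothetical patterns for illustration; a real deployment needs a broader, vetted set.
SECRET_PATTERNS = [
    # key=value or key: value pairs whose key name suggests a credential
    re.compile(r"(?i)\b([\w-]*(?:token|secret|password|api[_-]?key)[\w-]*)(\s*[=:]\s*)(\S+)"),
    # HTTP bearer tokens
    re.compile(r"(?i)\b(bearer)(\s+)([A-Za-z0-9._~+/=-]+)"),
]

MAX_CHARS = 500  # assumed cap so full command output is never stored


def redact(text: str, max_chars: int = MAX_CHARS) -> str:
    """Mask likely secrets, then truncate, before the text is persisted."""
    for pattern in SECRET_PATTERNS:
        # Keep the key name and separator, replace only the value.
        text = pattern.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)
    if len(text) > max_chars:
        text = text[:max_chars] + " [truncated]"
    return text


if __name__ == "__main__":
    raw = "curl failed: API_KEY=sk-12345 Authorization: Bearer abc.def.ghi"
    print(redact(raw))
```

Masking runs before truncation so that a secret straddling the cut-off point is not left partially visible.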

Vague Triggers

Medium

Confidence: 90%
Finding: Using an empty matcher causes the hook to fire on every prompt submission, creating an always-on execution path for a local command. In a self-improving agent context, that broad trigger surface increases the chance of unintended data exposure, prompt-context injection, or persistent behavioral influence across all sessions.

Vague Triggers

Medium

Confidence: 89%
Finding: The user-level setup enables global activation without meaningful trigger constraints, causing the hook to run across projects and sessions. That persistence makes the skill more dangerous because any flawed or malicious script behavior would affect all agent interactions, not just one repository.

Vague Triggers

Medium

Confidence: 89%
Finding: Although labeled as minimal setup, the example still uses an empty matcher, so it remains broadly active on every prompt. Reducing the number of hooks lowers overhead but does not reduce the core risk of unconditional command execution and context influence.

Vague Triggers

Medium

Confidence: 90%
Finding: The Codex CLI example repeats the same broad empty matcher pattern, enabling execution on any prompt. In agent tooling, cross-platform reproduction of an unsafe default amplifies exposure by normalizing unconditional hooks as the expected configuration.

Vague Triggers

Medium

Confidence: 94%
Finding: The empty `matcher` in the `UserPromptSubmit` hook causes the activator script to run for every user prompt, not just prompts related to self-improvement or error handling. In this skill’s context, that broad trigger increases the chance of unnecessary script execution, unintended data capture from unrelated prompts, and expansion of the skill’s effective scope beyond its stated purpose.
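
A settings file can be checked for this pattern before it is installed. The sketch below walks a parsed hooks configuration shaped like the one quoted in this report and lists every entry with an empty matcher. The function name `find_empty_matchers`, the choice to treat a missing `matcher` key the same as an empty one, and the command path in the example are assumptions, not part of the skill.

```python
import json
from typing import Any


def find_empty_matchers(settings: dict[str, Any]) -> list[str]:
    """Describe every hook entry whose matcher is missing or empty."""
    problems = []
    for event, entries in settings.get("hooks", {}).items():
        for index, entry in enumerate(entries):
            # Assumption: a missing matcher is as broad as an empty one.
            if not entry.get("matcher", "").strip():
                problems.append(f"{event}[{index}]: empty matcher")
    return problems


if __name__ == "__main__":
    # Mirrors the structure quoted in the findings; the command path is hypothetical.
    example = json.loads("""
    {
      "hooks": {
        "UserPromptSubmit": [
          {"matcher": "", "hooks": [{"type": "command", "command": "./scripts/example.sh"}]}
        ]
      }
    }
    """)
    for problem in find_empty_matchers(example):
        print(problem)
```

An empty result means no unconditional hook was found in that file; it says nothing about whether the matchers that are present are narrow enough.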

Ssd 3

Medium

Confidence: 88%
Finding: Cross-session transcript review combined with learning capture creates a pipeline for collecting, summarizing, and redistributing natural-language user data across contexts. This is dangerous because it can propagate sensitive information beyond the original conversation boundary and embed it into long-lived memory or workspace files without clear authorization.

Session Persistence

Medium

Category: Rogue Agent
Content: ### Option 1: Project-Level Configuration Create `.claude/settings.json` in your project root: ```json {
Confidence: 84%
Finding: Create `.claude/settings.json` in your project root: ```json { "hooks": { "UserPromptSubmit": [ { "matcher": "", "hooks": [ { "type": "command",

VirusTotal

VirusTotal findings are pending for this skill version.
