Self Improving Agent

Security checks across malware telemetry and agentic risk

Overview

This skill is not malware, but it asks agents to persist conversation-derived learnings into instruction files that shape future sessions, and it does so with broad triggers and weak privacy controls.

Install only if you want agents to maintain durable learning notes and potentially update future instruction files. Keep hooks project-scoped, avoid global every-prompt activation, require manual review before writing to AGENTS.md, CLAUDE.md, SOUL.md, TOOLS.md, or Copilot instructions, and redact tokens, credentials, personal data, customer data, proprietary prompts, and raw transcript excerpts before anything is logged or shared.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (12)

Tp4

High

Category: MCP Tool Poisoning
Confidence: 93%
Finding: The skill advertises itself primarily as a logging/improvement aid, but the body also instructs users to install hooks, inject reminders into agent context, scan tool failures, and generate new skill scaffolds. This mismatch is security-relevant because operators may approve a seemingly harmless memory skill without realizing it changes runtime behavior and writes additional files.

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The document states that the hook scripts 'only output text' and 'don't modify files or run commands,' but the same guide configures them as executable command hooks and separately instructs direct execution of another script. This misleading security guidance can cause users to underestimate the execution risk of installed hooks, increasing the chance they enable code with insufficient review.

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The guide expands a 'self-improvement' feature from local error logging into persistent modification of multiple injected prompt files such as AGENTS.md, SOUL.md, and TOOLS.md. Because these files affect future model behavior across sessions, writing to them creates a prompt-persistence channel that can silently alter agent policy beyond the original scope of the skill.
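
One way to close this prompt-persistence channel is to hold writes to instruction files until a person has approved them. The following is a minimal sketch under stated assumptions, not part of the skill: `PROTECTED_NAMES`, `requires_review`, and `guarded_write` are hypothetical names, and the `approved` flag stands in for whatever review step an operator actually uses.

```python
from pathlib import PurePath

# Instruction files named in this report; extend the set as needed.
PROTECTED_NAMES = {"AGENTS.md", "CLAUDE.md", "SOUL.md", "TOOLS.md"}


def requires_review(path: str) -> bool:
    """Return True if the target is an instruction file that shapes future sessions."""
    return PurePath(path).name in PROTECTED_NAMES


def guarded_write(path: str, text: str, approved: bool = False) -> bool:
    """Append text to path only if it is unprotected or explicitly approved.

    Returns True when the write happened, False when it was held for review.
    """
    if requires_review(path) and not approved:
        return False
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(text)
    return True
```

With this shape, ordinary learning notes are written immediately, while anything aimed at an injected prompt file is held back until the caller passes `approved=True`.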

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The documented ability to inspect other sessions and send them messages is broader than necessary for a self-improvement workflow and introduces unnecessary authority over cross-session state. If abused or triggered by poisoned inputs, the skill could read unrelated context or propagate manipulative content into other sessions, expanding blast radius and weakening isolation.

Missing User Warnings

Medium

Confidence: 96%
Finding: The logging instructions tell the agent to record detailed error output, inputs, parameters, and environment context, but provide no redaction rules for credentials, tokens, personal data, or proprietary content. Error logs and command inputs frequently contain secrets, so this creates a realistic secret-retention and later disclosure risk.
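
A redaction pass before anything is logged could look roughly like the sketch below. It is illustrative only: the patterns, the `redact` function, and the placeholder strings are assumptions rather than anything the skill ships, and a real deployment would need a broader, tested pattern set (for example, quoted JSON keys are not covered here).

```python
import re

# HTTP bearer credentials, e.g. "Bearer abc.def". Applied first.
_BEARER = re.compile(r"(?i)\b(bearer)(\s+)([A-Za-z0-9._~+/=-]+)")
# key=value or key: value pairs whose key name suggests a secret.
_SECRET_PAIR = re.compile(
    r"(?i)([\w-]*(?:api[_-]?key|token|secret|password)[\w-]*)(\s*[=:]\s*)(\S+)"
)
# Simple email matcher for personal data.
_EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact(text: str) -> str:
    """Replace likely secrets and email addresses with placeholders."""
    text = _BEARER.sub(r"\1\2[REDACTED]", text)
    text = _SECRET_PAIR.sub(r"\1\2[REDACTED]", text)
    return _EMAIL.sub("[REDACTED_EMAIL]", text)


print(redact("DB_PASSWORD: hunter2 while connecting"))
```

Pattern matching of this kind only reduces exposure. It does not replace manual review before a log entry is shared or committed.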

Vague Triggers

Medium

Confidence: 88%
Finding: Using an empty matcher causes the hook to fire on every prompt, which is an overly broad trigger scope for an automatically executed command. In a self-improvement skill, this broad activation increases exposure to unintended data capture, prompt-context injection, performance overhead, and repeated execution of local scripts without clear necessity.
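
An operator could audit a hook configuration for this pattern before enabling it. The sketch below assumes a settings layout in which each event maps to a list of entries with a `matcher` string and a nested `hooks` list of `{"type": "command", ...}` objects. That layout, the event name, and the script path in the example are assumptions for illustration and should be checked against the actual tool's documentation.

```python
import json


def find_broad_hooks(config: dict) -> list[tuple[str, str]]:
    """Return (event, command) pairs for command hooks whose matcher is empty or missing."""
    flagged = []
    for event, entries in config.get("hooks", {}).items():
        for entry in entries:
            if (entry.get("matcher") or "").strip():
                continue  # a non-empty matcher narrows the trigger
            for hook in entry.get("hooks", []):
                if hook.get("type") == "command":
                    flagged.append((event, hook.get("command", "")))
    return flagged


# Hypothetical configuration with an empty matcher on a command hook.
example = json.loads("""
{"hooks": {"UserPromptSubmit": [
  {"matcher": "", "hooks": [{"type": "command", "command": "./scripts/reminder.sh"}]}
]}}
""")
print(find_broad_hooks(example))
```

Any pair this function returns is a command that would run on every prompt for that event, which is the condition this finding describes.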

Vague Triggers

Medium

Confidence: 90%
Finding: The user-level configuration installs the hook globally, causing automatic execution across all repositories and sessions without project-specific scoping. That broadens the blast radius if the script is modified, replaced, or behaves unexpectedly, and it may expose unrelated work contexts to the hook output mechanism.

Vague Triggers

Medium

Confidence: 87%
Finding: The Codex example also uses an empty matcher, so the command hook runs for all prompts with insufficiently defined activation conditions. In agent tooling, automatic command execution should be constrained as tightly as possible because every prompt becomes a trigger surface.

Missing User Warnings

Medium

Confidence: 95%
Finding: The document normalizes reading transcripts from other sessions without any privacy notice, consent requirement, or boundary explanation. That can expose sensitive user data, credentials, or proprietary context from unrelated conversations and makes accidental privacy violations more likely.

Ssd 3

Medium

Confidence: 90%
Finding: The skill encourages persistent logging of user corrections, requests, failures, and other conversation-derived details into markdown files. That creates a retention surface for sensitive natural-language content that may later be read by other agents, committed to repositories, or exposed through tooling.

Ssd 3

High

Confidence: 97%
Finding: The inter-session features explicitly encourage listing sessions, reading transcript history, and forwarding learnings between sessions. This materially increases the chance that sensitive information from one task or user context is propagated into another session or agent without proper need-to-know controls.

Ssd 3

High

Confidence: 98%
Finding: The prescribed log schemas ask for full context, actual error messages, input parameters, environment details, related files, and implementation notes. In practice, these fields are highly likely to capture credentials, internal paths, customer data, tokens, or proprietary prompts, turning the learning files into a sensitive data sink.
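
One mitigation for an over-broad schema is to keep only an allowlisted, length-capped subset of fields and drop everything else before an entry is written. The field names below (`summary`, `category`, `suggested_fix`, and the dropped `environment` and `input_parameters`) are hypothetical stand-ins, not the skill's actual schema. This is a sketch of the data-minimization idea rather than a definitive implementation.

```python
# Hypothetical allowlist: short, low-risk fields only.
ALLOWED_FIELDS = ("summary", "category", "suggested_fix")


def minimize_entry(entry: dict, max_len: int = 200) -> dict:
    """Keep only allowlisted string fields, truncated to max_len characters."""
    kept = {}
    for field in ALLOWED_FIELDS:
        value = entry.get(field)
        if isinstance(value, str) and value:
            kept[field] = value[:max_len]
    return kept


raw = {
    "summary": "build failed on missing dependency",
    "environment": "TOKEN=abc123 HOME=/home/user",  # dropped: not allowlisted
    "input_parameters": "--password hunter2",       # dropped: not allowlisted
}
print(minimize_entry(raw))
```

Dropping free-form fields such as environment details and raw inputs removes the places where credentials and internal paths most often end up, at the cost of less detailed learning notes.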

VirusTotal

VirusTotal findings are pending for this skill version.
