Self Improving Agent

Security checks across malware telemetry and agentic risk

Overview

This skill is not overtly malicious, but it creates persistent agent memory and broad automatic reminders that can affect future sessions, so users should review and scope it carefully before installing.

Install only if you want an agent to keep persistent local learning notes and potentially update future agent guidance. Prefer project-level setup over user-level/global hooks, avoid empty matchers where possible, review every `.learnings/` entry before promotion, and never store secrets, personal data, raw transcripts, or full command output.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Rogue Agent: Self-Modification, Session Persistence
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (9)

Intent-Code Divergence

Medium
Confidence
96% confidence
Finding
The document says the hook scripts 'only output text' and 'don't modify files or run commands,' but the setup explicitly configures those scripts to be executed as shell commands by the hook system. That misleading assurance can cause users to underestimate the risk and install auto-executed scripts that run with the same permissions as the agent, increasing the chance of unsafe deployment or review bypass.

Vague Triggers

Medium
Confidence
88% confidence
Finding
The auto-detection rules trigger on very common conversational phrases like corrections and feature requests, which can cause routine user text to be persisted as learnings without meaningful consent or filtering. In practice this increases the chance of storing sensitive context, polluting memory, and recording incorrect long-term guidance that later affects agent behavior.

Vague Triggers

Medium
Confidence
93% confidence
Finding
The hook configuration uses an empty matcher, so reminder or error-detection scripts run for every prompt and broadly for tool use. This creates an unnecessarily wide activation surface where untrusted prompts and outputs can influence logging or reminder behavior, increasing privacy risk and the chance of persistent prompt contamination.
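One way to review a hook configuration for this pattern before installing is to list every entry whose matcher is empty. The following is a minimal sketch, assuming the settings file follows the `hooks` / event name / `matcher` layout quoted in the Session Persistence excerpt of this report. The `find_empty_matchers` helper, the `PostToolUse` event, the `Bash` matcher value, and both script paths are hypothetical and used only for illustration.

```python
import json


def find_empty_matchers(settings: dict) -> list[tuple[str, int]]:
    """Return (event_name, entry_index) for each hook entry with an empty or missing matcher."""
    flagged = []
    for event, entries in settings.get("hooks", {}).items():
        for index, entry in enumerate(entries):
            # Treat a missing, null, or whitespace-only matcher as "matches everything".
            matcher = entry.get("matcher") or ""
            if not str(matcher).strip():
                flagged.append((event, index))
    return flagged


# Example input shaped like the quoted excerpt; the command paths are hypothetical.
example = json.loads("""
{
  "hooks": {
    "UserPromptSubmit": [
      {"matcher": "", "hooks": [{"type": "command", "command": "./hypothetical-reminder.sh"}]}
    ],
    "PostToolUse": [
      {"matcher": "Bash", "hooks": [{"type": "command", "command": "./hypothetical-check.sh"}]}
    ]
  }
}
""")

for event, index in find_empty_matchers(example):
    print(f"{event}[{index}]: empty matcher, review before installing")
```

A check like this only reports unscoped entries. Whether a narrower matcher is available for a given event depends on the hook system in use, so flagged entries still need a manual decision.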

Vague Triggers

Medium
Confidence
91% confidence
Finding
An empty matcher on UserPromptSubmit causes the hook to run for every prompt, which creates broad, persistent interception of session activity without meaningful scoping. In a self-improvement skill that logs lessons and reviews context, this expands collection surfaces and can unintentionally process sensitive prompts, credentials, or proprietary content.

Vague Triggers

Medium
Confidence
89% confidence
Finding
The user-level configuration recommends global activation, which broadens the hook from a project-specific tool into an always-on mechanism across sessions. That makes accidental invocation and cross-project data exposure more likely, especially for a skill centered on remembering past mistakes and lessons.

Vague Triggers

Medium
Confidence
90% confidence
Finding
The Codex example also uses an empty matcher, so the hook can trigger on every prompt with unspecified scope. Because the skill is designed for ongoing retention and pre-task review, broad automatic triggering increases the chance of unnecessary data capture and unexpected behavior across routine interactions.

Vague Triggers

Medium
Confidence
84% confidence
Finding
The trigger list is broad enough that ordinary errors, vague 'knowledge gaps,' or routine tool failures could cause the skill to activate and persist information without clear user intent. In a system that writes to workspace memory and may promote learnings into injected prompt files, overbroad activation increases the chance of storing sensitive, irrelevant, or attacker-influenced content.

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The guide encourages copying 'learnings' into persistent workspace files and long-term memory locations but does not warn against storing secrets, personal data, raw transcripts, or environment-specific sensitive details. Because these files are later injected as context, any sensitive content recorded there may be repeatedly exposed to future sessions and other agents, compounding the blast radius of a single mistake.
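A lightweight pre-promotion check can catch some of the content described above before a learning entry is copied into a persistent file. The sketch below is illustrative only: the `flag_sensitive_lines` helper and its three patterns are assumptions made for this example, they are not part of the skill, and they will miss many kinds of sensitive data. It complements a manual review of each entry and does not replace it.

```python
import re

# Illustrative patterns only; they are deliberately simple and far from exhaustive.
SUSPICIOUS_PATTERNS = {
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "credential assignment": re.compile(
        r"\b(api[_-]?key|secret|token|password)\b\s*[:=]\s*\S+", re.IGNORECASE
    ),
    "email address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


def flag_sensitive_lines(text: str) -> list[tuple[int, str]]:
    """Return (line_number, label) for each line that matches a suspicious pattern."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, label))
    return findings


# Hypothetical learning entry used to exercise the check.
entry = """\
Learning: retry the upload when the API returns 503.
Context: API_KEY=abc123 was set in the shell.
Contact: alice@example.com reported the failure.
"""

for line_number, label in flag_sensitive_lines(entry):
    print(f"line {line_number}: possible {label}, do not promote as written")
```

Flagged lines should be rewritten or dropped before promotion. An entry with no flags can still contain proprietary or personal details that simple patterns do not detect.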

Session Persistence

Medium
Category
Rogue Agent
Content
### Option 1: Project-Level Configuration

Create `.claude/settings.json` in your project root:

```json
{
```
Confidence
82% confidence
Finding
Create `.claude/settings.json` in your project root:

```json
{ "hooks": { "UserPromptSubmit": [ { "matcher": "", "hooks": [ { "type": "command",
```

VirusTotal

VirusTotal findings are pending for this skill version.
