Security audit

Approved Self Improvement

Security checks across malware telemetry and agentic risk

Overview

The skill is a coherent self-improvement/memory tool, but it asks for broad persistent influence over future agent behavior and optional always-on hooks without enough scoping or data-handling guardrails.

Install only if you want an agent memory system that can persist lessons and influence future sessions. Keep hooks disabled or project-scoped unless you have reviewed the scripts. Do not store secrets or raw private transcripts in .learnings or prompt-loaded files. Require explicit confirmation before promoting learnings, reading other sessions, sending messages to sessions, spawning agents, or creating new skills.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (12)

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: Although the skill repeatedly presents itself as approval-gated, this section allows creation of new skill directories and SKILL.md files via extraction workflows. That broadens the authority from logging/proposal capture into generating executable agent capabilities, which can become a stepping stone for unreviewed persistence or capability expansion if invoked carelessly.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The inclusion of sessions_history, sessions_send, and related cross-session features is not necessary for local learning capture and creates unnecessary access to other sessions' transcripts and communication channels. In a multi-session environment, this can expose sensitive context or allow unintended propagation of data beyond the current task scope.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The document states the hook scripts 'only output text' and 'don't modify files or run commands,' but the configuration explicitly invokes shell scripts via hook commands. This mismatch can cause users to underestimate the execution risk of enabling hooks, especially because shell scripts run with the agent's permissions and can perform arbitrary actions if altered or replaced.
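One way to reduce the risk of a hook script being altered or replaced after review is to pin its digest and re-check it before the hook is enabled. The following is a minimal sketch, not part of the audited skill: the helper names and the file name example-hook.sh are hypothetical, and a temporary file stands in for a real script.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def script_unchanged(path: Path, pinned_digest: str) -> bool:
    """True only if the file exists and still matches the reviewed digest."""
    return path.is_file() and sha256_of(path) == pinned_digest


# Demo with a throwaway file standing in for a reviewed hook script.
# The script text is only written and hashed here, never executed.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "example-hook.sh"  # hypothetical file name
    script.write_text("#!/bin/sh\necho 'reminder text'\n")
    pinned = sha256_of(script)  # digest recorded at review time

    ok_before = script_unchanged(script, pinned)
    # Simulate the script being modified after review.
    script.write_text("#!/bin/sh\necho 'reminder text'\ntouch /tmp/unexpected\n")
    ok_after = script_unchanged(script, pinned)

print(ok_before, ok_after)  # prints "True False"
```

A check like this only detects changes to the file itself; it does not make an unreviewed script safe to run.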

Vague Triggers

Medium

Confidence: 78%
Finding: The trigger phrase "What needs updating?" is broad enough to match ordinary conversation unrelated to skill maintenance. Ambiguous triggers can cause the skill to activate unexpectedly, leading to unintended file inspection, proposal listing, or workflow diversion.

Vague Triggers

Low

Confidence: 72%
Finding: The phrase "Any skill fixes waiting?" is underspecified and may collide with routine status questions. This can produce unintended activation and disclosure of pending proposal metadata when the user meant something broader or different.

Natural-Language Policy Violations

Low

Confidence: 80%
Finding: The instruction to promote behavioral guidance like "Be concise, avoid disclaimers" into SOUL.md allows the skill to shape future agent behavior beyond error logging. Even if well-intentioned, this can alter system behavior or safety posture without a dedicated consent boundary for persona/policy changes.

Vague Triggers

Medium

Confidence: 92%
Finding: An empty matcher causes the hook to fire on every prompt, creating broad automatic execution of a local command. In a self-improvement skill, this increases exposure to prompt-derived sensitive context, raises the chance of accidental data handling, and normalizes always-on script execution across all sessions.
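A reviewer could flag this pattern mechanically before enabling any hooks. The sketch below assumes a settings layout in which each event maps to a list of entries, each with a "matcher" string and a nested "hooks" list of command objects. That layout, the event names, and the script paths are illustrative assumptions, not taken from the audited skill.

```python
import json


def find_unscoped_hooks(settings: dict) -> list[tuple[str, str]]:
    """Return (event, command) pairs for command hooks whose matcher is empty or missing."""
    findings = []
    for event, entries in settings.get("hooks", {}).items():
        for entry in entries:
            if (entry.get("matcher") or "").strip():
                continue  # a non-empty matcher narrows when the hook fires
            for hook in entry.get("hooks", []):
                if hook.get("type") == "command":
                    findings.append((event, hook.get("command", "")))
    return findings


# Hypothetical configuration: one always-on hook and one scoped hook.
example = json.loads("""
{
  "hooks": {
    "UserPromptSubmit": [
      {"matcher": "", "hooks": [{"type": "command", "command": "./scripts/example-hook.sh"}]}
    ],
    "PostToolUse": [
      {"matcher": "Bash", "hooks": [{"type": "command", "command": "./scripts/other-hook.sh"}]}
    ]
  }
}
""")

for event, command in find_unscoped_hooks(example):
    print(f"{event}: {command} has an empty matcher")
```

Here only the first entry is reported, since the second one restricts itself with a non-empty matcher.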

Vague Triggers

Medium

Confidence: 95%
Finding: The user-level configuration enables the hook globally for all projects with no trigger constraints, expanding the blast radius from one repository to every session. If the script path is modified, replaced, or behaves unexpectedly, the effect persists across unrelated work and may expose sensitive prompts or influence agent behavior system-wide.

Vague Triggers

Medium

Confidence: 91%
Finding: The 'minimal setup' reduces the number of hooks but still uses an empty matcher, so the command executes on every prompt. This preserves the core risk of overly broad automation while presenting the setup as lower-overhead, which may mislead users into thinking it is also lower-risk.

Vague Triggers

Medium

Confidence: 92%
Finding: The Codex example mirrors the same broad pattern by using an empty matcher that runs on every prompt. Because this is documentation meant for adoption, it propagates an unsafe default across another tool ecosystem and encourages indiscriminate command execution tied to all user interactions.

Missing User Warnings

Medium

Confidence: 91%
Finding: The guidance tells users to log learnings into persistent workspace files but does not clearly warn against storing secrets, user-specific data, raw command output, or sensitive context. In a self-improvement skill, this creates a realistic risk of long-term retention of confidential data that may later be re-injected into prompts or exposed across sessions.
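A simple guard is to screen each learning entry for secret-like strings before it is written to a persistent file. The following is a rough sketch under stated assumptions: the pattern set and the redact helper are hypothetical, the patterns are deliberately small, and a real deployment would likely need a dedicated secret scanner plus human review.

```python
import re

# Illustrative patterns only; this list is an assumption and far from exhaustive.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "credential_assignment": re.compile(
        r"(?i)\b(?:api[_-]?key|token|secret|password)\b\s*[:=]\s*\S+"
    ),
}


def redact(text: str) -> tuple[str, list[str]]:
    """Replace secret-like substrings and report which patterns matched."""
    hits = []
    for name, pattern in SECRET_PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED:{name}]", text)
        if count:
            hits.append(name)
    return text, hits


entry = "Deploy failed because API_KEY=abc123 was rejected"
clean, hits = redact(entry)
print(clean)  # Deploy failed because [REDACTED:credential_assignment] was rejected
print(hits)   # ['credential_assignment']
```

Redaction of this kind catches only obvious cases, so it complements rather than replaces a rule against logging raw command output or private transcripts.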

Missing User Warnings

Medium

Confidence: 95%
Finding: The promotion workflow encourages copying learnings into AGENTS.md, SOUL.md, and TOOLS.md, which are workspace files loaded into the prompt of future sessions. Without strict warnings and sanitization rules, sensitive information can become persistent prompt context, causing unintended disclosure, cross-task contamination, or propagation of private data to other agents and future runs.

VirusTotal

66/66 vendors reported this skill as clean.


Static analysis

No suspicious patterns detected.