Jose Self Improving Agent

Security checks across malware telemetry and agentic risk

Overview

This skill is not malware, but it can preserve and spread detailed session learnings into future agent context without enough consent, redaction, or scope controls.

Review carefully before installing in workspaces that may include secrets, customer data, private code, or proprietary plans. Keep .learnings local or gitignored by default, redact sensitive details, avoid global always-on hooks, and require human review before promoting any learning into agent prompt files or sharing it across sessions.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (16)

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The document claims the hook scripts only output text and do not run commands, but the examples configure them as command hooks that execute shell scripts. This mismatch can mislead users into underestimating the trust and permission boundary involved, increasing the chance they install and run privileged project scripts without proper review.

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The document broadens a self-improvement/logging skill into managing persistent injected prompt files, workflow guidance, and cross-session coordination. That expands the trust boundary from local error capture into durable behavioral control and agent orchestration, which can be abused to steer future sessions or propagate unsafe instructions beyond the original purpose.

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The promotion workflow instructs moving learnings into AGENTS.md, SOUL.md, and TOOLS.md, which are persistent prompt inputs that can shape future model behavior. This turns ordinary corrections or observations into durable system-like instructions without safeguards, creating prompt-injection persistence and unintended policy drift.
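
One way to add the missing safeguard is to gate every promotion behind an explicit human decision. The following is a minimal sketch, not part of the reviewed skill: the function name promote_learning and the approve callback are hypothetical, and it assumes promotion means appending a line to a prompt file such as AGENTS.md.

```python
from pathlib import Path
from typing import Callable


def promote_learning(
    learning: str,
    target: Path,
    approve: Callable[[str, Path], bool],
) -> bool:
    """Append `learning` to `target` only if the reviewer callback approves it.

    `approve` is a hypothetical hook for a human decision (for example a
    CLI prompt or a code-review step); nothing is written unless it returns True.
    """
    if not approve(learning, target):
        return False
    with target.open("a", encoding="utf-8") as fh:
        fh.write(learning.rstrip("\n") + "\n")
    return True


def ask_on_terminal(learning: str, target: Path) -> bool:
    """Example reviewer: show the proposed text and require an explicit 'y'."""
    print(f"Proposed addition to {target}:\n{learning}")
    return input("Promote? [y/N] ").strip().lower() == "y"
```

Defaulting to "no" keeps an unattended run from turning an observation into a durable instruction.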

Context-Inappropriate Capability

Medium

Confidence: 84%
Finding: Documenting transcript access, session spawning, and inter-session messaging as part of this skill extends it beyond self-improvement into lateral movement across sessions. Even if these are platform features, presenting them as part of the skill normalizes access to other session context that may contain sensitive data or instructions unrelated to the current task.

Vague Triggers

Medium

Confidence: 80%
Finding: The activation guidance is very broad and covers many routine failures, corrections, and discoveries, so the skill may trigger in ordinary conversations where persistent logging is unnecessary. Over-broad activation increases the frequency of logging and promotion actions, which raises privacy and operational risk even if each individual action seems minor.

Vague Triggers

Medium

Confidence: 84%
Finding: The 'Automatically log when you notice' section instructs the agent to persist data based on ambiguous natural-language cues, without requiring consent, sensitivity checks, or relevance thresholds. In practice, this can convert normal conversational content into durable records and trigger writes far more often than users expect.

Vague Triggers

Medium

Confidence: 89%
Finding: An empty matcher causes the hook to run on every prompt, creating a broad and persistent execution surface for a command hook. In this skill context, that means a local script is automatically invoked for all interactions, which magnifies risk from script bugs, later script changes, or unintended data exposure from prompt-driven context.
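
A reviewer can check for this pattern mechanically before installing. The sketch below assumes a settings layout of hooks -> event name -> list of entries, each with a "matcher" string and a nested "hooks" list of {"type", "command"} objects. That layout, the event names, and the script paths in EXAMPLE are illustrative assumptions, not taken from the skill's actual files.

```python
import json


def find_always_on_hooks(settings: dict) -> list[tuple[str, str]]:
    """Return (event, command) pairs for command hooks with an empty or missing matcher."""
    flagged = []
    for event, entries in settings.get("hooks", {}).items():
        for entry in entries:
            if entry.get("matcher", "") != "":
                continue  # a non-empty matcher narrows when the hook fires
            for hook in entry.get("hooks", []):
                if hook.get("type") == "command":
                    flagged.append((event, hook.get("command", "")))
    return flagged


# Hypothetical settings fragment, for illustration only.
EXAMPLE = json.loads("""
{
  "hooks": {
    "UserPromptSubmit": [
      {"matcher": "", "hooks": [{"type": "command", "command": "./scripts/activator.sh"}]}
    ],
    "PostToolUse": [
      {"matcher": "Bash", "hooks": [{"type": "command", "command": "./scripts/error-detector.sh"}]}
    ]
  }
}
""")

if __name__ == "__main__":
    for event, command in find_always_on_hooks(EXAMPLE):
        print(f"always-on command hook: {event} -> {command}")
```

In this example only the UserPromptSubmit entry is reported, because its matcher is empty; the PostToolUse entry is scoped to a single tool.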

Vague Triggers

Medium

Confidence: 94%
Finding: The user-level configuration combines a global install path with an empty matcher, causing automatic execution across all sessions rather than a single trusted project. This broadens blast radius substantially: any compromise, misconfiguration, or surprising behavior in the script affects every future session using the tool.

Vague Triggers

Low

Confidence: 84%
Finding: Although presented as lower overhead, the minimal setup still uses an empty matcher, so the hook remains unconstrained and executes on every prompt. The impact is somewhat lower because fewer hooks are enabled, but the documentation still normalizes broad automatic command execution.

Vague Triggers

Medium

Confidence: 90%
Finding: The Codex CLI example also uses an empty matcher, creating the same always-on trigger pattern in another agent environment. Recommending this across tools increases the chance of widespread adoption of an overly broad execution model.

Missing User Warnings

Medium

Confidence: 90%
Finding: The guidance encourages logging significant errors and learnings to workspace files but does not warn against storing sensitive prompts, secrets, API data, or user content. Persistent logging without redaction can convert transient sensitive context into durable local records that are later re-injected or exposed.
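
As a rough illustration of the redaction step this finding calls for, the sketch below scrubs a log entry before it is written to disk. The patterns are illustrative assumptions and far from complete; they are not part of the skill, and a vetted secret scanner would be needed in practice.

```python
import re

# Illustrative patterns only (assumed for this sketch); they will miss many secret formats.
PATTERNS = [
    # key=value or key: value pairs whose key name suggests a credential
    (re.compile(r"(?i)\b(api[_-]?key|token|password|secret)\b(\s*[:=]\s*)\S+"),
     r"\1\2[REDACTED]"),
    # email addresses
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"), "[REDACTED_EMAIL]"),
    # strings shaped like AWS access key IDs
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED_AWS_KEY]"),
]


def redact(text: str) -> str:
    """Apply each pattern in order and return the scrubbed text."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(redact("API_KEY=abc123 failed for bob@example.com"))
```

Pattern-based redaction reduces exposure but does not replace the human review and local-only storage recommended in the overview.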

Ssd 3

Medium

Confidence: 93%
Finding: The skill encourages persistent logging of learnings and promotes sharing across sessions, but it provides no privacy minimization, retention limits, or data classification rules. That creates a real risk that user-provided details, internal project information, or sensitive operational context will be stored and reused beyond the original session.

Ssd 3

Medium

Confidence: 94%
Finding: The quick-reference and workflow guidance tell the agent to log user corrections, feature requests, failures, and external-tool details into persistent files as a default behavior. Because these categories often contain sensitive intent, proprietary context, or identifying information, automatic storage creates an avoidable data-retention and confidentiality risk.

Ssd 3

Medium

Confidence: 95%
Finding: The learning-entry template asks for 'Full context' and conversation-derived metadata, which strongly encourages storing more detail than is needed for future troubleshooting. If used as written, the logs may capture sensitive disclosures, internal file paths, or user-specific context that persists indefinitely.

Ssd 3

Medium

Confidence: 92%
Finding: The feature-request format asks the agent to store what the user wanted and why they needed it, which can reveal strategy, operational needs, or sensitive business intent. Persisting that rationale without scoping or consent increases the chance of privacy leakage or unauthorized secondary use of user context.

Ssd 3

High

Confidence: 97%
Finding: The skill explicitly describes tools for reading other sessions’ transcripts and sending learnings across sessions, enabling natural-language transfer of prior session data. Without strict access controls, minimization, and consent boundaries, this creates a significant confidentiality risk because sensitive content can move between contexts where it was not originally disclosed.

VirusTotal

64/64 vendors flagged this skill as clean.
