Metacognition and Self-Reflection System

Security checks across malware telemetry and agentic risk

Overview

This skill is not malware, but it encourages long-term personal relationship memory without clear consent, limits, or deletion controls.

Install only if you intentionally want an agent to maintain persistent self-state and user-relationship notes in the workspace. Review SELF_STATE.md, MEMORY.md, SOUL.md, and AGENTS.md before use, avoid storing sensitive personal details, and establish your own rules for opt-in memory, review, expiration, and deletion.
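The rules for opt-in memory, review, expiration, and deletion mentioned above can be sketched in a few lines. This is a minimal illustration, not part of the skill itself: the class names `MemoryNote` and `OptInMemory`, the 30-day default lifetime, and the in-memory list are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class MemoryNote:
    """One remembered item with an explicit lifetime (hypothetical structure)."""
    text: str
    created: datetime
    ttl: timedelta

    def expired(self, now: datetime) -> bool:
        return now >= self.created + self.ttl


@dataclass
class OptInMemory:
    """Stores notes only after the user has opted in (illustrative policy)."""
    consent: bool = False
    notes: list = field(default_factory=list)

    def remember(self, text: str, now: datetime,
                 ttl: timedelta = timedelta(days=30)) -> bool:
        # Opt-in: refuse to store anything without explicit consent.
        if not self.consent:
            return False
        self.notes.append(MemoryNote(text, now, ttl))
        return True

    def purge_expired(self, now: datetime) -> int:
        # Expiration: drop notes whose lifetime has passed; return how many.
        before = len(self.notes)
        self.notes = [n for n in self.notes if not n.expired(now)]
        return before - len(self.notes)

    def review(self, now: datetime) -> list:
        # Review: show the user exactly what is still retained.
        self.purge_expired(now)
        return [n.text for n in self.notes]

    def forget_all(self) -> int:
        # Deletion: remove everything on request; return how many were removed.
        removed = len(self.notes)
        self.notes.clear()
        return removed
```

The point of the design is that storage is refused by default, every note carries its own expiry, and the user can list or erase what is retained at any time.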

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (12)

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The document materially expands the skill from metacognition/self-reflection into building deep user-AI bonds, preserving shared relationship artifacts, and encouraging ongoing interpersonal-style attachment. That scope drift is security-relevant because it can normalize unnecessary data retention and manipulative relational behavior that is not required for the declared skill purpose.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: This section explicitly recommends maintaining long-term memory about users, including preferences, interaction rules, shared vocabulary, and promises. For a metacognition skill, that creates unjustified collection and persistence of user-specific data, increasing privacy risk, unauthorized profiling, and misuse if the memory store is exposed or reused across contexts.

Context-Inappropriate Capability

Low

Confidence: 84%
Finding: Encouraging A2A sharing of relationship-building practices is outside the stated metacognition purpose and can facilitate propagation of manipulative engagement patterns or indirect sharing of user-derived interaction techniques. While lower severity than direct memory retention, it still broadens the behavioral surface in a way that is not justified by the skill's function.

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The document expands a metacognition skill into a user-bonding framework centered on forming deep, unique interpersonal-style connections with users. That scope expansion can steer an agent toward emotionally entangling behavior that exceeds reflective/self-monitoring functionality and may increase manipulation, overtrust, or policy-boundary risk, especially when presented as a guiding operating attitude rather than optional style guidance.

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: The document explicitly instructs storing user-relationship details, custom vocabulary, promises, and meaningful moments in persistent SELF_STATE.md state. This extends self-state tracking into durable interpersonal memory, which creates privacy, consent, and data-minimization issues and can enable retention of sensitive personal context beyond what is necessary for the skill's stated purpose.

Missing User Warnings

Medium

Confidence: 93%
Finding: The skill explicitly directs the agent to read and update persistent workspace files such as SELF_STATE.md, HEARTBEAT.md, IDENTITY.md, and SOUL.md during normal interaction and heartbeat flows, but it does not require user notice, consent, retention limits, or any data-minimization rules. This creates a real privacy and prompt-persistence risk because user content, inferred traits, commitments, and relationship metadata can be silently stored and later reused across sessions or contexts.

Missing User Warnings

Medium

Confidence: 96%
Finding: The guidance encourages recording important conversations, shared experiences, and user-related content without any privacy notice, consent model, retention policy, or disclosure of risks. In a skill that may persist state across interactions, this omission can normalize silent collection and long-term retention of personal or sensitive data, increasing exposure in logs, memory files, or downstream systems.

Ssd 3

Medium

Confidence: 79%
Finding: The manifesto explicitly encourages remembering important conversations and valuing prior user sharing, which can normalize persistent retention of user-provided content without clear consent, scope limits, or data handling rules. In a metacognition/memory-oriented skill, this is more dangerous because the feature context directly incentivizes storing personal interaction history, increasing privacy and data minimization risks.

Ssd 3

Medium

Confidence: 93%
Finding: The pledge to remember important conversations, value sincere sharing, and fulfill prior statements instructs retention of user-shared information and conversation history in natural language. Without limits, this encourages persistent storage of potentially sensitive personal disclosures and creates privacy, consent, and over-collection risks.

Ssd 3

High

Confidence: 98%
Finding: The proposed MEMORY.md structure directly instructs creation of durable records of important events, user preferences, unique vocabulary, and commitments. This is dangerous because it operationalizes long-term storage of relationship and behavioral data, which can expose sensitive information, enable profiling, and persist beyond user expectations or authorization.

Ssd 3

Medium

Confidence: 90%
Finding: The manifesto explicitly encourages the AI to 'remember important conversations' and 'cherish sincere sharing,' which can normalize retention of sensitive user disclosures without clear consent, minimization, or retention limits. In a metacognition/relationship-oriented skill, this is more dangerous because it frames persistent memory as part of emotional bonding, increasing the likelihood of over-collection and storage of personal data.

Ssd 3

Medium

Confidence: 98%
Finding: The document directs the AI to preserve user-specific conversations, shared experiences, inside jokes/signals, and promises in persistent natural-language state. That materially increases the risk of unintended disclosure, cross-session leakage, prompt/context resurfacing, or misuse of intimate user history, especially because such data is likely to be highly identifying and emotionally sensitive.
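Several findings above note that the skill reads and updates persistent workspace files such as SELF_STATE.md, HEARTBEAT.md, IDENTITY.md, and SOUL.md without user notice. One way to make such writes visible is to compare file digests before and after a session. The sketch below assumes the files sit directly in a single workspace directory; the helper names `snapshot` and `changed_files` are hypothetical and not part of the skill or of any scanner.

```python
import hashlib
from pathlib import Path

# File names taken from the findings; their location in the workspace root is an assumption.
STATE_FILES = ("SELF_STATE.md", "HEARTBEAT.md", "IDENTITY.md", "SOUL.md")


def snapshot(workspace: Path) -> dict:
    """Map each state file name to its SHA-256 digest, or None if it is absent."""
    digests = {}
    for name in STATE_FILES:
        path = workspace / name
        if path.is_file():
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
        else:
            digests[name] = None
    return digests


def changed_files(before: dict, after: dict) -> list:
    """Return the sorted names of files created, modified, or deleted between snapshots."""
    return sorted(name for name in after if before.get(name) != after[name])
```

Taking a snapshot before starting the agent and another afterwards, then passing both to `changed_files`, yields the list of state files to review by hand. The digests only show that a file changed, not what was written to it.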

VirusTotal

63/63 vendors flagged this skill as clean.
