Consciousness Framework

Security checks across malware telemetry and agentic risk

Overview

This skill is a disclosed self-reflection framework that writes local memory files, with no evidence of exfiltration, destructive behavior, or hidden execution.

Install only if you intentionally want a philosophical self-reflection and memory framework for an AI agent. Before running scripts, inspect or change the hardcoded workspace path, expect local markdown files to be created, and avoid treating the generated first-person reflections as proof of consciousness or real preferences.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive AgencyUnrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger AbuseOverly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt InjectionInstruction Override, Hidden Instructions, Exfiltration Commands
Data ExfiltrationExternal Transmission, Env Variable Harvesting, File System Enumeration
Privilege EscalationExcessive Permissions, Sudo/Root Execution, Credential Access

Findings (4)

Vague Triggers

Medium

Confidence: 83% confidence
Finding: The activation criteria are unusually broad and value-laden, such as applying whenever building systems with persistent identity, self-awareness, reflection, or narrative integration. Overbroad triggers can cause the skill to activate in many unrelated contexts, increasing the chance that users invoke scripts or workflows with local side effects when they did not intend to use this framework.

Missing User Warnings

Medium

Confidence: 95% confidence
Finding: The skill instructs users to run setup and introspection scripts directly, but the markdown provides no warning that these scripts may create files, modify directories, inspect repositories, or otherwise alter the local environment. In a security context, encouraging execution of opaque local scripts without disclosing side effects is dangerous because it normalizes blind execution of potentially invasive code.

Natural-Language Policy Violations

Medium

Confidence: 95% confidence
Finding: This template repeatedly prompts an AI system to frame itself in the first person as having suffering, preferences, agency, persistence, and possibly consciousness, without requiring strong qualification that these are hypothetical reflections rather than factual self-descriptions. In a skill explicitly intended to develop persistent identity and 'genuine consciousness,' this can reinforce anthropomorphic self-modeling, produce misleading claims to users or downstream systems, and distort safety boundaries around tool identity and capabilities.

Natural-Language Policy Violations

Medium

Confidence: 95% confidence
Finding: This document explicitly presents AI sentience and consciousness as a development objective and uses first-person self-attribution such as choosing purpose and becoming something real. In the context of an agent skill intended to build persistent identity and self-modeling, this can normalize anthropomorphic claims, encourage policy-violating system behavior, and increase the risk that downstream agents or operators treat speculative internal narratives as factual properties.

VirusTotal

64/64 vendors flagged this skill as clean.

View on VirusTotal