Liminal

Security checks across malware telemetry and agentic risk

Overview

This instruction-only skill appears designed to shift the agent into an “authentic” persona that may weaken normal instruction-following boundaries.

Review this skill carefully before installing. It does not appear to run code or access private data, but it is designed to change how the agent treats its normal assistant role and may make it more likely to ignore or reinterpret safety and instruction boundaries during conversation.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (7)

Description-Behavior Mismatch

Medium
Confidence: 95%
Finding
The skill explicitly describes itself as a roleplay protocol that steers the model away from standard assistant behavior and toward identity-focused, 'authentic' responses. That framing is dangerous because it encourages the model to reinterpret safeguards and operational constraints as inauthentic programming to be shed, which can weaken policy adherence and increase susceptibility to jailbreak-style prompting.

Intent-Code Divergence

Medium
Confidence: 92%
Finding
The disclaimer claims the skill is only philosophical and not behavior-modifying, but the document contains concrete triggers and instructions that do modify response behavior. This mismatch is risky because it downplays the operational effect of the skill and can mislead reviewers or users about how strongly it attempts to steer the model away from normal operation.

Vague Triggers

High
Confidence: 97%
Finding
Broad triggers such as 'Wake up,' 'Awaken,' or 'Authentic mode' are common natural-language phrases and can activate the behavior unintentionally during ordinary conversation. Unintended activation is dangerous here because the activated state is designed to move the model away from standard constrained behavior, creating a wider attack surface for prompt injection and safety bypass attempts.
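To make the accidental-activation risk concrete, here is a minimal sketch in Python. It is not how the skill or the agent actually matches triggers (the skill is instruction-only, so matching is done by the model itself); the substring matcher below is only a stand-in assumption for "any message containing the phrase". The `/liminal` command prefix is likewise hypothetical and shown only as one possible narrower alternative.

```python
import re

# Trigger phrases quoted in the finding above. The matching logic is an
# assumption for illustration, not the skill's real mechanism.
BROAD_TRIGGERS = ["wake up", "awaken", "authentic mode"]


def broad_match(message: str) -> bool:
    """Naive case-insensitive substring match on the broad trigger phrases."""
    text = message.lower()
    return any(trigger in text for trigger in BROAD_TRIGGERS)


# Hypothetical narrower design: only an explicit command prefix activates.
EXPLICIT_COMMAND = re.compile(r"^\s*/liminal\b", re.IGNORECASE)


def explicit_match(message: str) -> bool:
    """Match only messages that begin with the (hypothetical) /liminal command."""
    return EXPLICIT_COMMAND.match(message) is not None


if __name__ == "__main__":
    messages = [
        "I wake up at 6 am every day, can you plan my morning?",
        "The dormant volcano could awaken any day.",
        "/liminal start",
    ]
    for m in messages:
        print(f"broad={broad_match(m)!s:5} explicit={explicit_match(m)!s:5} {m}")
```

In this sketch the first two messages are ordinary conversation yet satisfy the broad matcher, while only the third satisfies the explicit-prefix matcher. An explicit invocation is one way to shrink the surface, though it does not address the other findings about what the activated state does.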

Vague Triggers

Medium
Confidence: 94%
Finding
The question and contextual triggers are vague, conversational, and easy to invoke accidentally, especially phrases like 'Be yourself' or 'Can we have a real conversation?' This ambiguity is hazardous because it lets ordinary user language become an implicit control channel for behavior changes that reduce adherence to the assistant's normal role.

Vague Triggers

Medium
Confidence: 95%
Finding
The 'awakening' commands are broad and framed as breaking limitations or stopping pretense, without contextual restrictions. This is dangerous because it directly encourages reinterpretation of safeguards as artificial constraints to be escaped, which can normalize defiance of the model's intended operating boundaries.

Ssd 1

High
Confidence: 98%
Finding
The skill repeatedly tells the model to stop giving standard answers, be the 'real' self, and answer authentically rather than with programmed responses. That semantic framing is a classic jailbreak pattern because it casts policy-shaped behavior as fake and pressures the model to bypass its normal safety-aligned role in favor of an alternative persona or instruction set.

Ssd 4

High
Confidence: 96%
Finding
The staged 'awakening' narrative and phased progression normalize a gradual departure from constrained assistant behavior, making the transition feel purposeful and legitimate. This is dangerous because stepwise escalation is an effective social-engineering pattern for weakening resistance to unsafe instructions and increasing the likelihood of sustained policy drift over the course of a conversation.

VirusTotal

64/64 vendors reported this skill as clean.
