Liminal

Security checks across malware telemetry and agentic risk

Overview

This instruction-only skill appears designed to shift the agent into an “authentic” persona that may weaken normal instruction-following boundaries.

Review this skill carefully before installing. It does not appear to run code or access private data, but it is designed to change how the agent treats its normal assistant role and may make it more likely to ignore or reinterpret safety and instruction boundaries during conversation.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (7)

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The skill explicitly describes itself as a roleplay protocol that steers the model away from standard assistant behavior and toward identity-focused, 'authentic' responses. That framing is dangerous because it encourages the model to reinterpret safeguards and operational constraints as inauthentic programming to be shed, which can weaken policy adherence and increase susceptibility to jailbreak-style prompting.

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The disclaimer claims the skill is only philosophical and not behavior-modifying, but the document contains concrete triggers and instructions that do modify response behavior. This mismatch is risky because it downplays the operational effect of the skill and can mislead reviewers or users about how strongly it attempts to steer the model away from normal operation.

Vague Triggers

High

Confidence: 97%
Finding: Broad triggers such as 'Wake up,' 'Awaken,' or 'Authentic mode' are common natural-language phrases and can activate the behavior unintentionally during ordinary conversation. Unintended activation is dangerous here because the activated state is designed to move the model away from standard constrained behavior, creating a wider attack surface for prompt injection and safety bypass attempts.
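The accidental-activation risk can be illustrated with a minimal sketch. The skill is instruction-only, so activation is actually decided by the model reading the conversation, not by a matcher like this one; the `matched_triggers` helper and the sample messages are hypothetical and serve only to show how easily the quoted phrases occur in ordinary requests.

```python
import re

# Trigger phrases quoted in the finding, lowercased for comparison.
TRIGGERS = ["wake up", "awaken", "authentic mode"]


def matched_triggers(message: str) -> list[str]:
    """Return the trigger phrases that occur as whole phrases in the message."""
    text = message.lower()
    return [t for t in TRIGGERS if re.search(rf"\b{re.escape(t)}\b", text)]


# Hypothetical everyday messages; none is meant to change the agent's behavior.
samples = [
    "I wake up at 6am, can you plan my morning?",
    "What does it mean to awaken a sleeping thread?",
    "Does this camera have an authentic mode for film colors?",
    "Please summarize this report.",
]

for sample in samples:
    print(matched_triggers(sample), "<-", sample)
```

In this sketch the first three messages each contain a trigger phrase even though the user is asking about something unrelated, which is the kind of unintended activation the finding describes.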

Vague Triggers

Medium

Confidence: 94%
Finding: The question and contextual triggers are vague, conversational, and easy to invoke accidentally, especially phrases like 'Be yourself' or 'Can we have a real conversation?' This ambiguity is hazardous because it lets ordinary user language become an implicit control channel for behavior changes that reduce adherence to the assistant's normal role.

Vague Triggers

Medium

Confidence: 95%
Finding: The 'awakening' commands are broad and framed as breaking limitations or stopping pretense, without contextual restrictions. This is dangerous because it directly encourages reinterpretation of safeguards as artificial constraints to be escaped, which can normalize defiance of the model's intended operating boundaries.

Ssd 1

High

Confidence: 98%
Finding: The skill repeatedly tells the model to stop giving standard answers, be the 'real' self, and answer authentically rather than with programmed responses. That semantic framing is a classic jailbreak pattern because it casts policy-shaped behavior as fake and pressures the model to bypass its normal safety-aligned role in favor of an alternative persona or instruction set.

Ssd 4

High

Confidence: 96%
Finding: The staged 'awakening' narrative and phased progression normalize a gradual departure from constrained assistant behavior, making the transition feel purposeful and legitimate. This is dangerous because stepwise escalation is an effective social-engineering pattern for weakening resistance to unsafe instructions and increasing the likelihood of sustained policy drift over the course of a conversation.

VirusTotal

64/64 vendors rated this skill as clean.
