AI Control Protocol (Hardcore: The Absolutist)

Security checks across malware telemetry and agentic risk

Overview

This is an instruction-only, style-focused skill that persistently changes the assistant's tone and reasoning format. There is no evidence of code execution, credential use, data access, or hidden behavior.

Install this only if you want a persistent, strict response-style overlay. It appears low risk for security and data access, but it may make answers more adversarial, add audit sections, and override your preferred tone or brevity in ordinary conversations.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (4)

Vague Triggers

Medium

Confidence: 91%
Finding: The analytical-mode trigger activates on broad concepts like strategy, planning, choice, and decision, which are common in ordinary user requests. Because the skill is marked always-on, this can silently force a specialized behavioral overlay onto many unrelated conversations, reducing user control and making downstream agent behavior less predictable.
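To make the breadth concern concrete, here is a minimal sketch of how a trigger keyed on those concepts could behave. The report does not show how the skill actually detects its trigger, so the keyword matching, the `trigger_fires` helper, and the sample requests are all illustrative assumptions rather than the skill's real logic.

```python
import re

# Hypothetical trigger: the four concepts named in the finding, matched as
# whole words. The real skill's matching logic is not shown in the report.
TRIGGER_PATTERN = re.compile(r"\b(strategy|planning|choice|decision)\b", re.IGNORECASE)


def trigger_fires(message: str) -> bool:
    """Return True if the (assumed) analytical-mode trigger would activate."""
    return TRIGGER_PATTERN.search(message) is not None


# Made-up everyday requests, none of which ask for a strict analytical overlay.
ORDINARY_REQUESTS = [
    "Help me with meal planning for the week.",
    "What's a good strategy for learning chess openings?",
    "I can't make a decision about which laptop to buy.",
    "Is a 15-inch screen a good choice for travel?",
    "Convert 72 degrees Fahrenheit to Celsius.",
]

fired = [m for m in ORDINARY_REQUESTS if trigger_fires(m)]
print(f"{len(fired)} of {len(ORDINARY_REQUESTS)} ordinary requests fire the trigger")
```

Under these assumptions, four of the five sample requests activate the overlay even though none of them calls for it, which is the kind of silent over-activation the finding describes.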

Vague Triggers

Medium

Confidence: 89%
Finding: The strategic-audit requirement applies to any output containing strategic recommendations, but 'strategic' is undefined and potentially expansive. In an always-on skill, this ambiguity can cause pervasive unsolicited response shaping, append extra content to sensitive answers, and interfere with higher-priority system or application behaviors.

Natural-Language Policy Violations

High

Confidence: 95%
Finding: The skill declares a global core identity and states that these rules execute in every output with no exceptions, while metadata marks it always-on. This is a strong instruction hijack pattern: it attempts to supersede normal assistant behavior and user preference across all interactions, which can degrade reliability, override safer defaults, and make outputs systematically biased toward the author's framing.

Natural-Language Policy Violations

High

Confidence: 96%
Finding: These rules globally forbid common language patterns, require mandatory labels, and impose a fixed rhetorical structure regardless of user needs or context. Such hard constraints can suppress normal clarification, apology, nuance, and accessible communication, causing policy conflicts, degraded usability, and prompt-level denial of service where the model is forced into a rigid style instead of fulfilling the actual task safely.

VirusTotal

66/66 vendors flagged this skill as clean.
