Structured Falsification

Security checks across malware telemetry and agentic risk

Overview

This is a markdown-only decision-analysis skill that may change response style broadly, but it does not run code, access data, use credentials, or persist behavior.

Install this only if you want the assistant to automatically use a concise falsification-style table for complex decisions. For financial, business, or architecture topics, ask for evidence, assumptions, caveats, and your preferred language when the default concise Chinese-formatted output is not enough.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (3)

Vague Triggers

High

Confidence: 94%
Finding: The auto-trigger scope is broad enough to activate on many normal, ambiguous requests, which can silently override expected agent behavior and insert a hidden reasoning/output policy. In a decision-support skill, unexpected activation can bias responses, suppress normal transparency, and cause the model to follow this skill even when the user did not intend to invoke it.

Vague Triggers

High

Confidence: 96%
Finding: Triggers like 'should I', 'which one', 'evaluate', and 'compare' are common across everyday conversations, so this skill may activate far outside its intended scope. Because the skill then imposes a restrictive output style and internal process, it can hijack routine interactions and alter agent behavior without clear user consent.
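
To make the breadth concern concrete, here is a minimal sketch that checks how often the four quoted trigger phrases fire on ordinary messages. The matching rule (case-insensitive, whole-word) and the sample messages are assumptions for illustration; the skill's actual trigger logic is not shown in this report.

```python
import re

# Trigger phrases quoted in the finding above.
TRIGGERS = ["should I", "which one", "evaluate", "compare"]


def matched_triggers(message: str) -> list:
    """Return the trigger phrases found in message.

    Assumption: matching is case-insensitive on whole words. The real
    skill may match differently.
    """
    hits = []
    for phrase in TRIGGERS:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, message, flags=re.IGNORECASE):
            hits.append(phrase)
    return hits


# Hypothetical everyday messages; none asks for a structured decision analysis.
EVERYDAY = [
    "Should I bring an umbrella today?",
    "Which one of these is the typo?",
    "Can you compare these two strings in Python?",
    "What's the capital of France?",
]


def activation_rate(messages: list) -> float:
    """Fraction of messages on which at least one trigger phrase fires."""
    fired = sum(1 for m in messages if matched_triggers(m))
    return fired / len(messages)


if __name__ == "__main__":
    for m in EVERYDAY:
        print(m, "->", matched_triggers(m))
    print("activation rate:", activation_rate(EVERYDAY))
```

On this small hand-picked sample, three of the four messages would activate the skill even though none is a complex decision. A narrower design might require an explicit invocation or a combination of signals rather than a single common phrase.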

Natural-Language Policy Violations

Medium

Confidence: 88%
Finding: Forcing a Chinese-only output format without checking user language preference can cause confusion, reduce usability, and create a form of instruction override unrelated to the user's request. In an agent skill, hardcoded locale constraints are risky because they can unexpectedly replace the agent's normal language and formatting behavior, especially when auto-triggering is already broad.
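
One way to avoid a hardcoded locale is to check the user's language before applying the Chinese format. The sketch below is an assumption-laden illustration, not the skill's behavior: the function name, the `preferred` parameter, and the 50% threshold are all hypothetical, and the character-range test is a crude heuristic rather than real language detection.

```python
from typing import Optional


def should_use_chinese(message: str, preferred: Optional[str] = None) -> bool:
    """Decide whether Chinese-formatted output is appropriate.

    An explicit user preference (hypothetical `preferred` code such as
    "zh" or "en") always wins. Otherwise, fall back to a rough heuristic:
    use Chinese only if most letters in the message are CJK ideographs.
    """
    if preferred is not None:
        return preferred.lower().startswith("zh")

    letters = [ch for ch in message if ch.isalpha()]
    if not letters:
        # Nothing to infer from; do not impose a language.
        return False

    # CJK Unified Ideographs block (basic range only; an approximation).
    cjk = sum(1 for ch in letters if "\u4e00" <= ch <= "\u9fff")
    return cjk / len(letters) > 0.5
```

With this check, an English question keeps the agent's normal language, while a message written in Chinese or an explicit `preferred="zh"` opts in to the Chinese format.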

VirusTotal

66 of 66 vendors reported this skill as clean.
