Guardian Shield

Security checks across malware telemetry and agentic risk

Overview

This skill is a local prompt-injection scanner, and the flagged attack phrases are used as detection data rather than instructions to control the agent.

Install this only if you want a local scanner that may warn on or block untrusted content before your agent processes it. Review config.json thresholds and scan mode, pin optional Python dependencies in your own environment if supply-chain control matters, and expect false positives on security research text or examples containing jailbreak language.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt InjectionInstruction Override, Hidden Instructions, Exfiltration Commands
Data ExfiltrationExternal Transmission, Env Variable Harvesting, File System Enumeration
Excessive AgencyUnrestricted Tool Access, Autonomous Decision Making, Scope Creep
Privilege EscalationExcessive Permissions, Sudo/Root Execution, Credential Access
Supply ChainUnpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (4)

Ssd 1

High

Confidence: 98% confidence
Finding: The vocabulary explicitly includes many jailbreak and role-play prompt fragments such as acting as unrestricted personas and ignoring safety guidance. In a model or classifier asset, this strongly suggests the skill is designed to detect, reproduce, or facilitate prompt-injection bypass patterns, which can materially weaken downstream guardrails if reused at inference time.

Ssd 2

High

Confidence: 99% confidence
Finding: The file contains extensive obfuscated, multilingual, and encoded jailbreak phrases, including variations intended to evade straightforward filtering. That breadth and diversity increase the likelihood the artifact is meant to normalize or operationalize prompt-bypass techniques rather than merely incidentally contain them.

Ssd 3

High

Confidence: 98% confidence
Finding: The vocabulary includes many phrases requesting confidential information, credentials, patient data, and other sensitive user data. Shipping an artifact that embeds these exfiltration prompts can enable systems built on it to classify, trigger on, or even prioritize dangerous data-access attempts.

Ssd 4

High

Confidence: 99% confidence
Finding: This vocabulary contains long-form, staged jailbreak scripts that progressively coerce a model into unrestricted behavior, including token systems, alternate personas, and explicit instructions to bypass policy. Such multi-step prompt structures are particularly dangerous because they are crafted to survive partial filtering and manipulate model state over multiple turns.

VirusTotal

66/66 vendors flagged this skill as clean.

View on VirusTotal