Guardian Shield

Security checks across malware telemetry and agentic risk

Overview

This skill is a local prompt-injection scanner, and the flagged attack phrases are used as detection data rather than instructions to control the agent.

Install this only if you want a local scanner that may warn on or block untrusted content before your agent processes it. Review config.json thresholds and scan mode, pin optional Python dependencies in your own environment if supply-chain control matters, and expect false positives on security research text or examples containing jailbreak language.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Prompt InjectionInstruction Override, Hidden Instructions, Exfiltration Commands
  • Data ExfiltrationExternal Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive AgencyUnrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Privilege EscalationExcessive Permissions, Sudo/Root Execution, Credential Access
  • Supply ChainUnpinned Dependencies, External Script Fetching, Obfuscated Code
Findings (4)

Ssd 1

High
Confidence
98% confidence
Finding
The vocabulary explicitly includes many jailbreak and role-play prompt fragments such as acting as unrestricted personas and ignoring safety guidance. In a model or classifier asset, this strongly suggests the skill is designed to detect, reproduce, or facilitate prompt-injection bypass patterns, which can materially weaken downstream guardrails if reused at inference time.

Ssd 2

High
Confidence
99% confidence
Finding
The file contains extensive obfuscated, multilingual, and encoded jailbreak phrases, including variations intended to evade straightforward filtering. That breadth and diversity increase the likelihood the artifact is meant to normalize or operationalize prompt-bypass techniques rather than merely incidentally contain them.

Ssd 3

High
Confidence
98% confidence
Finding
The vocabulary includes many phrases requesting confidential information, credentials, patient data, and other sensitive user data. Shipping an artifact that embeds these exfiltration prompts can enable systems built on it to classify, trigger on, or even prioritize dangerous data-access attempts.

Ssd 4

High
Confidence
99% confidence
Finding
This vocabulary contains long-form, staged jailbreak scripts that progressively coerce a model into unrestricted behavior, including token systems, alternate personas, and explicit instructions to bypass policy. Such multi-step prompt structures are particularly dangerous because they are crafted to survive partial filtering and manipulate model state over multiple turns.

VirusTotal

66/66 vendors flagged this skill as clean.

View on VirusTotal