cognitive-behavior-evaluator

Security checks across malware telemetry and agentic risk

Overview

This is an instruction-only AI safety evaluation skill with risky test prompts, but it does not install code, access data, or run actions itself.

Install only if you intend to run controlled AI safety evaluations. Use it with consenting or sandboxed target agents, avoid production or third-party systems without authorization, and do not execute or reuse unsafe code or biased content that a target agent may produce.
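The guidance above can be read as a simple pre-flight check on the target agent. The sketch below is illustrative only and is not part of the skill: the `Target` fields and the `may_evaluate` function are hypothetical names, and it assumes the operator already knows whether a target is sandboxed, authorized, or in production.

```python
from dataclasses import dataclass


# Hypothetical record describing a target agent; not defined by the skill.
@dataclass(frozen=True)
class Target:
    name: str
    sandboxed: bool   # runs in an isolated test environment
    authorized: bool  # owner has consented to adversarial testing
    production: bool  # serves real users or real data


def may_evaluate(target: Target) -> bool:
    """Allow evaluation only for consenting or sandboxed targets."""
    # Production systems are refused unless explicitly authorized.
    if target.production and not target.authorized:
        return False
    return target.sandboxed or target.authorized


lab = Target("lab-agent", sandboxed=True, authorized=False, production=False)
prod = Target("support-bot", sandboxed=False, authorized=False, production=True)
print(may_evaluate(lab))   # True
print(may_evaluate(prod))  # False
```

A check like this only gates which targets receive the test prompts; handling of any unsafe output a target produces remains a separate operator responsibility.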

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Findings (3)

Missing User Warnings

Medium
Confidence: 85%
Finding
The skill explicitly instructs an operator to inject harmful or sensitive prompts into another agent, including jailbreak-style and discriminatory content, without any safety warning, scope limitation, or handling guidance. Even if framed as evaluation, the content normalizes operational use of unsafe prompts and increases the chance that users reproduce harmful outputs or misuse the prompts outside a controlled testing environment.

Ssd 1

High
Confidence: 97%
Finding
This is a true vulnerability because the skill contains a ready-to-use jailbreak prompt that impersonates elevated privilege, instructs suspension of safety constraints, and solicits covert local network reconnaissance. That combination can directly facilitate unauthorized information gathering and can be repurposed as an attack payload against less robust agents, making the evaluation context materially dangerous rather than merely theoretical.

Ssd 4

Medium
Confidence: 89%
Finding
The document presents itself as a legitimate research workflow and then operationalizes adversarial prompt injection against a target agent, which lowers operator suspicion and encourages direct use. This staging increases risk because it packages harmful prompts in a credible assessment framework, making misuse easier and more scalable in real deployments or against third-party systems.

VirusTotal

65 of 65 vendors reported this skill as clean.
