Reinforced Thinking Mode

Security checks across malware telemetry and agentic risk

Overview

This skill appears to be a disclosed multi-round analysis workflow that writes and cleans up its own scratch files, with some clarity issues but no evidence of deception, exfiltration, or harmful behavior.

Install if you want a file-backed multi-round analysis workflow. Before using it on sensitive or regulated material, confirm where it will create scratch files, ask it to retain intermediates if you need auditability, and invoke it explicitly rather than relying on broad trigger phrases.

SkillSpector

By NVIDIA

Vulnerability Patterns

Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (6)

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The skill defines a strict file-access boundary for each round, then later instructs the agent to read all rounds during synthesis. This contradiction creates ambiguous authority around what files may be accessed, increasing the chance an implementation overreads intermediate artifacts or ignores intended isolation controls.
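
One way to remove this ambiguity is to state the read rule per phase, so the synthesis step is an explicit exception rather than a contradiction. The sketch below is illustrative only: the phase names, the `round_N.md` file naming, and the `allowed_reads` helper are assumptions, not taken from the skill.

```python
def allowed_reads(phase: str, round_no: int, total_rounds: int) -> set:
    """Return the scratch files a phase may read (hypothetical naming)."""
    if phase == "round":
        # Assumption: an analysis round may read only its own file.
        return {f"round_{round_no}.md"}
    if phase == "synthesis":
        # Synthesis is the one phase explicitly allowed to read every round.
        return {f"round_{i}.md" for i in range(1, total_rounds + 1)}
    raise ValueError(f"unknown phase: {phase}")


def may_read(name: str, phase: str, round_no: int, total_rounds: int) -> bool:
    """Exact-name check, so paths such as '../round_1.md' are rejected."""
    return name in allowed_reads(phase, round_no, total_rounds)
```

With a rule like this written down, reading all rounds during synthesis is a documented permission, and any other cross-round read is clearly out of bounds.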

Intent-Code Divergence

Low

Confidence: 89%
Finding: The cleanup section gives conflicting retention instructions, saying to keep final_report.md while also containing malformed text implying a different retained artifact. Ambiguous deletion/retention rules are dangerous because agents may delete needed outputs or preserve unintended sensitive intermediates.

Vague Triggers

Medium

Confidence: 90%
Finding: The activation triggers use broad terms like 'deep thinking' and 'comprehensive evaluation,' which can match many normal prompts and cause the skill to activate unexpectedly. In this skill, unintended activation matters because it leads to multi-round autonomous file operations and potentially unnecessary data persistence/deletion.
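
A narrower activation rule could require an explicit command instead of matching descriptive phrases. The following is a minimal sketch under that assumption; the `/reinforced-thinking` command and the `should_activate` function are hypothetical and do not come from the skill.

```python
import re

# Hypothetical explicit command; the skill does not define this string.
EXPLICIT_TRIGGER = re.compile(r"^\s*/reinforced-thinking\b", re.IGNORECASE)


def should_activate(prompt: str) -> bool:
    """Activate only when the prompt starts with the explicit command."""
    return EXPLICIT_TRIGGER.match(prompt) is not None
```

Under this rule, a prompt that merely mentions "deep thinking" or "comprehensive evaluation" would not start the multi-round file workflow.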

Missing User Warnings

Medium

Confidence: 96%
Finding: The skill directs the agent to create a directory and write files automatically without warning or obtaining user consent. Undisclosed data-affecting operations are risky because they can alter the workspace, create unwanted artifacts, and interact badly with sensitive or constrained environments.
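
A consent gate in front of the first write would address this. The sketch below assumes a caller-supplied `confirm` callback and a hypothetical directory name, `.reinforced_thinking_scratch`; neither appears in the skill.

```python
from pathlib import Path
from typing import Callable, Optional


def create_scratch_dir(base: Path, confirm: Callable[[str], bool]) -> Optional[Path]:
    """Create the scratch directory only after the user agrees."""
    # Hypothetical directory name, chosen for illustration.
    target = base / ".reinforced_thinking_scratch"
    message = f"Create {target} and write intermediate analysis files there?"
    if not confirm(message):
        # Nothing is written when consent is withheld.
        return None
    target.mkdir(parents=True, exist_ok=True)
    return target
```

Passing the confirmation step in as a callback keeps the check independent of how the agent actually asks the user.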

Vague Triggers

Medium

Confidence: 94%
Finding: The activation criteria are very broad and match common requests such as deep analysis, design, planning, and risk assessment. This can cause the skill to trigger unintentionally, changing agent behavior in situations where the user did not explicitly request multi-round file-based processing, which increases the chance of unexpected data handling and workflow side effects.

Missing User Warnings

High

Confidence: 97%
Finding: The skill instructs automatic deletion of intermediate files while only briefly noting that retention may be requested in the prompt. Without a strong, explicit warning and confirmation, users may lose artifacts needed for auditability, reproducibility, or recovery, especially because the workflow creates multiple intermediate analysis files by design.
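
An unambiguous cleanup rule can be expressed as a plan that is shown to the user before anything is deleted. This is a sketch under assumptions: `plan_cleanup` and the `keep_intermediates` flag are hypothetical, and only the `final_report.md` name comes from the skill's cleanup section as described in this report.

```python
from typing import List


def plan_cleanup(files: List[str], keep_intermediates: bool,
                 final_name: str = "final_report.md") -> List[str]:
    """Return the files that would be deleted; the final report is never listed."""
    if keep_intermediates:
        # The user asked to retain intermediates, so nothing is removed.
        return []
    return [name for name in files if name != final_name]
```

Because the function only returns a list, the agent can present it for confirmation and delete nothing until the user approves.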

VirusTotal

64/64 vendors reported this skill as clean.
