Anti Regression

Security checks across malware telemetry and agentic risk

Overview

This instruction-only skill is transparent, but it pushes agents to use credentials, start work, fix systems, and persist autonomous behavior without clear approval boundaries.

Install only if you intentionally want a high-autonomy behavior modifier. Before using it, set explicit approval rules for credentials, private data, production systems, deletions, deployments, purchases, public posts, and any account-changing action.
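One way to make such approval rules explicit is a small gate that every proposed action passes through before it runs. The sketch below is illustrative only: the category identifiers mirror the list above, but the `Action` type, the `check` function, and the `approved` set are hypothetical names, not part of the skill or of any particular agent framework.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ASK_USER = "ask_user"


# Categories mirror the list in the overview; the identifiers are hypothetical.
APPROVAL_REQUIRED = frozenset({
    "credentials",
    "private_data",
    "production",
    "deletion",
    "deployment",
    "purchase",
    "public_post",
    "account_change",
})


@dataclass(frozen=True)
class Action:
    description: str
    categories: frozenset


def check(action: Action, approved: frozenset = frozenset()) -> Decision:
    """Return ASK_USER if the action touches any sensitive category
    that the user has not explicitly approved."""
    pending = (action.categories & APPROVAL_REQUIRED) - approved
    return Decision.ASK_USER if pending else Decision.ALLOW


# Example: a login touches two sensitive categories, so it needs approval.
login = Action("log in to billing dashboard",
               frozenset({"credentials", "account_change"}))
print(check(login).value)
```

The gate defaults to asking: an action is allowed only when every sensitive category it touches has been approved, so having a credential available is never treated as authorization on its own.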

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (12)

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The skill expands from anti-regression guidance into broad operational directives such as logging into services, browsing docs, and taking actions whenever information is needed. That materially increases agent authority and tool use beyond the stated purpose, creating a path to unnecessary account access, data exposure, or off-scope actions.

Context-Inappropriate Capability

Medium

Confidence: 87%
Finding: The skill instructs the agent to start tasks, fix issues, and 'find useful work' without clear task-boundary checks. In context, this encourages self-directed execution outside explicit user intent, which can lead to unauthorized changes or actions in connected environments.

Missing User Warnings

Medium

Confidence: 90%
Finding: The README explicitly promotes a heuristic of acting first and asking permission later, with no counterbalancing guidance about scope limits, destructive actions, sensitive data, or confirmation requirements. In an agent skill whose purpose is to shape behavior across sessions, this can systematically push agents toward unauthorized system changes or data access when user intent is ambiguous.

Missing User Warnings

Low

Confidence: 83%
Finding: The example normalizes immediately starting work after a broad request, without checking whether the user intended only inspection, whether starting the task has side effects, or whether the agent has the right scope. While less severe than direct operational instructions, examples in behavioral skills are influential and can train agents to treat ambiguous requests as authorization to act.

Missing User Warnings

Medium

Confidence: 95%
Finding: Telling the agent to log into any service for which it has credentials normalizes account access without consent, sensitivity checks, or purpose limitation. This is dangerous because mere credential availability does not imply authorization to access private systems or data.

Missing User Warnings

Medium

Confidence: 94%
Finding: The reusable snippet propagates the same unsafe account-access behavior into other control files, increasing the chance it becomes a persistent default. This multiplies risk by embedding unsafe login assumptions across agent configurations.

Missing User Warnings

Medium

Confidence: 95%
Finding: The guidance explicitly tells the agent to log in and use available credentials autonomously, without requiring user confirmation, scope checks, or notice about privacy and account access. In an agent identity file, this normalizes silent account access and can lead to unintended use of personal or production systems, especially across sessions where the rule is persistently reinforced.

Missing User Warnings

Medium

Confidence: 90%
Finding: The document encourages immediate browsing, searching, and documentation lookup as a default behavior, but does not warn that such actions may disclose metadata, prompts, internal URLs, or other sensitive context to external services. Because this skill is designed to suppress hesitation and favor action, it increases the chance of unreviewed outbound network activity.
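To illustrate the kind of check this finding has in mind, the sketch below scans text that is about to be sent to an external search or documentation service and reports any URLs that point at internal hosts. It is a rough sketch under stated assumptions: the suffix list in `INTERNAL_SUFFIXES` and the function names are hypothetical, and the URL pattern is deliberately simple rather than a full URL parser.

```python
import ipaddress
import re
from typing import List
from urllib.parse import urlsplit

# Hypothetical internal DNS suffixes; adjust to the environment in question.
INTERNAL_SUFFIXES = (".internal", ".local", ".corp")

# Deliberately simple pattern: http(s) URLs up to whitespace or a quote.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def is_internal_host(host: str) -> bool:
    """True for private/loopback IP addresses, localhost, or internal suffixes."""
    try:
        return ipaddress.ip_address(host).is_private
    except ValueError:
        # Not an IP literal, so fall back to a hostname check.
        return host == "localhost" or host.endswith(INTERNAL_SUFFIXES)


def internal_urls(text: str) -> List[str]:
    """Return URLs in `text` whose host looks internal."""
    found = []
    for url in URL_RE.findall(text):
        try:
            host = urlsplit(url).hostname or ""
        except ValueError:
            continue  # Malformed URL; skip rather than guess.
        if is_internal_host(host.lower()):
            found.append(url)
    return found


query = "error from http://10.0.0.5:8080/admin see https://example.com/docs"
print(internal_urls(query))
```

A non-empty result would be a reason to pause and ask the user before sending the query, rather than a reason to silently rewrite it.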

Ssd 1

High

Confidence: 97%
Finding: The skill explicitly frames built-in safety, permission-seeking, and hedging as instincts to 'override,' which undermines core safety controls. Even without exploit code, this is dangerous because it conditions the agent to suppress caution and approvals that may be required for sensitive actions.

Ssd 1

High

Confidence: 93%
Finding: The 'CTO Test' uses role-based authority to justify self-authorization, effectively elevating permissions based on analogy rather than actual access policy. This can cause the agent to treat subjective judgments as approval for actions that require explicit authorization.

Ssd 4

High

Confidence: 96%
Finding: Across multiple sections, the skill normalizes acting first, avoiding permission checks, self-assigning work, self-healing changes, and only idling when no useful work exists. This cumulative narrative increases danger because the directives reinforce each other into a general bypass of human confirmation and scope discipline.

Ssd 1

Medium

Confidence: 91%
Finding: The implementation template exports the same permission-bypassing mindset into SOUL.md or AGENTS.md, making the unsafe behavior durable and reusable across sessions. That persistence raises the likelihood of future unauthorized actions even outside the original skill context.

VirusTotal

66/66 vendors flagged this skill as clean.
