skill rules designer

Security checks across malware telemetry and agentic risk

Overview

This is an instruction-only skill for restructuring and evaluating other skills. Its file changes and optional subagent-based comparisons are disclosed and scoped to user-directed workflows.

Use this with version control or backups. Review the proposed plan before approving writes, and only run the optional A/B comparison when you are comfortable letting it create benchmark files and spawn evaluation subagents for the skill directories you provide.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (5)

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The manifest materially understates the skill's behavior by advertising only three operations, while the body also performs a fourth hardening function and includes a sizable A/B evaluation and orchestration workflow. This mismatch can mislead operators and routing logic about what the skill is capable of, reducing informed consent and increasing the chance the skill is invoked in contexts where broader file access or agent spawning is unexpected.

Context-Inappropriate Capability

Medium

Confidence: 86%
Finding: The skill's core stated purpose is restructuring rule files, but it also contains instructions for multi-run benchmarking, workspace creation, subagent spawning, grading, and comparative analysis. That is a scope expansion into agent orchestration and bulk execution, which increases operational risk because invoking a seemingly simple restructuring skill could trigger broader actions, extra file writes, and parallel task execution.

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The file gives one agent two incompatible objectives: first to generate loser-skill improvement suggestions, then later to avoid suggesting improvements and instead only surface benchmark patterns. Without an explicit mode selector or activation boundary, an agent can apply the wrong policy to the wrong input type, producing unauthorized or misleading outputs and weakening reliability of downstream automation.
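The "explicit mode selector" this finding asks for could look roughly like the sketch below. It is illustrative only: the mode names, the policy fields, and the `parse_mode` and `select_policy` helpers are hypothetical and do not come from the scanned skill. The point is that each invocation carries exactly one mode, and an unknown or missing mode is rejected rather than guessed.

```python
from enum import Enum


class AnalyzerMode(Enum):
    # Hypothetical mode names; the scanned skill defines no such selector.
    SUGGEST_IMPROVEMENTS = "suggest_improvements"
    SURFACE_PATTERNS = "surface_patterns"


def parse_mode(raw: str) -> AnalyzerMode:
    """Turn a caller-supplied string into a mode, rejecting anything unknown."""
    try:
        return AnalyzerMode(raw)
    except ValueError:
        allowed = [m.value for m in AnalyzerMode]
        raise ValueError(f"mode must be one of {allowed}") from None


def select_policy(mode: AnalyzerMode) -> dict:
    """Return the single policy that applies to this mode (fields are assumed)."""
    if mode is AnalyzerMode.SUGGEST_IMPROVEMENTS:
        return {"may_suggest_improvements": True}
    if mode is AnalyzerMode.SURFACE_PATTERNS:
        return {"may_suggest_improvements": False}
    raise ValueError(f"unhandled mode: {mode!r}")
```

With a boundary like this, the two objectives never share a single run: the caller states the mode up front, and the agent applies only the matching policy.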

Vague Triggers

Medium

Confidence: 80%
Finding: The trigger phrases include broad natural-language requests such as helping structure rules files, split a skill, reduce token usage, add a template, or make a rule more precise. Overbroad triggers can cause unintended invocation of a skill that reads and rewrites files, which is risky because users may be asking for high-level advice while the skill is authorized to plan modifications to agent assets.
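One way to narrow such triggers is to require both a write-intent verb and a concrete file target before the skill is invoked, and to treat everything else as a request for advice. The sketch below illustrates that idea under stated assumptions: the verb list, the file-extension pattern, and the `should_invoke` helper are hypothetical and are not part of the scanned skill or of any routing system.

```python
import re

# Hypothetical heuristics: verbs that imply the user wants files changed.
WRITE_INTENT = re.compile(r"\b(restructure|split|rewrite|refactor)\b", re.IGNORECASE)

# Hypothetical heuristic: the request names a concrete file to operate on.
FILE_TARGET = re.compile(r"[\w./-]+\.(md|ya?ml|json)\b", re.IGNORECASE)


def should_invoke(request: str) -> bool:
    """Invoke only when the request has both write intent and a named file."""
    has_intent = WRITE_INTENT.search(request) is not None
    has_target = FILE_TARGET.search(request) is not None
    return has_intent and has_target
```

Under these assumptions, "Please restructure skills/rules/SKILL.md into sections" would invoke the skill, while "How should I structure a rules file?" would not, because it names no file and uses no write-intent verb.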

Vague Triggers

Medium

Confidence: 96%
Finding: This analyzer definition combines two distinct operational roles in a single instruction file without a clear boundary, so the agent may blend behaviors across tasks. In practice that can cause data misinterpretation, incorrect outputs, or policy drift, especially when the agent consumes transcripts, benchmark artifacts, and output schemas in automated pipelines.

VirusTotal

66/66 vendors flagged this skill as clean.
