Wip

Security checks across malware telemetry and agentic risk

Overview

This WIP tracking skill is mostly coherent, but it also directs agents to run external checks and manage shared Claude hook/cache state beyond ordinary task tracking.

Install only if you want an assertive workflow manager that can update/delete task state and, in Claude contexts, run GitHub/deploy status checks and participate in shared Copilot rate-limit handling. Review the ~/.claude cache/hook behavior and broad activation phrases before use, especially in repositories or sessions where gh, ssh, or curl have meaningful credentials or production reach.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (10)

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The skill is supposed to track in-session work progress, but this section expands into PR review/merge verification, deployment checks, bot-comment parsing, and shell-hook/rate-limit enforcement. That scope creep gives a low-privilege task-tracking skill authority to drive operational decisions and persistent local state changes unrelated to WIP bookkeeping, increasing the chance of unintended external actions and unsafe automation.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: These instructions direct the agent to run external operational commands such as `gh`, `curl`, and `ssh` based only on task-subject keywords, even though the skill's stated purpose is progress tracking. Embedding command execution into a WIP skill broadens its authority and can trigger networked or repository-affecting actions without sufficient contextual justification.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: The shared Copilot rate-limit cache and mandatory hook-management behavior create cross-session persistence in `~/.claude` and influence future command execution globally. For a progress-tracking skill, writing policy files and directing hook cleanup is an unjustified expansion into system-level control that can interfere with other sessions and tools.
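One way to review this kind of cross-session persistence is to check which files under `~/.claude` change while the skill is active. The following is a minimal, read-only sketch; the helper name `recently_modified` and the one-hour window are illustrative assumptions, and nothing here reflects the actual file names the skill writes.

```python
import time
from pathlib import Path


def recently_modified(root: Path, within_seconds: float) -> list[Path]:
    """List regular files under root whose mtime falls inside the window.

    Hypothetical audit helper: it only reads metadata and never edits files.
    """
    if not root.is_dir():
        return []
    cutoff = time.time() - within_seconds
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.stat().st_mtime >= cutoff
    )


if __name__ == "__main__":
    # Assumed location taken from the finding; the window is arbitrary.
    for path in recently_modified(Path.home() / ".claude", within_seconds=3600):
        print(path)
```

Running it before and after a session gives a rough picture of which cache or hook files were touched, which can then be inspected by hand.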

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The skill permits automatic execution of external CLI and network-backed checks (`gh pr checks`, `gh pr view`, `curl`) based only on keyword matching in a task subject, which exceeds a progress-tracking skill's core purpose. This expands the skill from state management into autonomous command execution and can trigger unintended outbound requests or repository/API interactions without explicit user approval.
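A common mitigation for this pattern is an approval gate in front of networked commands. The sketch below is illustrative only: the gated set mirrors the binaries named in the findings (`gh`, `curl`, `ssh`), while the function names and return strings are hypothetical and not part of the skill. It decides whether a proposed command may run and never executes anything itself.

```python
import os
import shlex

# Assumed policy: executables named in the findings that reach a network or remote host.
NETWORK_BINARIES = {"gh", "curl", "ssh"}


def needs_approval(command: str) -> bool:
    """Return True when the command's executable is on the gated list."""
    argv = shlex.split(command)
    return bool(argv) and os.path.basename(argv[0]) in NETWORK_BINARIES


def gate(command: str, user_approved: bool) -> str:
    """Decide whether a proposed command may run; this never runs it."""
    if needs_approval(command) and not user_approved:
        return "blocked: ask the user first"
    return "allowed"
```

With this gate, `gate("gh pr checks", user_approved=False)` is blocked until the user approves, while a local command such as `ls -la` passes through unchanged.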

Vague Triggers

Medium

Confidence: 88%
Finding: The trigger list includes very broad natural-language phrases such as "resume" and "track progress," which can match ordinary conversation rather than an explicit request to invoke this skill. In an agent setting, that increases the chance of unintended activation and execution of workflow actions, especially because the skill can modify task state and initiate follow-up questioning.
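The difference between broad phrase matching and explicit invocation can be illustrated with a short sketch. The two phrases come from the finding; the `/wip` command prefix is a hypothetical explicit trigger introduced here for comparison, and the skill may not define one.

```python
import re

# Phrases quoted in the finding; substring matching on them is the risky pattern.
BROAD_PHRASES = ("resume", "track progress")

# Hypothetical explicit invocation, assumed for illustration only.
EXPLICIT_COMMAND = re.compile(r"^\s*/wip\b", re.IGNORECASE)


def broad_match(message: str) -> bool:
    """Fire whenever any broad phrase appears anywhere in the message."""
    lowered = message.lower()
    return any(phrase in lowered for phrase in BROAD_PHRASES)


def explicit_match(message: str) -> bool:
    """Fire only when the message starts with the explicit command."""
    return EXPLICIT_COMMAND.search(message) is not None
```

A message such as "Can you review my resume before the interview?" satisfies `broad_match` but not `explicit_match`, which is the kind of unintended activation the finding describes.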

Vague Triggers

Medium

Confidence: 81%
Finding: The "When to Use" section contains broad conditions like large tasks, showing progress, or preserving current state, which are common across many normal interactions. This can cause the agent to over-apply the skill in contexts where the user did not ask for persistent tracking, leading to unwanted state changes or intrusive workflow behavior.

Missing User Warnings

Medium

Confidence: 92%
Finding: The entry procedure mandates immediate deletion of stale completed or in-progress entries with no user confirmation. Because task lists may contain information the user still wants to review, retain, or audit, automatic deletion can cause unintended data loss and may erase context needed for safe task recovery.
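A confirm-before-delete step addresses this directly. The sketch below separates proposing a cleanup from applying it; the `Task` fields, status names, and function names are assumptions made for illustration, and how the skill decides that an entry is stale is outside the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    subject: str
    status: str  # assumed values: "pending", "in_progress", "completed"
    stale: bool  # staleness is decided elsewhere; treated as given here


def propose_cleanup(tasks: list[Task]) -> list[Task]:
    """Return the entries that would be deleted, without deleting anything."""
    return [
        t for t in tasks
        if t.stale and t.status in {"completed", "in_progress"}
    ]


def apply_cleanup(tasks: list[Task], confirmed: bool) -> list[Task]:
    """Drop the proposed entries only after the user has confirmed."""
    if not confirmed:
        return list(tasks)
    doomed = propose_cleanup(tasks)
    return [t for t in tasks if t not in doomed]
```

The agent would show the result of `propose_cleanup` to the user and call `apply_cleanup` with `confirmed=True` only after an explicit yes, so an accidental invocation leaves the task list intact.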

Missing User Warnings

Low

Confidence: 85%
Finding: The skill instructs the agent to create and modify persistent artifact files (`task.md` and `ask.md`) as part of normal workflow, but it does not require any explicit user-facing notice that files will be written. In a task-tracking skill this is expected behavior, but lack of clear disclosure can still surprise users and cause unintended state changes in the workspace or conversation storage.

Vague Triggers

Medium

Confidence: 79%
Finding: The trigger phrase "task cleanup + remaining work" is broad and resembles ordinary conversational language, increasing the chance that the skill activates unintentionally during unrelated discussion. Because activation leads to reading prior task state and deleting or modifying lines in `task.md`, accidental invocation could cause unwanted file changes or confusing workflow transitions.

Missing User Warnings

Medium

Confidence: 90%
Finding: The workflow explicitly allows deleting tasks and modifying checklist files without a user-facing warning or confirmation, even though those actions can irreversibly alter tracking data. In a resume/cleanup context, accidental invocation or misclassification of tasks could silently remove information the user expected to retain.

VirusTotal

64/64 vendors flagged this skill as clean.
