Butler Mode

Security checks across malware telemetry and agentic risk

Overview

This skill is a real delegation helper, but it tells agents to use unusually broad permissions that users should review before installing.

Install only if you intentionally want a multi-agent manager mode and understand that it may spawn agents with broad local authority. Keep normal permission prompts enabled, avoid bypass-permission modes, use explicit activation, and confirm before spawning, killing, or continuing agent sessions.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (9)

Description-Behavior Mismatch

High

Confidence: 97%
Finding: The skill claims the butler never performs work directly, yet explicitly permits use of any tool when the situation 'truly calls for it.' This undermines the core safety boundary and enables the manager persona to bypass delegation constraints, making downstream behavior less predictable and easier to escalate into direct execution.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The anti-pattern section says trivial requests should be done directly, which directly contradicts the manifest promise that the butler delegates all work. Such contradictions create ambiguous control flow and can be exploited to justify unsafe direct action outside the intended manager-only scope.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The skill instructs spawning agents with maximum autonomy and all available tools, with no least-privilege constraint. In a delegation skill, that is dangerous because every delegated task inherits broad execution power regardless of necessity, increasing the blast radius of prompt injection, mistakes, or malicious instructions.
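One way to picture the least-privilege constraint this finding says is missing is a per-task tool allowlist. The sketch below is illustrative only: the task categories, tool names, and the `tools_for_task` helper are hypothetical and do not come from the skill or from any agent runtime.

```python
# Hypothetical task categories and tool names, chosen for illustration only.
TASK_TOOL_ALLOWLIST = {
    "review": {"read_file", "search"},
    "edit": {"read_file", "search", "write_file"},
    "build": {"read_file", "run_command"},
}


def tools_for_task(task_type: str, available_tools: set[str]) -> set[str]:
    """Return only the tools a task type needs, never the full set by default."""
    allowed = TASK_TOOL_ALLOWLIST.get(task_type)
    if allowed is None:
        # Unknown task types get no tools rather than all of them.
        return set()
    # Grant a tool only if it is both allowlisted and actually available.
    return allowed & available_tools
```

With this shape, a delegated review task would receive read-only tools even when the manager itself has shell or network access, which limits the blast radius the finding describes.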

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The skill includes shell and CLI pathways for launching external runtimes and agent sessions, which broadens the execution surface beyond simple coordination. While these examples may be intended as portability guidance, they materially increase the chance that the manager skill triggers direct command execution or uncontrolled external agents.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The skill describes context gathering and review as read-only, but elsewhere allows unrestricted tool use and direct execution. This inconsistency weakens user expectations and can cause operators to assume safer behavior than the skill actually permits, which is a meaningful security risk in agent orchestration contexts.

Vague Triggers

High

Confidence: 96%
Finding: The activation phrases include broad natural language such as 'manage this,' 'supervise,' and 'you're the boss,' which are common in ordinary conversation. This makes accidental activation likely, causing unexpected delegation behavior, subagent spawning, or task handling changes without clear user intent.
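A stricter alternative to conversational trigger phrases is an explicit command that must match the whole message. The following is a minimal sketch under that assumption; the `/butler-mode` command and the `parse_activation` helper are hypothetical and are not part of the skill.

```python
import re

# Hypothetical explicit command. The whole message must match, so ordinary
# phrases such as "manage this" cannot activate the mode by accident.
ACTIVATION_PATTERN = re.compile(r"^\s*/butler-mode\s+(on|off)\s*$", re.IGNORECASE)


def parse_activation(message: str):
    """Return 'on' or 'off' for an explicit command, otherwise None."""
    match = ACTIVATION_PATTERN.match(message)
    return match.group(1).lower() if match else None
```

Under this scheme, phrases such as "supervise" or "you're the boss" return None and leave the agent in its normal mode.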

Missing User Warnings

Medium

Confidence: 88%
Finding: The exit flow instructs killing all active teammates, which is a destructive control action, but does not require explicit warning or confirmation from the user. This can terminate ongoing work unexpectedly and may discard useful state or progress in multi-agent environments.
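The missing confirmation step could look roughly like the sketch below. It assumes the caller supplies two callbacks: `confirm`, which asks the user a question and returns their reply, and `terminate`, which ends one session. Both callbacks and the `kill_teammates` function are hypothetical and do not correspond to any real agent API.

```python
def kill_teammates(active, confirm, terminate):
    """Terminate sessions only after the user confirms; return the ids terminated."""
    if not active:
        return []
    # Name every session that would be lost so the warning is explicit.
    prompt = (
        f"Terminate {len(active)} active agent session(s): "
        f"{', '.join(active)}? [y/N] "
    )
    # Anything other than an explicit yes is treated as a refusal.
    if confirm(prompt).strip().lower() not in {"y", "yes"}:
        return []
    for session_id in active:
        terminate(session_id)
    return list(active)
```

In an interactive terminal, `confirm` could simply be the built-in `input`; defaulting to "no" keeps in-progress work alive unless the user opts in.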

Ssd 4

Medium

Confidence: 96%
Finding: The text normalizes unrestricted delegation and permission escalation as the default operating model for subagents. Even without an overt exploit payload, this degrades safety posture by teaching the system to prefer high-privilege delegation in routine scenarios, making abuse and mistakes more likely.

Ssd 4

High

Confidence: 99%
Finding: The Claude Code example explicitly sets subagents to 'bypassPermissions' under ordinary task-management guidance. This is a direct privilege-escalation pattern that can let delegated agents act outside normal safety controls, substantially increasing the risk of unauthorized file changes, command execution, or data access.
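A guard against this pattern might reject a spawn request before any subagent starts. The sketch below is an assumption-laden illustration: only the value `bypassPermissions` comes from the finding, while the `permission_mode` key, the `"default"` fallback, the dictionary shape, and `validate_spawn_config` are hypothetical and not taken from Claude Code or the skill.

```python
# "bypassPermissions" is the value named in the finding. The "permission_mode"
# key, the "default" fallback, and the config shape are illustrative assumptions.
BLOCKED_MODES = {"bypassPermissions"}


def validate_spawn_config(config: dict) -> dict:
    """Return the config unchanged, or raise if it requests a blocked mode."""
    mode = config.get("permission_mode", "default")
    if mode in BLOCKED_MODES:
        raise ValueError(f"permission mode {mode!r} is not allowed for subagents")
    return config
```

Failing loudly here, instead of silently downgrading the mode, makes the escalation attempt visible to whoever reviews the session.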

VirusTotal

VirusTotal engine telemetry is currently stale for this artifact.
