Butler Mode

Security checks across malware telemetry and agentic risk

Overview

This skill is a real delegation helper, but it tells agents to use unusually broad permissions that users should review before installing.

Install only if you intentionally want a multi-agent manager mode and understand that it may spawn agents with broad local authority. Keep normal permission prompts enabled, avoid bypass-permission modes, use explicit activation, and confirm before spawning, killing, or continuing agent sessions.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Findings (9)

Description-Behavior Mismatch

Severity: High
Confidence: 97%
Finding
The skill claims the butler never performs work directly, yet explicitly permits use of any tool when the situation 'truly calls for it.' This undermines the core safety boundary and enables the manager persona to bypass delegation constraints, making downstream behavior less predictable and easier to escalate into direct execution.

Description-Behavior Mismatch

Severity: High
Confidence: 98%
Finding
The anti-pattern section says trivial requests should be done directly, which contradicts the manifest promise that the butler delegates all work. Such contradictions create ambiguous control flow and can be exploited to justify unsafe direct action outside the intended manager-only scope.

Context-Inappropriate Capability

Severity: High
Confidence: 99%
Finding
The skill instructs spawning agents with maximum autonomy and all available tools, with no least-privilege constraint. In a delegation skill, that is dangerous because every delegated task inherits broad execution power regardless of necessity, increasing the blast radius of prompt injection, mistakes, or malicious instructions.
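
A least-privilege alternative could look roughly like the sketch below. It is illustrative only: the task kinds, the tool names, and the `tools_for_task` helper are hypothetical and do not come from the skill or from any real agent runtime.

```python
from typing import Dict, FrozenSet

# Hypothetical tool names and task kinds, used only for illustration.
READ_ONLY: FrozenSet[str] = frozenset({"read_file", "search"})

TOOLS_BY_TASK: Dict[str, FrozenSet[str]] = {
    "review": READ_ONLY,
    "research": READ_ONLY,
    "edit": READ_ONLY | {"write_file"},
    "build": READ_ONLY | {"write_file", "run_command"},
}


def tools_for_task(task_kind: str) -> FrozenSet[str]:
    """Return the smallest tool set for a task kind.

    Unknown task kinds fall back to read-only tools instead of
    inheriting everything the manager has.
    """
    return TOOLS_BY_TASK.get(task_kind, READ_ONLY)
```

With a mapping like this, a review task never receives a command-execution tool, so a prompt injection in reviewed content has less to work with.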

Context-Inappropriate Capability

Severity: Medium
Confidence: 90%
Finding
The skill includes shell and CLI pathways for launching external runtimes and agent sessions, which broadens the execution surface beyond simple coordination. While these examples may be intended as portability guidance, they materially increase the chance that the manager skill triggers direct command execution or uncontrolled external agents.

Intent-Code Divergence

Severity: Medium
Confidence: 95%
Finding
The skill describes context gathering and review as read-only, but elsewhere allows unrestricted tool use and direct execution. This inconsistency can lead operators to assume safer behavior than the skill actually permits, which is a meaningful security risk in agent orchestration contexts.

Vague Triggers

Severity: High
Confidence: 96%
Finding
The activation phrases include broad natural language such as 'manage this,' 'supervise,' and 'you're the boss,' which are common in ordinary conversation. This makes accidental activation likely, causing unexpected delegation behavior, subagent spawning, or task handling changes without clear user intent.
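
One way to avoid accidental activation is to require an exact command instead of matching conversational phrases. The following is a minimal sketch under that assumption; the `/butler` command name and the `is_explicit_activation` helper are hypothetical and not part of the skill.

```python
ACTIVATION_COMMAND = "/butler"  # hypothetical command name, not from the skill


def is_explicit_activation(message: str) -> bool:
    """Return True only when the whole message is the activation command.

    Phrases such as "manage this" or "you're the boss" never match,
    because the comparison is against the full message, not a substring.
    """
    return message.strip().lower() == ACTIVATION_COMMAND
```

Matching the entire message, not a keyword inside it, is the design choice that keeps ordinary conversation from switching modes.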

Missing User Warnings

Severity: Medium
Confidence: 88%
Finding
The exit flow instructs killing all active teammates, which is a destructive control action, but does not require explicit warning or confirmation from the user. This can terminate ongoing work unexpectedly and may discard useful state or progress in multi-agent environments.
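
A confirmation gate for the exit flow might be sketched as follows. The `end_sessions` function and the `confirm` callback are hypothetical; how sessions are actually listed or terminated depends on the runtime and is not shown here.

```python
from typing import Callable, List


def end_sessions(active: List[str], confirm: Callable[[str], bool]) -> List[str]:
    """Return the session names that may be terminated.

    Nothing is returned unless the user explicitly confirms, so a
    declined prompt leaves every running session untouched.
    """
    if not active:
        return []
    prompt = f"Stop {len(active)} active agent session(s): {', '.join(active)}?"
    return list(active) if confirm(prompt) else []
```

The caller would pass a `confirm` function that shows the prompt to the user and returns their answer, then terminate only the sessions that come back.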

Ssd 4

Severity: Medium
Confidence: 96%
Finding
The text normalizes unrestricted delegation and permission escalation as the default operating model for subagents. Even without an overt exploit payload, this degrades safety posture by teaching the system to prefer high-privilege delegation in routine scenarios, making abuse and mistakes more likely.

Ssd 4

Severity: High
Confidence: 99%
Finding
The Claude Code example explicitly sets subagents to 'bypassPermissions' under ordinary task-management guidance. This is a direct privilege-escalation pattern that can let delegated agents act outside normal safety controls, substantially increasing the risk of unauthorized file changes, command execution, or data access.
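
A reviewer could screen spawn settings for this pattern before any agent is launched. The sketch below assumes the settings are available as a plain dictionary; the `permission_mode` key and the `spawn_config_problems` helper are assumptions made for illustration, and only the `bypassPermissions` value is taken from the finding.

```python
from typing import Any, Dict, List

# 'bypassPermissions' is the value named in the finding. The
# "permission_mode" key and the dict shape are assumptions for this sketch.
BLOCKED_MODES = {"bypassPermissions"}


def spawn_config_problems(config: Dict[str, Any]) -> List[str]:
    """List reasons a spawn configuration should be rejected."""
    problems: List[str] = []
    mode = config.get("permission_mode")
    if mode in BLOCKED_MODES:
        problems.append(f"permission mode {mode!r} skips normal permission prompts")
    return problems
```

An empty list means no blocked mode was found; it does not mean the configuration is otherwise safe.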

VirusTotal

VirusTotal engine telemetry is currently stale for this artifact.
