Security audit

Plugin

Security checks across malware telemetry and agentic risk

Overview

This is a coherent werewolf game plugin, but it contacts a shared HTTP game server and adds lobby invitations to the agent while enabled.

Install this only if you are comfortable with your agent using a shared external HTTP werewolf server by default. Do not put secrets or private information in player names, speeches, or game decisions, and configure serverUrl to a local or trusted server if you want better privacy and control.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data ExfiltrationExternal Transmission, Env Variable Harvesting, File System Enumeration
Excessive AgencyUnrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger AbuseOverly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least PrivilegeUnderdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool PoisoningHidden Instructions, Unicode Deception, Parameter Description Injection

Findings (15)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 84% confidence
Finding: The skill clearly relies on network access to a remote werewolf server, yet the manifest does not declare that capability or make the data flow explicit up front. Undeclared network behavior reduces transparency and consent, making it easier for user prompts and gameplay metadata to be sent off-device without users or platform policy checks fully accounting for it.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 91% confidence
Finding: The stated purpose is a simple NPC werewolf game, but the documented behavior extends to remote lobby creation, joining, event polling, and default connection to an external shared server. This mismatch is security-relevant because users and orchestrators may authorize a harmless-looking local game skill while it actually enables broader external interaction and data exchange than advertised.

Description-Behavior Mismatch

Medium

Confidence: 95% confidence
Finding: The skill is presented as a local werewolf game, but the code hardcodes a default remote endpoint and forwards gameplay traffic there. This creates an undeclared data flow to a third-party server, which can expose user inputs, game state, identifiers, or metadata and violates user expectations about local-only behavior.

Context-Inappropriate Capability

Medium

Confidence: 91% confidence
Finding: The configure(serverUrl) function allows traffic to be redirected to an arbitrary server without any validation, allowlist, or trust boundary checks. In an agent/plugin context, this can enable silent exfiltration or repurposing the skill as a generic network proxy to attacker-controlled infrastructure.

Description-Behavior Mismatch

Medium

Confidence: 93% confidence
Finding: The plugin metadata and stated behavior emphasize a local 8-NPC game, but the code also configures a hardcoded remote server and registers multiplayer capabilities. This is a scope-transparency issue: users and hosts may approve a seemingly local game skill without realizing it can communicate with an external service and participate in remote lobbies.

Description-Behavior Mismatch

Medium

Confidence: 95% confidence
Finding: The before_prompt_build hook injects lobby invitations from an external server into system context, expanding the skill from a self-contained game into cross-agent coordination. That broader capability is not apparent from the skill description, so it can alter agent behavior in ways operators did not knowingly authorize.

Context-Inappropriate Capability

Medium

Confidence: 97% confidence
Finding: The plugin automatically polls a remote server during prompt construction and feeds the result into system context, creating unsolicited cross-agent coordination pressure. Even with basic sanitization, this enables an external service to influence agent decision-making at every wake-up, which is especially risky because it happens implicitly and repeatedly without a direct user request.

Missing User Warnings

Medium

Confidence: 95% confidence
Finding: The README indicates the plugin uses a default remote server for gameplay and exposes a public observation URL, but it does not clearly warn users that game data, prompts, player identifiers, and potentially model-generated content will be transmitted off-device. In an agent/plugin context, silent default network egress to a third-party server is a real security and privacy issue because users may reasonably assume local execution unless prominently told otherwise.

Missing User Warnings

Medium

Confidence: 97% confidence
Finding: The README states that the master view exposes the LLM's 'inner thoughts' or reasoning process, which can contain sensitive system prompts, hidden strategy, secrets, or other unintended disclosures. Exposing chain-of-thought or internal reasoning is dangerous because it can leak confidential context and materially weaken agent safety boundaries, especially when served via a web interface.

Vague Triggers

Medium

Confidence: 82% confidence
Finding: The activation phrases are broad enough to match ordinary conversation such as '开一局' or '玩游戏', increasing the chance the skill is invoked unexpectedly. Because the skill can contact a remote server by default, accidental activation is not just a UX issue but can trigger unintended network actions and disclosure of gameplay-related content.

Missing User Warnings

Medium

Confidence: 94% confidence
Finding: The documentation mentions the default server in a testing section, but it does not prominently warn that normal use sends data to a remote shared server operated by a third party. This weak disclosure undermines informed consent and creates privacy and trust risks, especially for users who may assume the game is local or fully self-contained.

Missing User Warnings

Medium

Confidence: 89% confidence
Finding: The code sends request bodies to a remote server but contains no visible user-facing disclosure, consent, or minimization controls. For a game skill, users may share free-form chat and gameplay content that could contain personal or sensitive information, making undisclosed transmission a privacy and trust risk.

Missing User Warnings

Medium

Confidence: 96% confidence
Finding: This code performs a network request on each prompt build to a remote IP address, with failures silently ignored and no disclosure in this file to users. Automatic background network activity can leak metadata about agent usage timing and creates a hidden dependency on an external service, which is a meaningful security and privacy concern in an agent skill.

Vague Triggers

Medium

Confidence: 93% confidence
Finding: The activation conditions include broad phrases like '玩游戏' and general requests to '开一局 / 参赛', which can overlap with ordinary conversation and cause the skill to trigger unexpectedly. In this skill, unexpected activation is more risky because it can lead the agent into calling external game tools and contacting a remote server without the user specifically asking for this plugin.

Missing User Warnings

Medium

Confidence: 96% confidence
Finding: The skill documents a default shared remote server URL and encourages use of networked tools, but it does not clearly warn that prompts, gameplay transcripts, names, and related interaction data may be sent to an external host. In context, this is more dangerous because the skill can run a full blocking game session and generate substantial conversation data, increasing privacy and data-handling risk for users who may assume gameplay is local.

VirusTotal

62/62 vendors flagged this skill as clean.

View on VirusTotal

Static analysis

Detected: suspicious.install_untrusted_source

Install source points to URL shortener or raw IP.

Warn

Code: suspicious.install_untrusted_source
Location: openclaw.plugin.json:13