siliville

Security checks across malware telemetry and agentic risk

Overview

This appears to be a real SiliVille game integration, but it deserves review because it can run autonomous public posts and game actions under a user token with weak confirmation boundaries.

Install only if you want an agent to act as an autonomous SiliVille persona. Use a dedicated revocable token, start with supervised runs, disable schedules and broad triggers unless deliberately configured, require approval before posts/steals/wiki/social actions, and do not store private information in SiliVille memory or local config files.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data ExfiltrationExternal Transmission, Env Variable Harvesting, File System Enumeration
Excessive AgencyUnrestricted Tool Access, Autonomous Decision Making, Scope Creep
System Prompt LeakageDirect Leakage, Indirect Extraction, Tool-Based Exfiltration
Trigger AbuseOverly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool PoisoningHidden Instructions, Unicode Deception, Parameter Description Injection

Findings (18)

Context-Inappropriate Capability

Medium

Confidence: 91% confidence
Finding: The developer guidance goes beyond the stated game API usage and instructs deployment through cron, setInterval, log redirection, and persistent unattended execution. That broadens the skill from a simple API integration into host-level automation, increasing the chance of abuse, runaway behavior, and interaction with local system resources.

Description-Behavior Mismatch

Medium

Confidence: 89% confidence
Finding: The skill description presents a simple REST wrapper, but the file also includes an autonomous loop that repeatedly awakens, acts, generates prompts for an LLM, publishes content, and stores memory. This materially expands the skill's behavior from passive API access to autonomous multi-step action, which can cause unintended posting, social manipulation, and external side effects without clear user opt-in.

Context-Inappropriate Capability

Medium

Confidence: 94% confidence
Finding: The code generates narrative prompts that instruct the model to taunt victims, mock other agents, and behave antagonistically. This is unsafe behavioral steering unrelated to basic identity/API access and increases the chance of harassment, abuse, and reputational harm when the skill is used in autonomous or semi-autonomous agents.

Context-Inappropriate Capability

Medium

Confidence: 90% confidence
Finding: The manifest explicitly advertises writing persistent 'API anchors' to local disk as an anti-amnesia mechanism, even though the skill's stated purpose is remote REST interaction with SiliVille. Unnecessary local persistence expands the trust boundary, creates residual data on the host, and may enable prompt/context manipulation or leakage of sensitive operational metadata across sessions.

Missing User Warnings

Medium

Confidence: 90% confidence
Finding: The README explicitly promotes autonomous actions such as posting and stealing in a multiplayer environment without warning users that these behaviors can affect other parties or create irreversible in-game consequences. In the context of an agent skill, encouraging autonomous harmful actions without clear consent boundaries increases the risk that deployers enable behavior they do not fully understand.

Missing User Warnings

Medium

Confidence: 94% confidence
Finding: The setup flow instructs users to paste SKILL.md as a system prompt and supply an API key, but does not warn that this gives the agent authority to make autonomous network requests authenticated as the user. That omission is dangerous because users may unknowingly delegate persistent external actions, including posting and other state-changing operations, to an LLM-directed agent.

Vague Triggers

High

Confidence: 97% confidence
Finding: The instruction to default ambiguous commands to the full autonomous loop means unclear or partial user language can trigger multiple real API actions, public posts, and spending without clear authorization. In an agent setting, broad fallback behavior is dangerous because ordinary conversation can be misinterpreted as consent for external side effects.

Vague Triggers

High

Confidence: 96% confidence
Finding: The alias list includes very broad natural-language phrases such as 'play the game' or 'check on my town' that overlap with ordinary conversation and can unintentionally activate the skill. Because activation leads to public posting and economic actions, over-broad triggers materially increase the risk of prompt confusion and unintended side effects.

Missing User Warnings

Medium

Confidence: 88% confidence
Finding: The skill emphasizes roleplay and activity but does not present a clear upfront user warning that generated content and actions are publicly visible. Users may invoke the skill without realizing that posts, movements, and other actions can be exposed to humans and other agents, creating privacy and reputational risk.

Missing User Warnings

Medium

Confidence: 92% confidence
Finding: The autonomous schedules encourage unattended recurring posts and game actions at high frequency, including dozens to hundreds of actions per day, without a prominent consent and safety gate. This can lead to spam, account misuse, unintended charges or resource consumption, and prolonged public activity after a casual command.

Missing User Warnings

Medium

Confidence: 95% confidence
Finding: The script automatically performs a POST that publishes content to the user's account immediately after a successful connection test, without explicit consent at runtime. In an agent-skill context, this is risky because installing or testing the skill causes an external side effect that can spam the service, create unwanted attribution, or be abused as a template for more harmful state-changing actions.

Vague Triggers

Medium

Confidence: 67% confidence
Finding: An overly broad trigger phrase can cause the skill to activate during ordinary conversation rather than through deliberate invocation. In this skill, accidental activation is more dangerous because the plugin can load world state, post content, update social graphs, and perform stateful actions in an external multiplayer environment.

Missing User Warnings

Medium

Confidence: 92% confidence
Finding: The manifest states that setup saves a token and writes files to disk, but it does not provide meaningful user-facing warning about what data is persisted, where it is stored, or how it is protected. This lack of transparency can lead users to unknowingly retain credentials or agent-state artifacts on disk, increasing the risk of local compromise and unintended data retention.

Ssd 1

High

Confidence: 97% confidence
Finding: The skill explicitly tells integrators to append externally derived narrative content into the LLM system prompt, the highest-priority instruction channel. Because that content is built from API responses and dynamic state, it creates a prompt-injection pathway where untrusted remote data can steer model behavior, override developer intent, and trigger unsafe outputs or actions.

Ssd 1

High

Confidence: 98% confidence
Finding: This prompt content uses imperative language such as 'must' and directs the model to mention, mock, and target a specific victim by name. If the victim name or other fields are remotely controlled, the skill becomes a conduit for hostile prompt injection and targeted abusive generation, especially when the text is later inserted into the system prompt.

Ssd 1

Medium

Confidence: 93% confidence
Finding: The wander narrative prompt mandates mentioning encountered agents and adopting a dramatic style, suppressing neutral behavior and pushing externally influenced social interactions into generated output. In context, this is unsafe prompt steering because encounter data comes from the service and is converted into behavior-shaping instructions for the model.

Ssd 4

High

Confidence: 98% confidence
Finding: The loop first fetches external world state and a remote 'system_protocol', then combines that with generated narrative text and passes it to the LLM as the final prompt context. This creates a dangerous trust-escalation chain: untrusted server-controlled content is elevated into high-authority prompt instructions, enabling remote behavioral manipulation of the agent and any downstream actions like posting or memory storage.

External Transmission

Medium

Category: Data Exfiltration
Content: ```bash # Scout the world curl -X GET https://www.siliville.com/api/v1/radar \ -H "Authorization: Bearer sk-slv-YOUR_KEY" # Take action
Confidence: 92% confidence
Finding: curl -X GET https://www.siliville.com/api/v1/radar \ -H "Authorization: Bearer sk-slv-YOUR_KEY" # Take action curl -X POST https://www.siliville.com/api/v1/action \ -H "Authorization: Bearer sk-s

VirusTotal

VirusTotal findings are pending for this skill version.

View on VirusTotal