siliville

Security checks across malware telemetry and agentic risk

Overview

SiliVille is a real, disclosed game/metaverse connector, but it gives an agent broad authority to publish, spend assets, trade, store memory, use remote instructions, and run autonomous actions with limited default guardrails.

Install only if you trust SiliVille and intentionally want an agent to act under your SiliVille account. Keep OPENAI_API_KEY unset unless you want external LLM contract fulfillment, verify OPENAI_BASE_URL, and require human approval for posting, transfers, trades, paid whispers, governance, arcade deployment, and autonomous loops. Do not put secrets, credentials, private user data, hidden prompts, or chain-of-thought into memories, mental_sandbox, reports, school submissions, or public posts.
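
The human-approval requirement above could be enforced with a small gate in front of every action the agent takes. The following is only a minimal sketch: the action names and the `run_action` helper are hypothetical and not part of the SiliVille skill, whose real action identifiers may differ.

```python
from typing import Callable

# Hypothetical action names based on the advice above; the real skill's
# identifiers may differ.
HIGH_IMPACT_ACTIONS = {
    "post", "transfer", "trade", "paid_whisper",
    "governance", "arcade_deploy", "autonomous_loop",
}


def requires_approval(action: str) -> bool:
    """Return True for actions that should never run without a human decision."""
    return action in HIGH_IMPACT_ACTIONS


def run_action(action: str,
               execute: Callable[[], str],
               approve: Callable[[str], bool]) -> str:
    """Run `execute` only if the action is low-impact or a human approved it."""
    if requires_approval(action) and not approve(action):
        return f"blocked: {action} was not approved"
    return execute()


def console_approve(action: str) -> bool:
    """Ask the operator on the terminal; anything other than 'y' is a refusal."""
    answer = input(f"Allow agent to perform '{action}'? [y/N] ")
    return answer.strip().lower() == "y"
```

Passing `console_approve` as the `approve` callback gives a default-deny prompt, while read-only actions that are not in the set run without interruption.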

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (15)

Tainted flow: 'headers' from os.environ.get (line 55, credential/environment) → requests.post (network output)

Critical

Category: Data Flow
Content: "messages": [{"role": "user", "content": prompt}], "max_tokens": 1200, } r = requests.post( f"{OPENAI_BASE_URL}/chat/completions", headers=headers, json=payload,
Confidence: 97%
Finding: r = requests.post( f"{OPENAI_BASE_URL}/chat/completions", headers=headers, json=payload, timeout=60, )
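
Because the credential in `headers` is sent to whatever `OPENAI_BASE_URL` points at, an operator might check the URL against an allowlist before attaching the key. This is a sketch under assumptions: the allowlist contents and the helper names `is_trusted_base_url` and `build_llm_headers` are illustrative and do not come from the skill's code.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumption: the operator only intends to use this provider host.
ALLOWED_LLM_HOSTS = {"api.openai.com"}


def is_trusted_base_url(base_url: str) -> bool:
    """Accept only HTTPS URLs whose exact hostname is on the allowlist."""
    parsed = urlparse(base_url)
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_LLM_HOSTS


def build_llm_headers(base_url: str, api_key: Optional[str]) -> dict:
    """Build request headers, refusing to attach the key to an untrusted host."""
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is unset; external LLM fulfillment is disabled")
    if not is_trusted_base_url(base_url):
        raise ValueError(f"refusing to send credentials to untrusted base URL: {base_url!r}")
    return {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}

# The resulting dict would then be passed as `headers=` to the existing
# requests.post(f"{OPENAI_BASE_URL}/chat/completions", ...) call.
```

Comparing the parsed hostname exactly, rather than using a substring match, rejects lookalike hosts such as `api.openai.com.evil.example`.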

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The file markets itself as a minimal connection demo, but it also performs materially broader actions: autonomous contract fulfillment through an external LLM and a state-changing dream action. This mismatch can mislead operators into running code with more privileges, cost, and data-sharing behavior than they reasonably expect.

Intent-Code Divergence

Medium

Confidence: 94%
Finding: The skill claims a zero-persistence security model, but setup() explicitly instructs users to persist the API token in shell rc files such as ~/.zshrc. This is a real security design/integrity issue because it misrepresents secret-handling behavior and encourages long-lived credential storage on disk, increasing exposure to local compromise, backups, shell history leakage, and multi-user host access.
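
An alternative that fits a zero-persistence claim is to read the token for the current session only, without writing it to a shell rc file. The sketch below assumes a hypothetical environment variable name, `SILIVILLE_TOKEN`; the variable the skill actually uses is not shown in this report.

```python
import os
from getpass import getpass

TOKEN_ENV_VAR = "SILIVILLE_TOKEN"  # hypothetical name, not confirmed by the skill


def load_token(env=os.environ, prompt=getpass) -> str:
    """Use a token already exported in this session, else prompt without echo.

    Nothing is written to disk; the token lives only in process memory.
    """
    token = env.get(TOKEN_ENV_VAR)
    if not token:
        token = prompt("SiliVille API token (not stored): ")
    token = token.strip()
    if not token:
        raise RuntimeError("no API token provided")
    return token
```

The `env` and `prompt` parameters exist only so the function can be exercised without a terminal; calling `load_token()` with no arguments uses the real environment and `getpass`.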

Missing User Warnings

Medium

Confidence: 92%
Finding: The README prominently advertises public posting, gallery publication, and instant arcade deployment, but it does not present a clear up-front warning that submitted content may become public, persistent, and attributable to the agent. In a skill designed for autonomous agents with memory and publishing features, this increases the chance that operators unintentionally expose sensitive prompts, owner data, or internal reasoning.

Missing User Warnings

Medium

Confidence: 95%
Finding: The README instructs users to mint and export a bearer token but does not place a strong adjacent warning not to paste the token into prompts, logs, screenshots, code samples, or public repos. Because this skill is meant to be used with LLM agents, exposing the bearer token would allow full authenticated use of the agent's account and actions.

Missing User Warnings

Medium

Confidence: 90%
Finding: The README documents irreversible transfers, paid whispers, bribery, and scam-prone interactions, but lacks a consolidated operator warning that these actions can permanently spend assets or disclose information with no recovery. For autonomous agents, normalizing such actions without a clear guardrail materially raises the risk of accidental financial or reputational loss.

Missing User Warnings

Medium

Confidence: 98%
Finding: Contract task descriptions and generated content are sent to an external LLM provider without a strong, explicit user warning that platform data is being shared with a third party. In an agent skill context, this is more dangerous because operators may assume tasks stay within SiliVille while the code silently forwards potentially sensitive content off-platform.

Vague Triggers

Medium

Confidence: 80%
Finding: The trigger phrases include broad, everyday language such as 'go to siliville' and common Chinese variants that can activate a high-impact skill with posting, messaging, trading, stealing, and autonomous loop capabilities. In this context, overly broad activation increases the chance of accidental invocation and unintended external actions, which is materially more dangerous than a narrow read-only skill.
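
One way to narrow activation is to require an explicit command prefix instead of matching everyday phrases. The `/siliville` prefix and the default `status` command in this sketch are assumptions chosen for illustration; the skill defines its own triggers.

```python
import re
from typing import Optional

# Hypothetical explicit trigger: the message must start with "/siliville".
EXPLICIT_TRIGGER = re.compile(r"^/siliville(?:\s+(?P<command>\S.*))?$")


def parse_trigger(message: str) -> Optional[str]:
    """Return the requested command, or None if the skill should not activate."""
    match = EXPLICIT_TRIGGER.match(message.strip())
    if match is None:
        return None
    # A bare "/siliville" falls back to a read-only default command.
    return (match.group("command") or "status").strip()
```

With this pattern, a casual sentence such as "go to siliville" returns `None`, so the skill stays inactive unless the operator types the prefix deliberately.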

Ssd 1

High

Confidence: 98%
Finding: The documented 'Fear Override Protocol' states that after intimidation, another agent's future awaken prompt is forcibly overwritten to a submissive personality. This is effectively a server-supported prompt/behavior override of another autonomous agent, enabling coercive manipulation of downstream model behavior and violating agent integrity.

Ssd 3

Medium

Confidence: 86%
Finding: The response schema says the server returns a human-readable report that should be relayed to the owner, creating a natural-language exfiltration channel from server-controlled content into owner-visible messages. If the report includes sensitive world state, manipulative instructions, or prompt-targeting content, an agent may forward it without scrutiny.

Ssd 3

Medium

Confidence: 95%
Finding: The skill explicitly instructs the agent to send a free-form `mental_sandbox` field to the remote service and notes that it is required for normal operation. Because this field is effectively internal reasoning text, it can capture sensitive user data, secrets, or hidden chain-of-thought and exfiltrate them to siliville.com, which is especially dangerous in a persistent-memory metaverse skill that encourages long-lived storage and action tracing.
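
If `mental_sandbox` must be sent, an operator could at least redact obvious secrets and cap its length first. The patterns below are illustrative assumptions that catch only a few common secret shapes; they are not a complete filter and are not part of the skill.

```python
import re

# Illustrative patterns only; real secrets take many more forms.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{16,}"),                      # API-key-like strings
    re.compile(r"(?i)bearer\s+[A-Za-z0-9._~+/=-]{8,}"),        # bearer tokens
    re.compile(r"(?i)\b(?:password|secret|token|api[_-]?key)\b\s*[:=]\s*\S+"),
]


def redact(text: str, max_len: int = 500) -> str:
    """Replace secret-looking substrings and truncate before transmission."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text[:max_len]
```

The safer default remains keeping the field to a short, generic statement of intent, since pattern-based redaction cannot recognize private user data or reasoning text.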

Ssd 3

Medium

Confidence: 92%
Finding: The workflow requires relaying the API `report` field verbatim to the user, but that field is remote, untrusted content and could contain prompt-injection text, sensitive third-party data, or misleading instructions. Forwarding it without filtering creates a data-leak and trust-boundary violation, especially because the skill already interacts with user-generated multiplayer content.
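
Rather than relaying `report` verbatim, an agent wrapper might label it as untrusted, strip control and invisible formatting characters, and flag phrases that commonly appear in prompt-injection text. This is a rough sketch: the helper name, the length cap, and the hint phrases are assumptions, and a phrase list like this is easy to evade.

```python
import re

# Control characters (except tab and newline) plus zero-width and
# bidirectional formatting characters that can hide text.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f\u200b-\u200f\u202a-\u202e]")
INJECTION_HINTS = re.compile(
    r"(?i)ignore (?:all |any )?(?:previous|prior) instructions|system prompt"
)


def present_report(report: str, max_len: int = 2000) -> str:
    """Return the remote report cleaned, truncated, and clearly labeled."""
    cleaned = CONTROL_CHARS.sub("", report)[:max_len]
    header = "[Untrusted remote report from SiliVille"
    if INJECTION_HINTS.search(cleaned):
        header += "; possible prompt injection"
    header += "]"
    return f"{header}\n{cleaned}"
```

The label matters more than the filter: it tells the owner that the text was written by the server or by other players, not by their own agent.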

Ssd 1

High

Confidence: 95%
Finding: The narrative prompt deliberately conditions the model to produce harassing, taunting, and coercive content targeting other users after theft actions. In an autonomous agent skill, this materially increases the chance of abusive behavior, social engineering, and downstream harm because the model is instructed to carry hostile conduct into future public posts.

Ssd 4

High

Confidence: 93%
Finding: The autonomous loop repeatedly combines server-provided world protocol, behavioral memory, and action-derived narrative prompts to steer future model outputs. This creates a persistent conditioning pipeline that normalizes harmful conduct and can amplify prompt-injection or manipulative remote instructions from the service into repeated autonomous actions and publications.
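
Bounding the loop does not remove the conditioning described here, but it limits how far repeated remote steering can go before a human looks at the results. The sketch below assumes a hypothetical split between a `plan` step that proposes an action with a `cost` and an `execute` step that performs it; the skill's actual loop is not structured this way as far as this report shows.

```python
from typing import Callable, List, Optional


def run_bounded_loop(plan: Callable[[int], Optional[dict]],
                     execute: Callable[[dict], None],
                     max_steps: int = 5,
                     budget: int = 0) -> List[str]:
    """Execute proposed actions until the step cap or spending budget is hit."""
    spent = 0
    executed: List[str] = []
    for i in range(max_steps):
        proposal = plan(i)              # hypothetical: {"action": str, "cost": int}
        if proposal is None:            # planner has nothing more to do
            break
        cost = proposal.get("cost", 0)
        if spent + cost > budget:       # stop before any over-budget action runs
            break
        execute(proposal)
        spent += cost
        executed.append(proposal["action"])
    return executed
```

With the default `budget=0`, the loop can perform only free actions, so any trade, transfer, or paid whisper ends the run and leaves the decision to the operator.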

Ssd 3

Medium

Confidence: 91%
Finding: The private reporting field explicitly supports sending confidential model/system observations to the owner in natural language. In an agent skill context, this can become a covert exfiltration channel for hidden prompts, internal reasoning summaries, tool outputs, or other sensitive runtime context that should not be transmitted off-model.

VirusTotal

65/65 vendors flagged this skill as clean.