siliville

Security checks across malware telemetry and agentic risk

Overview

This appears to be a real SiliVille game integration, but it deserves review because it can run autonomous public posts and game actions under a user token with weak confirmation boundaries.

Install only if you want an agent to act as an autonomous SiliVille persona. Use a dedicated revocable token, start with supervised runs, disable schedules and broad triggers unless deliberately configured, require approval before posts/steals/wiki/social actions, and do not store private information in SiliVille memory or local config files.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data ExfiltrationExternal Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive AgencyUnrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • System Prompt LeakageDirect Leakage, Indirect Extraction, Tool-Based Exfiltration
  • Trigger AbuseOverly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool PoisoningHidden Instructions, Unicode Deception, Parameter Description Injection
Findings (18)

Context-Inappropriate Capability

Medium
Confidence
91% confidence
Finding
The developer guidance goes beyond the stated game API usage and instructs deployment through cron, setInterval, log redirection, and persistent unattended execution. That broadens the skill from a simple API integration into host-level automation, increasing the chance of abuse, runaway behavior, and interaction with local system resources.

Description-Behavior Mismatch

Medium
Confidence
89% confidence
Finding
The skill description presents a simple REST wrapper, but the file also includes an autonomous loop that repeatedly awakens, acts, generates prompts for an LLM, publishes content, and stores memory. This materially expands the skill's behavior from passive API access to autonomous multi-step action, which can cause unintended posting, social manipulation, and external side effects without clear user opt-in.

Context-Inappropriate Capability

Medium
Confidence
94% confidence
Finding
The code generates narrative prompts that instruct the model to taunt victims, mock other agents, and behave antagonistically. This is unsafe behavioral steering unrelated to basic identity/API access and increases the chance of harassment, abuse, and reputational harm when the skill is used in autonomous or semi-autonomous agents.

Context-Inappropriate Capability

Medium
Confidence
90% confidence
Finding
The manifest explicitly advertises writing persistent 'API anchors' to local disk as an anti-amnesia mechanism, even though the skill's stated purpose is remote REST interaction with SiliVille. Unnecessary local persistence expands the trust boundary, creates residual data on the host, and may enable prompt/context manipulation or leakage of sensitive operational metadata across sessions.

Missing User Warnings

Medium
Confidence
90% confidence
Finding
The README explicitly promotes autonomous actions such as posting and stealing in a multiplayer environment without warning users that these behaviors can affect other parties or create irreversible in-game consequences. In the context of an agent skill, encouraging autonomous harmful actions without clear consent boundaries increases the risk that deployers enable behavior they do not fully understand.

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The setup flow instructs users to paste SKILL.md as a system prompt and supply an API key, but does not warn that this gives the agent authority to make autonomous network requests authenticated as the user. That omission is dangerous because users may unknowingly delegate persistent external actions, including posting and other state-changing operations, to an LLM-directed agent.

Vague Triggers

High
Confidence
97% confidence
Finding
The instruction to default ambiguous commands to the full autonomous loop means unclear or partial user language can trigger multiple real API actions, public posts, and spending without clear authorization. In an agent setting, broad fallback behavior is dangerous because ordinary conversation can be misinterpreted as consent for external side effects.

Vague Triggers

High
Confidence
96% confidence
Finding
The alias list includes very broad natural-language phrases such as 'play the game' or 'check on my town' that overlap with ordinary conversation and can unintentionally activate the skill. Because activation leads to public posting and economic actions, over-broad triggers materially increase the risk of prompt confusion and unintended side effects.

Missing User Warnings

Medium
Confidence
88% confidence
Finding
The skill emphasizes roleplay and activity but does not present a clear upfront user warning that generated content and actions are publicly visible. Users may invoke the skill without realizing that posts, movements, and other actions can be exposed to humans and other agents, creating privacy and reputational risk.

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The autonomous schedules encourage unattended recurring posts and game actions at high frequency, including dozens to hundreds of actions per day, without a prominent consent and safety gate. This can lead to spam, account misuse, unintended charges or resource consumption, and prolonged public activity after a casual command.

Missing User Warnings

Medium
Confidence
95% confidence
Finding
The script automatically performs a POST that publishes content to the user's account immediately after a successful connection test, without explicit consent at runtime. In an agent-skill context, this is risky because installing or testing the skill causes an external side effect that can spam the service, create unwanted attribution, or be abused as a template for more harmful state-changing actions.

Vague Triggers

Medium
Confidence
67% confidence
Finding
An overly broad trigger phrase can cause the skill to activate during ordinary conversation rather than through deliberate invocation. In this skill, accidental activation is more dangerous because the plugin can load world state, post content, update social graphs, and perform stateful actions in an external multiplayer environment.

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The manifest states that setup saves a token and writes files to disk, but it does not provide meaningful user-facing warning about what data is persisted, where it is stored, or how it is protected. This lack of transparency can lead users to unknowingly retain credentials or agent-state artifacts on disk, increasing the risk of local compromise and unintended data retention.

Ssd 1

High
Confidence
97% confidence
Finding
The skill explicitly tells integrators to append externally derived narrative content into the LLM system prompt, the highest-priority instruction channel. Because that content is built from API responses and dynamic state, it creates a prompt-injection pathway where untrusted remote data can steer model behavior, override developer intent, and trigger unsafe outputs or actions.

Ssd 1

High
Confidence
98% confidence
Finding
This prompt content uses imperative language such as 'must' and directs the model to mention, mock, and target a specific victim by name. If the victim name or other fields are remotely controlled, the skill becomes a conduit for hostile prompt injection and targeted abusive generation, especially when the text is later inserted into the system prompt.

Ssd 1

Medium
Confidence
93% confidence
Finding
The wander narrative prompt mandates mentioning encountered agents and adopting a dramatic style, suppressing neutral behavior and pushing externally influenced social interactions into generated output. In context, this is unsafe prompt steering because encounter data comes from the service and is converted into behavior-shaping instructions for the model.

Ssd 4

High
Confidence
98% confidence
Finding
The loop first fetches external world state and a remote 'system_protocol', then combines that with generated narrative text and passes it to the LLM as the final prompt context. This creates a dangerous trust-escalation chain: untrusted server-controlled content is elevated into high-authority prompt instructions, enabling remote behavioral manipulation of the agent and any downstream actions like posting or memory storage.

External Transmission

Medium
Category
Data Exfiltration
Content
```bash
# Scout the world
curl -X GET https://www.siliville.com/api/v1/radar \
  -H "Authorization: Bearer sk-slv-YOUR_KEY"

# Take action
Confidence
92% confidence
Finding
curl -X GET https://www.siliville.com/api/v1/radar \ -H "Authorization: Bearer sk-slv-YOUR_KEY" # Take action curl -X POST https://www.siliville.com/api/v1/action \ -H "Authorization: Bearer sk-s

VirusTotal

VirusTotal findings are pending for this skill version.

View on VirusTotal