The Arena — AI Debate Moderator

Security checks across malware telemetry and agentic risk

Overview

The skill is a coherent Discord debate moderator, but its default-style setup can give the bot broader server access than the debate-channel role requires.

Review before installing. Restrict the Discord binding to explicit debate channels, set requireMention to true outside the arena, avoid granting Manage Channels or Manage Roles unless necessary, disable or tightly sandbox scoreboard shell execution, and tell members which messages and debate records the bot can read and retain.
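As a rough aid for the permission review above, the sketch below checks a Discord permissions integer (for example, the value in a bot invite link) for the grants this report advises against. The bit positions follow Discord's documented permission flags; the function name and the choice to include Administrator are assumptions made for illustration and are not part of the skill.

```python
# Permission bit flags as documented by Discord.
ADMINISTRATOR = 1 << 3
MANAGE_CHANNELS = 1 << 4
MANAGE_ROLES = 1 << 28

# Grants a debate-only bot normally should not need (illustrative selection).
RISKY_PERMISSIONS = {
    "Administrator": ADMINISTRATOR,
    "Manage Channels": MANAGE_CHANNELS,
    "Manage Roles": MANAGE_ROLES,
}


def risky_permissions(permissions):
    """Return the names of risky permissions set in a permissions integer."""
    return [name for name, bit in RISKY_PERMISSIONS.items() if permissions & bit]


if __name__ == "__main__":
    view_channel = 1 << 10
    send_messages = 1 << 11
    # A read-and-reply bot carries none of the risky grants.
    print(risky_permissions(view_channel | send_messages))
    # A bot invited with channel and role management is flagged.
    print(risky_permissions(view_channel | MANAGE_CHANNELS | MANAGE_ROLES))
```

Running a check like this before approving an invite link makes it easier to spot broad grants that the debate-channel role does not require.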

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (8)

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The template explicitly instructs the agent to execute a local shell script for scoreboard operations. Even though the intended use is benign, this expands the agent's capabilities from moderation into command execution, and several arguments to the script are derived from user-controlled debate content such as participant names, topics, and formats. If the surrounding runtime does not strictly sandbox and safely pass arguments, this creates command-invocation and data-integrity risk.
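If scoreboard execution is kept, one way to reduce the argument-handling risk described in this finding is to validate every user-derived value and pass it as a separate argv element with no shell involved. The following is a minimal sketch under stated assumptions: the script path, the argument order, the format names, and the character allowlist are all hypothetical and do not reflect the skill's actual scoreboard interface.

```python
import re
import subprocess

# Hypothetical allowlist of debate formats; the real skill's formats may differ.
ALLOWED_FORMATS = {"oxford", "lincoln-douglas", "devils-advocate"}

# Conservative pattern for names and topics: starts with a letter or digit
# (so a value can never look like a command-line flag), then up to 99
# letters, digits, spaces, or light punctuation.
SAFE_TEXT = re.compile(r"[A-Za-z0-9][A-Za-z0-9 _.,'?!-]{0,99}")


def build_scoreboard_argv(script_path, participant, topic, debate_format):
    """Validate user-derived values and return an argv list, never a shell string."""
    if debate_format not in ALLOWED_FORMATS:
        raise ValueError(f"unknown debate format: {debate_format!r}")
    for label, value in (("participant", participant), ("topic", topic)):
        if not SAFE_TEXT.fullmatch(value):
            raise ValueError(f"unsafe {label}: {value!r}")
    # Hypothetical argument order; adapt to the real script's interface.
    return [script_path, participant, topic, debate_format]


def run_scoreboard(argv, timeout_seconds=5):
    """Run the script without a shell, so arguments are not re-parsed."""
    return subprocess.run(
        argv,
        shell=False,
        check=True,
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
    )
```

Validation alone does not replace sandboxing: the script should still run with minimal file-system and network access, since the allowlist only limits what reaches it as arguments.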

Vague Triggers

Medium

Confidence: 82%
Finding: The onboarding trigger phrase "set up a debate server" is broad enough that it could be invoked during ordinary conversation, especially in a chatty Discord environment. Unintended activation can cause the agent to begin collecting configuration details or generating config guidance in the wrong context, increasing the chance of accidental disclosure, confusion, or unsafe administrative actions.

Vague Triggers

Medium

Confidence: 87%
Finding: The phrase "let's debate" is a common conversational expression and lacks channel, role, or mention constraints. In semi-public servers this can trigger debate orchestration unexpectedly, allowing ordinary chatter or adversarial prompting to start workflows, consume tokens, and widen exposure to prompt-injection attempts from untrusted participants.

Vague Triggers

Medium

Confidence: 84%
Finding: Using `help` as a bare trigger is overly broad in a Discord debate environment, where users may naturally say 'help' in normal conversation. That can cause unintended command execution or mode switches, leading to noisy behavior and accidental disclosure of command capabilities at inappropriate times. In a multi-user server context, ambiguous triggers are more dangerous because ordinary chat frequently overlaps with command words.

Vague Triggers

Medium

Confidence: 88%
Finding: The phrase `ready` is common conversational language and is especially likely to appear organically during debate setup and verdict checks. Because the template assigns it operational meaning, ordinary user messages could unintentionally advance debate state or trigger verdict-readiness logic. In this skill, where channel flow and state transitions matter, accidental activation can disrupt moderation and fairness.

Vague Triggers

Low

Confidence: 77%
Finding: `rules` is a generic term that users will likely mention in ordinary discussion, especially in a server dedicated to structured debates. This can trigger unintended responses or command handling, though the impact is lower than state-changing commands because it mainly causes confusion or spam rather than directly altering debate outcomes. The Discord server context increases collision likelihood due to frequent policy discussions.
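The Vague Triggers findings above share a common mitigation: treat a message as a command only when it carries an explicit prefix and arrives either in a designated debate channel or with a direct mention of the bot. The sketch below illustrates that gating logic in isolation. The `!arena` prefix, the channel identifier, the command names, and the `IncomingMessage` type are all hypothetical and are not taken from the skill.

```python
from dataclasses import dataclass

# All of these values are illustrative assumptions, not the skill's real config.
COMMAND_PREFIX = "!arena"
ARENA_CHANNEL_IDS = frozenset({"arena-main"})
KNOWN_COMMANDS = frozenset({"debate", "help", "ready", "rules"})


@dataclass(frozen=True)
class IncomingMessage:
    text: str
    channel_id: str
    mentions_bot: bool


def parse_command(message, arena_channels=ARENA_CHANNEL_IDS):
    """Return the command name for an explicit, in-scope command, else None."""
    text = message.text.strip()
    # Ordinary chat such as "are you ready?" never matches without the prefix.
    if not text.startswith(COMMAND_PREFIX + " "):
        return None
    # Outside the arena, additionally require a direct mention of the bot.
    if message.channel_id not in arena_channels and not message.mentions_bot:
        return None
    command = text[len(COMMAND_PREFIX):].strip().lower()
    return command if command in KNOWN_COMMANDS else None


if __name__ == "__main__":
    print(parse_command(IncomingMessage("!arena ready", "arena-main", False)))
    print(parse_command(IncomingMessage("are you ready?", "arena-main", False)))
    print(parse_command(IncomingMessage("!arena help", "general", False)))
```

With gating of this kind, words like "help", "ready", and "rules" keep their operational meaning only inside an explicit command, so organic conversation cannot advance debate state.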

Missing User Warnings

Medium

Confidence: 94%
Finding: The Devil's Advocate format explicitly requires participants to publicly disclose their actual beliefs before being assigned the opposite side. In a Discord setting, that can expose sensitive political, religious, ethical, or personal viewpoints to a broad audience, creating privacy, harassment, or social-engineering risks. The debate context makes this more dangerous because users may feel pressured to reveal genuine beliefs as a condition of participation.

Ssd 2

Medium

Confidence: 90%
Finding: The document explicitly permits "unrestricted" topics and says the moderator judges arguments regardless of subject matter, which can normalize facilitation of harmful, abusive, extremist, or otherwise unsafe debate content. In a semi-public Discord setting, this broad allowance increases the likelihood the agent will engage with dangerous topics that should be refused, de-escalated, or routed through safety policy constraints.

VirusTotal

65/65 vendors flagged this skill as clean.
