Discord Voice

Security checks across malware telemetry and agentic risk

Overview

The Discord voice feature is coherent, but by default it lets anyone in a joined voice channel talk to a full-capability agent, and it leaves shared, persistent context behind unless tightly restricted.

If you install this skill, use it only in trusted Discord servers, set allowedUsers instead of leaving it empty, avoid auto-joining public channels, prefer local STT/TTS for private speech, keep TLS verification enabled, and use limited-scope Discord and API credentials.

VirusTotal

All 64 of 64 vendors reported this skill as clean.

Risk analysis

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

ASI01: Agent Goal Hijack

High

What this means

People in the Discord voice channel could verbally steer the agent, not just the installer or an explicitly trusted user.

Why it was flagged

The code acknowledges that, with the default empty allowlist, any user in joined Discord voice channels can interact with the bot and trigger API calls, while transcribed speech is routed to the agent.

Skill content

Handle transcribed speech - route to agent and get response ... No allowedUsers configured — all users in joined channels can interact with the bot and trigger API calls.

Recommendation

Set allowedUsers to explicit trusted Discord user IDs, avoid public voice channels, and require confirmation or a restricted mode for voice-origin instructions.
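
The allowlist recommendation can be illustrated with a minimal sketch. The helper name isAllowedSpeaker and the example user IDs are hypothetical; the plugin's actual handling of allowedUsers is not shown in this review and may differ. The point of the sketch is the fail-closed choice: an empty list admits nobody, rather than everybody.

```typescript
// Hypothetical helper, not the plugin's actual code.
// Fail closed: an empty allowlist admits nobody instead of everybody.
function isAllowedSpeaker(
  userId: string,
  allowedUsers: readonly string[],
): boolean {
  if (allowedUsers.length === 0) {
    return false;
  }
  return allowedUsers.some((id) => id === userId);
}

// Made-up Discord user ID, for illustration only.
const exampleAllowedUsers = ["123456789012345678"];

console.log(isAllowedSpeaker("123456789012345678", exampleAllowedUsers)); // true
console.log(isAllowedSpeaker("999999999999999999", exampleAllowedUsers)); // false
console.log(isAllowedSpeaker("123456789012345678", []));                  // false
```

A check like this would run before transcribed speech is routed to the agent, so that audio from unlisted speakers never reaches it.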

ASI02: Tool Misuse and Exploitation

High

What this means

A spoken prompt could potentially cause the agent to use broader tools or take actions beyond a simple voice reply.

Why it was flagged

The changelog indicates the voice-routed agent is not confined to a restricted lane, creating a broad tool-use surface for prompts originating from Discord voice.

Skill content

Remove lane restriction to allow full tool access

Recommendation

Use a limited tool lane for Discord voice, require approval for high-impact tools, and separate casual voice chat from administrative agent capabilities.
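
One way to picture a limited tool lane is a default-deny policy with an approval tier. This is only a sketch under assumptions: the type names, the decideToolCall function, and the tool names are all hypothetical and do not come from the plugin.

```typescript
// Hypothetical policy types; the plugin's tool routing may look different.
type ToolDecision = "allow" | "needs-approval" | "deny";

interface VoiceLanePolicy {
  // Tools a voice-origin prompt may call directly.
  allowedTools: readonly string[];
  // High-impact tools that require explicit confirmation first.
  approvalTools: readonly string[];
}

function decideToolCall(toolName: string, policy: VoiceLanePolicy): ToolDecision {
  if (policy.allowedTools.some((t) => t === toolName)) {
    return "allow";
  }
  if (policy.approvalTools.some((t) => t === toolName)) {
    return "needs-approval";
  }
  // Anything not listed is denied by default.
  return "deny";
}

// Made-up tool names, for illustration only.
const voicePolicy: VoiceLanePolicy = {
  allowedTools: ["reply", "search_docs"],
  approvalTools: ["send_message"],
};

console.log(decideToolCall("reply", voicePolicy));        // "allow"
console.log(decideToolCall("send_message", voicePolicy)); // "needs-approval"
console.log(decideToolCall("run_shell", voicePolicy));    // "deny"
```

Keeping the voice policy separate from the administrative one means that a spoken prompt cannot reach a tool simply because the agent has it installed.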

ASI06: Memory and Context Poisoning

Medium

What this means

One person’s spoken instructions could affect later conversations in the same Discord server.

Why it was flagged

The voice session is keyed at the guild level and saved, so multiple users in the same guild can share and persist context rather than being isolated per user or channel.

Skill content

const sessionKey = `discord:voice:${guildId}`; ... sessionStore[sessionKey] = sessionEntry; await deps.saveSessionStore(storePath, sessionStore);

Recommendation

Use per-user or per-channel sessions, add expiration/reset controls, and clear sessions after untrusted or public-channel use.
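
As a rough sketch of the isolation and expiration ideas, the guild-level key quoted above could be extended with channel and user IDs, and stale sessions could be replaced rather than reused. The VoiceSession shape, the 30-minute timeout, and both function names are assumptions made for illustration; they are not the plugin's actual schema.

```typescript
// Hypothetical session shape; field names are assumptions.
interface VoiceSession {
  lastActiveMs: number;
  history: string[];
}

// Assumed idle timeout of 30 minutes.
const SESSION_TTL_MS = 30 * 60 * 1000;

// Extends the guild-level key with channel and user so context is not shared.
function voiceSessionKey(guildId: string, channelId: string, userId: string): string {
  return `discord:voice:${guildId}:${channelId}:${userId}`;
}

// Returns the live session for the key, or a fresh one if it is missing or expired.
function getOrResetSession(
  store: Record<string, VoiceSession>,
  key: string,
  nowMs: number,
): VoiceSession {
  const existing = store[key];
  if (existing !== undefined && nowMs - existing.lastActiveMs <= SESSION_TTL_MS) {
    existing.lastActiveMs = nowMs;
    return existing;
  }
  const fresh: VoiceSession = { lastActiveMs: nowMs, history: [] };
  store[key] = fresh;
  return fresh;
}

console.log(voiceSessionKey("g1", "c1", "u1")); // "discord:voice:g1:c1:u1"
```

With keys like these, two users in the same guild get separate entries, and a session left idle past the timeout starts again with empty history.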

ASI03: Identity and Privilege Abuse

Low

What this means

Compromised or over-permissive tokens could allow bot misuse or unexpected provider charges.

Why it was flagged

The plugin requires a Discord bot token and can use provider API keys; this is expected for Discord voice/STT/TTS integration but grants meaningful account and billing authority.

Skill content

"discord.token": { "required": true ... } ... "OPENAI_API_KEY" ... "ELEVENLABS_API_KEY" ... "DEEPGRAM_API_KEY"

Recommendation

Use least-privilege Discord bot permissions, restrict provider keys, keep keys out of shared configs, and rotate keys if exposed.

ASI07: Insecure Inter-Agent Communication

Low

What this means

Voice content may leave Discord/OpenClaw and be processed by the selected speech providers.

Why it was flagged

Spoken audio/transcripts and response text may be sent to configured third-party STT/TTS providers; this is disclosed and purpose-aligned but involves sensitive voice data.

Skill content

Speech-to-Text: Whisper API (OpenAI), Deepgram, or Local Whisper ... Text-to-Speech: OpenAI TTS, ElevenLabs, Deepgram Aura, Amazon Polly, Edge TTS

Recommendation

Choose local providers for private conversations, review provider retention policies, and avoid using the bot in channels where sensitive speech may be captured.

ASI10: Rogue Agents

Low

What this means

If configured, the bot may join and remain active in a voice channel without a fresh manual join command each time.

Why it was flagged

The plugin can maintain or restore a voice connection and optionally auto-join on startup; this is disclosed and aligned with a voice bot, but it is persistent behavior users should control.

Skill content

Auto-reconnect: Automatic heartbeat monitoring and reconnection on disconnect ... autoJoinChannel ... Channel ID to auto-join on startup

Recommendation

Enable autoJoinChannel only for trusted private channels and confirm the bot leaves when voice use is finished.
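
A small guard can make both halves of this recommendation explicit. In the sketch below, only autoJoinChannel is a setting named in the skill content; the trustedChannelIds list and both function names are hypothetical additions for illustration.

```typescript
// Hypothetical guard: auto-join only when the configured channel is explicitly trusted.
function resolveAutoJoin(
  autoJoinChannel: string | undefined,
  trustedChannelIds: readonly string[],
): string | null {
  if (autoJoinChannel === undefined) {
    return null;
  }
  return trustedChannelIds.some((id) => id === autoJoinChannel)
    ? autoJoinChannel
    : null;
}

// Hypothetical leave rule: disconnect once no human members remain in the channel.
function shouldLeave(humanMemberCount: number): boolean {
  return humanMemberCount === 0;
}

// Made-up channel IDs, for illustration only.
console.log(resolveAutoJoin("111", ["111", "222"])); // "111"
console.log(resolveAutoJoin("333", ["111", "222"])); // null
console.log(shouldLeave(0));                          // true
```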

ASI04: Agentic Supply Chain Vulnerabilities

Low

What this means

Installing this plugin's dependencies pulls packages through the normal npm supply chain.

Why it was flagged

The skill relies on local npm dependency installation even though the registry says there is no install spec; package-lock is present, but users still need to trust the package source and dependency chain.

Skill content

cd ~/.openclaw/extensions/discord-voice && npm install

Recommendation

Install from the intended repository/package, review package-lock changes, and avoid running npm install from an untrusted clone.