Discord Voice Using Deepgram

Advisory. Audited by Static analysis on May 10, 2026.

Overview

Detected: suspicious.env_credential_access, suspicious.exposed_secret_literal

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Concern (High Confidence)

ASI02: Tool Misuse and Exploitation

What this means

Anyone whose speech is accepted in the voice channel, or a bad transcription, could potentially trigger whatever tools and skills the user’s main agent can normally use.

Why it was flagged

Transcribed speech is used as the agent prompt while the voice session explicitly grants access to all normal tools and does not use a restricted lane.

Skill content

const extraSystemPrompt = `You are ${agentName}, speaking in a Discord voice channel... You have access to all your normal tools and skills. The user's Discord ID is ${userId}.`; ... prompt: text, ... // lane: "discord-voice",  // Removed - was possibly restricting tool access

Recommendation

Use a restricted voice lane or tool allowlist, require confirmation for high-impact tools, and enforce primaryUser or allowedUsers before the bot joins a channel.
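
The recommendation above can be sketched as a small policy check that runs before any tool call originating from a voice transcript. This is a minimal illustration, not the skill's actual code: `VoiceToolPolicy`, `decideToolCall`, and the tool names are all hypothetical, and how the host agent exposes tool calls is an assumption.

```typescript
// Hypothetical sketch: none of these identifiers come from the skill's source.
type ToolDecision = "allow" | "confirm" | "deny";

interface VoiceToolPolicy {
  // Tools that may run from voice input without a prompt.
  allowedTools: ReadonlySet<string>;
  // High-impact tools that need an explicit confirmation step first.
  confirmTools: ReadonlySet<string>;
}

function decideToolCall(policy: VoiceToolPolicy, toolName: string): ToolDecision {
  if (policy.confirmTools.has(toolName)) return "confirm";
  if (policy.allowedTools.has(toolName)) return "allow";
  // Anything not listed is rejected, so new tools are never exposed by accident.
  return "deny";
}

// Example policy with placeholder tool names.
const voicePolicy: VoiceToolPolicy = {
  allowedTools: new Set(["web_search", "get_weather"]),
  confirmTools: new Set(["send_email"]),
};
```

With this shape, a mis-transcribed request for an unlisted tool is rejected by default, and a listed high-impact tool still needs a second step before it runs.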

Concern (High Confidence)

ASI09: Human-Agent Trust Exploitation

What this means

If the installer does not set a primary speaker or allowlist, other people in the Discord voice channel may be able to talk to and control the agent.

Why it was flagged

The actual configuration indicates an empty allowlist permits all users, which is easy to miss because SKILL.md presents the 'only listen to you' safeguard as the default behavior.

Skill content

"allowedUsers": { "type": "array", ... "default": [] } ... "help": "Discord user IDs allowed to use voice (empty = all allowed)"

Recommendation

Make the default deny-by-default, require a primaryUser during setup, and update the documentation to clearly say that an empty allowlist means all channel users are allowed.
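
A deny-by-default check might look like the sketch below. The field names `primaryUser` and `allowedUsers` come from the skill's configuration, but `VoiceAccessConfig` and `isSpeakerAllowed` are hypothetical, and the skill's real check may be structured differently.

```typescript
// Hypothetical sketch of a deny-by-default speaker check.
interface VoiceAccessConfig {
  primaryUser?: string;
  allowedUsers: string[];
}

function isSpeakerAllowed(cfg: VoiceAccessConfig, userId: string): boolean {
  // The primary user is always accepted when one is configured.
  if (cfg.primaryUser !== undefined && cfg.primaryUser !== "" && cfg.primaryUser === userId) {
    return true;
  }
  // An empty allowlist matches nobody, which reverses the current
  // "empty = all allowed" behavior.
  return cfg.allowedUsers.includes(userId);
}
```

The key difference from the flagged configuration is that an installer who sets nothing gets a bot that listens to no one, rather than to everyone in the channel.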

Concern (Medium Confidence)

ASI06: Memory and Context Poisoning

What this means

Speech from one accepted Discord user may affect the ongoing memory/context used for later voice interactions in the same guild.

Why it was flagged

Voice transcripts are routed into a persistent agent session keyed only by guild, not by individual speaker, with no retention or speaker-boundary controls described.

Skill content

const sessionKey = `discord:voice:${guildId}`; ... const sessionStore = deps.loadSessionStore(storePath); ... await deps.saveSessionStore(storePath, sessionStore); ... sessionFile, ... prompt: text

Recommendation

Document retention clearly, consider per-user sessions, provide a way to clear voice session history, and avoid reusing multi-user voice context for sensitive tasks.
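
One way to approach per-user sessions and a clear-history control is sketched below. The `discord:voice:${guildId}` prefix matches the flagged code, but the per-user suffix, the `SessionStore` shape, and both function names are assumptions. The real store returned by `deps.loadSessionStore` may not be a plain key-value object.

```typescript
// Assumed shape: a plain object keyed by session key.
type SessionStore = Record<string, unknown>;

// Hypothetical: scope the session to a speaker as well as a guild.
function voiceSessionKey(guildId: string, userId: string): string {
  return `discord:voice:${guildId}:${userId}`;
}

// Hypothetical: remove the guild-wide session and every per-user session
// for one guild, returning how many entries were deleted.
function clearVoiceSessions(store: SessionStore, guildId: string): number {
  const prefix = `discord:voice:${guildId}`;
  let removed = 0;
  for (const key of Object.keys(store)) {
    // Match the exact guild key or keys with a ":" suffix, so that
    // guild "g1" does not also clear guild "g10".
    if (key === prefix || key.startsWith(prefix + ":")) {
      delete store[key];
      removed++;
    }
  }
  return removed;
}
```

After clearing, the caller would still need to persist the store (the flagged code does this with `deps.saveSessionStore`).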

Note (High Confidence)

ASI03: Identity and Privilege Abuse

What this means

Installing the skill gives the plugin access to your Discord bot identity and your Deepgram account for transcription and speech generation.

Why it was flagged

The skill requires account credentials and Discord bot permissions, which are expected for a Discord voice/STT/TTS integration but still sensitive.

Skill content

- A Discord bot token (`DISCORD_TOKEN`)
- A Deepgram API key (`DEEPGRAM_API_KEY`)
- Discord bot permissions in your server: Connect, Speak, Use Voice Activity

Recommendation

Use a dedicated Discord bot token, limit Discord permissions to the minimum needed, store the Deepgram key as a secret, and rotate both credentials if you remove the skill.

Note (High Confidence)

ASI07: Insecure Inter-Agent Communication

What this means

Voice audio, transcripts, and agent replies may be processed by Deepgram outside your local OpenClaw environment.

Why it was flagged

The skill discloses the external provider flow: Discord audio and generated text are sent to Deepgram as part of the voice pipeline.

Skill content

- Discord voice audio → Deepgram streaming STT (WebSocket)
- Transcript → your agent
- Agent reply → Deepgram TTS (`/v1/speak` streamed HTTP Ogg/Opus)

Recommendation

Confirm that all voice-channel participants consent to this processing and review Deepgram’s data handling settings and retention policy.

Note (High Confidence)

ASI10: Rogue Agents

What this means

If auto-join is configured, the bot may enter a voice channel and start listening whenever the service starts.

Why it was flagged

The plugin can automatically join a configured voice channel on startup, which is purpose-aligned but creates an ongoing listener if enabled.

Skill content

if (cfg.autoJoinChannel) { ... api.logger.info(`[deepgram-discord-voice] Auto-joining channel ${cfg.autoJoinChannel}`); ... await vm.join(channel as VoiceBasedChannel); }

Recommendation

Leave autoJoinChannel unset unless you explicitly want this behavior, and provide clear status/leave controls for users in the Discord server.
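
As an illustration, the startup path could refuse to auto-join unless a speaker restriction is also configured, which combines this recommendation with the allowlist advice given earlier. `resolveAutoJoinChannel` and `AutoJoinConfig` are hypothetical; only the config field names come from the skill.

```typescript
// Hypothetical sketch: decide at startup whether auto-join should happen.
interface AutoJoinConfig {
  autoJoinChannel?: string;
  primaryUser?: string;
  allowedUsers?: string[];
}

function resolveAutoJoinChannel(cfg: AutoJoinConfig): string | null {
  const channelId = cfg.autoJoinChannel?.trim();
  // Unset or blank: never auto-join.
  if (!channelId) return null;
  // Assumption: auto-join is only safe when someone specific is allowed to speak.
  const hasSpeakerLimit =
    Boolean(cfg.primaryUser) || (cfg.allowedUsers?.length ?? 0) > 0;
  return hasSpeakerLimit ? channelId : null;
}
```

A caller would join only when the result is non-null, and could log the reason when auto-join is skipped so the operator notices the missing restriction.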

Note (Medium Confidence)

ASI04: Agentic Supply Chain Vulnerabilities

What this means

It may be harder to confirm that the reviewed package is the exact package and publisher you intended to install.

Why it was flagged

The package-internal metadata differs from the registry summary, which lists a different owner ID, slug, and version; this is a provenance/versioning gap for a credential-using plugin.

Skill content

"ownerId": "kn738y5j6ep0rq1e9efavpzkth7zyh41", "slug": "deepgram-discord-voice", "version": "0.2.0"

Recommendation

Verify the publisher and source before installing, and prefer a package with matching registry/package metadata and a clear upstream repository.
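
The metadata comparison can be sketched as a field-by-field diff. The `packaged` values below are the ones quoted from the package; the registry values are placeholders because the report does not state them, and `mismatchedFields` is a hypothetical helper.

```typescript
interface PackageIdentity {
  ownerId: string;
  slug: string;
  version: string;
}

// Return the names of the identity fields that differ between two sources.
function mismatchedFields(pkg: PackageIdentity, registry: PackageIdentity): string[] {
  const fields: (keyof PackageIdentity)[] = ["ownerId", "slug", "version"];
  return fields.filter((field) => pkg[field] !== registry[field]);
}

// Values quoted from the package-internal metadata.
const packaged: PackageIdentity = {
  ownerId: "kn738y5j6ep0rq1e9efavpzkth7zyh41",
  slug: "deepgram-discord-voice",
  version: "0.2.0",
};

// Placeholders only: substitute the values shown in the registry summary.
const listed: PackageIdentity = {
  ownerId: "<registry-owner-id>",
  slug: "<registry-slug>",
  version: "<registry-version>",
};
```

Any non-empty result from `mismatchedFields(packaged, listed)` is a reason to pause and confirm the source before supplying credentials.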

Findings (4)

critical

suspicious.env_credential_access

Location: src/streaming-tts.ts:22
Finding: Environment variable access combined with network send.

critical

suspicious.env_credential_access

Location: src/stt.ts:22
Finding: Environment variable access combined with network send.

critical

suspicious.env_credential_access

Location: src/tts.ts:23
Finding: Environment variable access combined with network send.

critical

suspicious.exposed_secret_literal

Location: src/streaming-tts.ts:37
Finding: File appears to expose a hardcoded API secret or token.
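
If the flagged literal in src/streaming-tts.ts turns out to be a real key, a common remedy is to read the secret from the environment, fail fast when it is missing, and redact it in logs. The sketch below assumes this approach; `requireSecret` and `redact` are hypothetical helpers, and the environment is passed in as a parameter so the example does not depend on any runtime.

```typescript
// Environment passed in explicitly; in Node this would typically be process.env.
type Env = Record<string, string | undefined>;

// Return the named secret, or throw if it is unset or blank.
function requireSecret(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) throw new Error(`Missing required secret: ${name}`);
  return value;
}

// Show at most the last four characters so logs never contain the full key.
function redact(secret: string): string {
  return secret.length > 8 ? `****${secret.slice(-4)}` : "****";
}
```

A caller might use `requireSecret(env, "DEEPGRAM_API_KEY")` once at startup and pass the result to the STT and TTS clients, logging only `redact(key)` if the key needs to appear in diagnostics.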