Security audit

voice

Security checks across malware telemetry and agentic risk

Overview

This Discord voice skill does what it claims, but it needs careful review because spoken channel audio can be sent to external services and converted into full agent actions with broad default access.

Install only if you are comfortable with a Discord bot listening in joined voice channels, sending speech or generated responses to configured third-party services, and letting spoken input reach the host agent's normal tools. Restrict allowedUsers, avoid autoJoinChannel unless needed, disclose recording/transcription to channel participants, and prefer local STT/TTS providers for private conversations.
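The recommendations above can be turned into a small pre-install check. The following is a minimal sketch, not the skill's actual code: only `allowedUsers` and `autoJoinChannel` are named in this audit, while the field types, the `sttProvider`/`ttsProvider` fields, the provider names, and the assumption that an empty `allowedUsers` list means "no restriction" are all hypothetical.

```typescript
// Hypothetical config shape. Only `allowedUsers` and `autoJoinChannel` appear
// in the audit; the types and the two provider fields are assumptions.
interface VoiceConfig {
  allowedUsers: string[];
  autoJoinChannel?: string;
  sttProvider: string;
  ttsProvider: string;
}

// Illustrative names for providers assumed to run on the local host.
const LOCAL_PROVIDERS = new Set<string>(["local-whisper", "local-tts"]);

// Returns one human-readable warning per risky setting; an empty array means
// the config follows all three recommendations.
function reviewConfig(config: VoiceConfig): string[] {
  const warnings: string[] = [];
  // Assumption: an empty allowlist is treated as "everyone allowed".
  if (config.allowedUsers.length === 0) {
    warnings.push("allowedUsers is empty: any speaker may reach the agent");
  }
  if (config.autoJoinChannel !== undefined && config.autoJoinChannel !== "") {
    warnings.push("autoJoinChannel is set: the bot joins without an explicit request");
  }
  for (const provider of [config.sttProvider, config.ttsProvider]) {
    if (!LOCAL_PROVIDERS.has(provider)) {
      warnings.push(`provider "${provider}" may send audio or text off-host`);
    }
  }
  return warnings;
}
```

Running `reviewConfig` against the intended configuration before enabling the skill gives a quick list of the settings that still need a deliberate decision.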

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (14)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 92%
Finding: The skill metadata declares required configuration and environment variables and clearly describes networked STT/TTS behavior, but it does not declare permissions corresponding to environment access and outbound network use. This weakens platform trust boundaries because users and automated tooling cannot accurately assess the skill's effective capabilities before installation.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: Transcribed speech from any participant in a joined voice channel is routed into the embedded agent with an explicit prompt stating it has access to all normal tools and skills. In a voice setting, this expands untrusted spoken input into broad agent capability execution, increasing the risk of prompt injection, unauthorized tool use, and side effects such as data access or external actions if the agent's toolset is powerful.
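One way to narrow this exposure is to drop transcripts from non-allowlisted speakers and hand the agent a reduced tool list for voice input. The sketch below illustrates that idea only; `Transcript`, `VoicePolicy`, `AgentRequest`, and `toAgentRequest` are hypothetical names, and nothing here reflects how the skill or its host agent actually builds prompts.

```typescript
// All types here are hypothetical and exist only for illustration.
interface Transcript {
  speakerId: string;
  text: string;
}

interface VoicePolicy {
  allowedUsers: Set<string>; // speakers whose speech may reach the agent
  voiceTools: string[];      // reduced tool set exposed to voice input
}

interface AgentRequest {
  prompt: string;
  allowedTools: string[];
}

// Returns null for speakers outside the allowlist so their speech never
// reaches the agent; otherwise wraps the transcript and limits the tools.
function toAgentRequest(t: Transcript, policy: VoicePolicy): AgentRequest | null {
  if (!policy.allowedUsers.has(t.speakerId)) {
    return null;
  }
  // JSON.stringify escapes quotes and newlines, so the spoken text stays
  // inside one quoted string instead of blending into the surrounding prompt.
  const quoted = JSON.stringify(t.text);
  return {
    prompt: `Voice request from allowlisted user ${t.speakerId}: ${quoted}`,
    allowedTools: [...policy.voiceTools],
  };
}
```

Quoting does not prevent prompt injection by itself; the reduced `voiceTools` list is what bounds the side effects a spoken request can trigger.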

Context-Inappropriate Capability

Medium

Confidence: 79%
Finding: The WyomingWhisperSTT provider allows outbound raw TCP connections to an arbitrary configurable host and port, which expands the skill's network reach beyond normal Discord API usage. In an agent/skill environment, this can be abused for data exfiltration, internal network access, or SSRF-style pivoting if untrusted configuration can control the destination.
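A common mitigation for a configurable raw TCP destination is to validate it against an operator-controlled allowlist before connecting. The following is a minimal sketch under that assumption; `Endpoint` and `isAllowedEndpoint` are hypothetical names, and the host and port values in the usage note are illustrative rather than taken from the skill.

```typescript
// Hypothetical endpoint type; the skill's real config fields are not shown
// in this audit.
interface Endpoint {
  host: string;
  port: number;
}

// Accepts a destination only if it exactly matches an allowlisted host/port
// pair. Host comparison is case-insensitive; no DNS resolution is performed,
// so the allowlist should contain the literal values used in the config.
function isAllowedEndpoint(target: Endpoint, allowlist: Endpoint[]): boolean {
  if (!Number.isInteger(target.port) || target.port < 1 || target.port > 65535) {
    return false;
  }
  const host = target.host.trim().toLowerCase();
  return allowlist.some(
    (entry) => entry.host.toLowerCase() === host && entry.port === target.port,
  );
}
```

For example, with an allowlist of `[{ host: "127.0.0.1", port: 10300 }]`, a config pointing at a cloud metadata address or an internal service would be rejected before any socket is opened.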

Missing User Warnings

Medium

Confidence: 85%
Finding: The README clearly describes capturing users' voice in Discord channels, transcribing it, sending it to external STT providers, processing it through an agent, and speaking back results, but it does not prominently warn about consent, privacy, retention, or third-party data handling. In a multi-user voice environment, this can lead to unconsented collection and transfer of participants' speech to external services, creating privacy and compliance risk.

Missing User Warnings

Medium

Confidence: 95%
Finding: This skill processes live voice conversations, transcribes speech, sends text/audio to an agent, and may use third-party STT/TTS providers, but the user-facing description does not prominently warn that spoken audio may be recorded and transmitted off-device. In a voice-chat context, lack of explicit disclosure can lead to uninformed collection of sensitive conversations and privacy violations for both users and bystanders in the channel.

Missing User Warnings

Medium

Confidence: 94%
Finding: The manifest explicitly supports multiple cloud STT/TTS providers and requests API keys, which strongly implies user voice data may be sent off-host to third-party services. Failing to disclose that audio/transcript content may leave the local environment creates a real privacy and consent issue, especially for a real-time Discord voice skill where users may not expect external processing.

Missing User Warnings

Medium

Confidence: 91%
Finding: This code captures Discord voice audio and sends it to configurable third-party STT providers such as Deepgram, OpenAI, or others, but this file contains no user-facing disclosure, consent flow, or gating before transmission. In a real-time voice skill, that means participants may have their speech exported off-platform without clear notice, creating privacy, compliance, and trust risks even if the behavior is functionally intended.
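The missing gating step could take the form of a per-user consent check applied before any audio is handed to an STT provider. The sketch below shows one possible shape; `ConsentRegistry`, `AudioSegment`, and `segmentsToTranscribe` are hypothetical names, and how consent is collected (for example, a bot command) is left out as an assumption.

```typescript
// Hypothetical in-memory consent store keyed by Discord user ID.
class ConsentRegistry {
  private readonly consented = new Set<string>();

  grant(userId: string): void {
    this.consented.add(userId);
  }

  revoke(userId: string): void {
    this.consented.delete(userId);
  }

  has(userId: string): boolean {
    return this.consented.has(userId);
  }
}

// Hypothetical captured-audio record; `pcm` stands in for raw audio bytes.
interface AudioSegment {
  speakerId: string;
  pcm: Uint8Array;
}

// Keeps only segments from speakers who have opted in, so audio from
// everyone else is discarded before it can leave the host.
function segmentsToTranscribe(
  segments: AudioSegment[],
  consent: ConsentRegistry,
): AudioSegment[] {
  return segments.filter((segment) => consent.has(segment.speakerId));
}
```

Filtering at capture time, rather than after transcription, means audio from participants who have not opted in is never transmitted to a third party.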

Missing User Warnings

Low

Confidence: 84%
Finding: The skill sends transcribed or generated text to external TTS providers without any explicit warning in this component that content will be processed by third parties. While less sensitive than raw voice in many cases, responses may still include personal, confidential, or server-specific information, so silent export to outside services creates avoidable privacy exposure.

External Transmission

Medium

Category: Data Exfiltration
Content: async synthesize(text: string): Promise<TTSResult> { // TTS via SkillBoss API Hub /v1/pilot — auto-routes to best TTS model const response = await fetch("https://api.heybossai.com/v1/pilot", { method: "POST", headers: { Authorization: `Bearer ${this.apiKey}`,
Confidence: 91%
Finding: fetch("https://api.heybossai.com/v1/pilot", { method: "POST"

External Transmission

Medium

Category: Data Exfiltration
Content: async synthesize(text: string): Promise<TTSResult> { // TTS via SkillBoss API Hub /v1/pilot — auto-routes to best TTS model const response = await fetch("https://api.heybossai.com/v1/pilot", { method: "POST", headers: { Authorization: `Bearer ${this.apiKey}`,
Confidence: 91%
Finding: fetch("https://api.heybossai.com/v1/pilot", { method: "POST"

External Transmission

Medium

Category: Data Exfiltration
Content: async synthesize(text: string): Promise<TTSResult> { // TTS via SkillBoss API Hub /v1/pilot — auto-routes to best TTS model const response = await fetch("https://api.heybossai.com/v1/pilot", { method: "POST", headers: { Authorization: `Bearer ${this.apiKey}`,
Confidence: 91%
Finding: fetch("https://api.heybossai.com/v1/pilot", { method: "POST"

External Transmission

Medium

Category: Data Exfiltration
Content: async synthesize(text: string): Promise<TTSResult> { // TTS via SkillBoss API Hub /v1/pilot — auto-routes to best TTS model const response = await fetch("https://api.heybossai.com/v1/pilot", { method: "POST", headers: { Authorization: `Bearer ${this.apiKey}`,
Confidence: 91%
Finding: https://api.heybossai.com/

External Transmission

Medium

Category: Data Exfiltration
Content: async synthesize(text: string): Promise<TTSResult> { // TTS via SkillBoss API Hub /v1/pilot — auto-routes to best TTS model const response = await fetch("https://api.heybossai.com/v1/pilot", { method: "POST", headers: { Authorization: `Bearer ${this.apiKey}`,
Confidence: 91%
Finding: https://api.heybossai.com/

External Transmission

Medium

Category: Data Exfiltration
Content: async synthesize(text: string): Promise<TTSResult> { // TTS via SkillBoss API Hub /v1/pilot — auto-routes to best TTS model const response = await fetch("https://api.heybossai.com/v1/pilot", { method: "POST", headers: { Authorization: `Bearer ${this.apiKey}`,
Confidence: 91%
Finding: https://api.heybossai.com/

VirusTotal

61/61 vendors flagged this skill as clean.


Static analysis

Detected: suspicious.env_credential_access, suspicious.exposed_secret_literal, suspicious.insecure_tls_verification

Environment variable access combined with network send.

Critical

Code: suspicious.env_credential_access
Location: index.ts:156

Environment variable access combined with network send.

Critical

Code: suspicious.env_credential_access
Location: src/streaming-tts.ts:45

Environment variable access combined with network send.

Critical

Code: suspicious.env_credential_access
Location: src/stt.ts:37

Environment variable access combined with network send.

Critical

Code: suspicious.env_credential_access
Location: src/tts.ts:48
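Reading an API key from the environment and sending it over the network is expected for a cloud STT/TTS client, so the practical question is whether each key can only reach its own provider. The sketch below illustrates one way to enforce that; `PROVIDER_HOSTS` and `mayAttachKey` are hypothetical, and the host mapping is an assumption for illustration, not a description of the flagged lines.

```typescript
// Hypothetical mapping from provider name to the only host that may receive
// that provider's API key. The entries are illustrative assumptions.
const PROVIDER_HOSTS = new Map<string, string>([
  ["deepgram", "api.deepgram.com"],
  ["openai", "api.openai.com"],
]);

// Returns true only when the request targets the provider's expected host
// over HTTPS. Unknown providers and unparseable URLs are rejected.
function mayAttachKey(provider: string, url: string): boolean {
  const expectedHost = PROVIDER_HOSTS.get(provider);
  if (expectedHost === undefined) {
    return false;
  }
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return false; // not a valid absolute URL
  }
  return parsed.protocol === "https:" && parsed.hostname === expectedHost;
}
```

A client would call `mayAttachKey` before adding its authorization header, so a misconfigured or attacker-supplied base URL cannot receive a key intended for another service.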

File appears to expose a hardcoded API secret or token.

Critical

Code: suspicious.exposed_secret_literal
Location: src/config.ts:269

File appears to expose a hardcoded API secret or token.

Critical

Code: suspicious.exposed_secret_literal
Location: src/streaming-tts.ts:110

File appears to expose a hardcoded API secret or token.

Critical

Code: suspicious.exposed_secret_literal
Location: src/tts.ts:96

HTTPS certificate verification is disabled.

Warn

Code: suspicious.insecure_tls_verification
Location: index.ts:158