Security audit

voice

Security checks across malware telemetry and agentic risk

Overview

This Discord voice skill does what it claims, but it needs careful review because spoken channel audio can be sent to external services and converted into full agent actions with broad default access.

Install only if you are comfortable with a Discord bot listening in joined voice channels, sending speech or generated responses to configured third-party services, and letting spoken input reach the host agent's normal tools. Restrict allowedUsers, avoid autoJoinChannel unless needed, disclose recording/transcription to channel participants, and prefer local STT/TTS providers for private conversations.
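
The recommendations above can be checked mechanically before the bot starts. The following is a minimal sketch in TypeScript, not the skill's actual configuration schema. Only `allowedUsers` and `autoJoinChannel` are named in this audit; `sttProvider`, `ttsProvider`, the provider identifiers, and `reviewVoiceConfig` are hypothetical names chosen for illustration.

```typescript
// Hypothetical config shape. Only allowedUsers and autoJoinChannel appear in the
// audit text; sttProvider, ttsProvider, and the provider names are assumptions.
interface VoiceConfig {
  allowedUsers: string[];
  autoJoinChannel?: string;
  sttProvider: string;
  ttsProvider: string;
}

// Assumed identifiers for providers that run on the local host.
const LOCAL_PROVIDERS: ReadonlyArray<string> = ["local-stt", "local-tts"];

function isLocalProvider(provider: string): boolean {
  return LOCAL_PROVIDERS.indexOf(provider) !== -1;
}

// Returns one warning per risky setting; an empty array means nothing was flagged.
function reviewVoiceConfig(cfg: VoiceConfig): string[] {
  const warnings: string[] = [];
  // The audit does not say how the skill treats an empty list, so flag it for review.
  if (cfg.allowedUsers.length === 0) {
    warnings.push("allowedUsers is empty: restrict it to specific user IDs");
  }
  if (cfg.autoJoinChannel !== undefined && cfg.autoJoinChannel !== "") {
    warnings.push("autoJoinChannel is set: the bot will listen without an explicit join");
  }
  if (!isLocalProvider(cfg.sttProvider)) {
    warnings.push("sttProvider is not local: channel audio may leave the host");
  }
  if (!isLocalProvider(cfg.ttsProvider)) {
    warnings.push("ttsProvider is not local: response text may leave the host");
  }
  return warnings;
}
```

Returning a list of warnings rather than throwing lets an operator see every risky setting at once and decide which ones are acceptable for a given server.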

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (14)

Lp3

Medium
Category
MCP Least Privilege
Confidence
92% confidence
Finding
The skill metadata declares required configuration and environment variables and clearly describes networked STT/TTS behavior, but it does not declare permissions corresponding to environment access and outbound network use. This weakens platform trust boundaries because users and automated tooling cannot accurately assess the skill's effective capabilities before installation.

Context-Inappropriate Capability

Medium
Confidence
91% confidence
Finding
Transcribed speech from any participant in a joined voice channel is routed into the embedded agent with an explicit prompt stating it has access to all normal tools and skills. In a voice setting, this expands untrusted spoken input into broad agent capability execution, increasing the risk of prompt injection, unauthorized tool use, and side effects such as data access or external actions if the agent's toolset is powerful.
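
One way to narrow the exposure described in this finding is to gate each transcript on the speaker before it reaches the agent. The sketch below is illustrative only and is not taken from the skill's source: `Transcript`, `GateDecision`, and `gateTranscript` are hypothetical names, and labeling the text as untrusted reduces, but does not eliminate, prompt-injection risk.

```typescript
// Hypothetical types; the skill's real transcript and prompt structures are not shown in the audit.
interface Transcript {
  speakerId: string;
  text: string;
}

type GateDecision =
  | { forward: true; prompt: string }
  | { forward: false; reason: string };

// Forward a transcript only when the speaker is explicitly allowlisted.
function gateTranscript(t: Transcript, allowedUsers: ReadonlyArray<string>): GateDecision {
  if (allowedUsers.indexOf(t.speakerId) === -1) {
    return { forward: false, reason: "speaker not in allowedUsers" };
  }
  const text = t.text.trim();
  if (text.length === 0) {
    return { forward: false, reason: "empty transcript" };
  }
  // Present the speech as quoted data rather than as instructions to the agent.
  const prompt = "Untrusted voice transcript from user " + t.speakerId + ":\n" + text;
  return { forward: true, prompt: prompt };
}
```

Because an empty allowlist matches no speaker, this gate fails closed: nothing is forwarded until at least one user is listed.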

Context-Inappropriate Capability

Medium
Confidence
79% confidence
Finding
The WyomingWhisperSTT provider allows outbound raw TCP connections to an arbitrary configurable host and port, which expands the skill's network reach beyond normal Discord API usage. In an agent/skill environment, this can be abused for data exfiltration, internal network access, or SSRF-style pivoting if untrusted configuration can control the destination.
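
A common mitigation for a configurable destination is to check the host and port against a fixed allowlist before opening the connection. The following sketch assumes such a check could run before the provider connects; `Endpoint`, `ALLOWED_ENDPOINTS`, and `isAllowedEndpoint` are hypothetical names, and the address and port shown are example values rather than settings taken from the skill.

```typescript
interface Endpoint {
  host: string;
  port: number;
}

// Example allowlist (assumption): a single STT server on the local machine.
const ALLOWED_ENDPOINTS: ReadonlyArray<Endpoint> = [{ host: "127.0.0.1", port: 10300 }];

// Accept only exact host and port matches; anything else is rejected.
function isAllowedEndpoint(
  candidate: Endpoint,
  allowlist: ReadonlyArray<Endpoint> = ALLOWED_ENDPOINTS
): boolean {
  const host = candidate.host.trim().toLowerCase();
  const port = candidate.port;
  const validPort =
    typeof port === "number" &&
    isFinite(port) &&
    Math.floor(port) === port &&
    port >= 1 &&
    port <= 65535;
  if (!validPort || host.length === 0) {
    return false;
  }
  return allowlist.some(function (entry) {
    return entry.host.toLowerCase() === host && entry.port === port;
  });
}
```

Exact matching is deliberately strict. It does not resolve hostnames, so a name that points at an allowlisted address is still rejected unless the name itself is listed.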

Missing User Warnings

Medium
Confidence
85% confidence
Finding
The README clearly describes capturing users' voice in Discord channels, transcribing it, sending it to external STT providers, processing it through an agent, and speaking back results, but it does not prominently warn about consent, privacy, retention, or third-party data handling. In a multi-user voice environment, this can lead to unconsented collection and transfer of participants' speech to external services, creating privacy and compliance risk.

Missing User Warnings

Medium
Confidence
95% confidence
Finding
This skill processes live voice conversations, transcribes speech, sends text/audio to an agent, and may use third-party STT/TTS providers, but the user-facing description does not prominently warn that spoken audio may be recorded and transmitted off-device. In a voice-chat context, lack of explicit disclosure can lead to uninformed collection of sensitive conversations and privacy violations for both users and bystanders in the channel.

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The manifest explicitly supports multiple cloud STT/TTS providers and requests API keys, which strongly implies user voice data may be sent off-host to third-party services. Failing to disclose that audio/transcript content may leave the local environment creates a real privacy and consent issue, especially for a real-time Discord voice skill where users may not expect external processing.

Missing User Warnings

Medium
Confidence
91% confidence
Finding
This code captures Discord voice audio and sends it to configurable third-party STT providers such as Deepgram, OpenAI, or others, but this file contains no user-facing disclosure, consent flow, or gating before transmission. In a real-time voice skill, that means participants may have their speech exported off-platform without clear notice, creating privacy, compliance, and trust risks even if the behavior is functionally intended.
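
The missing gating step could take the form of a per-channel consent record that is consulted before any audio leaves the host. This is a sketch under stated assumptions, not the skill's implementation: `ConsentRegistry` and its methods are hypothetical, and how consent is collected (for example, a command or a posted notice) is left open.

```typescript
// Hypothetical in-memory consent record, keyed by channel and user.
class ConsentRegistry {
  private granted: string[] = [];

  private key(channelId: string, userId: string): string {
    return channelId + ":" + userId;
  }

  // Record that a user agreed to transcription in a given channel.
  grant(channelId: string, userId: string): void {
    if (!this.hasConsent(channelId, userId)) {
      this.granted.push(this.key(channelId, userId));
    }
  }

  // Withdraw a previously recorded consent.
  revoke(channelId: string, userId: string): void {
    const k = this.key(channelId, userId);
    this.granted = this.granted.filter(function (entry) {
      return entry !== k;
    });
  }

  // Audio should be sent to an STT provider only when this returns true.
  hasConsent(channelId: string, userId: string): boolean {
    return this.granted.indexOf(this.key(channelId, userId)) !== -1;
  }
}
```

Keeping consent per channel means agreeing in one voice channel does not silently extend to another.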

Missing User Warnings

Low
Confidence
84% confidence
Finding
The skill sends transcribed or generated text to external TTS providers without any explicit warning in this component that content will be processed by third parties. While less sensitive than raw voice in many cases, responses may still include personal, confidential, or server-specific information, so silent export to outside services creates avoidable privacy exposure.

External Transmission

Medium
Category
Data Exfiltration
Content
async synthesize(text: string): Promise<TTSResult> {
    // TTS via SkillBoss API Hub /v1/pilot — auto-routes to best TTS model
    const response = await fetch("https://api.heybossai.com/v1/pilot", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
Confidence
91% confidence
Finding
fetch("https://api.heybossai.com/v1/pilot", { method: "POST"

External Transmission

Medium
Category
Data Exfiltration
Content
async synthesize(text: string): Promise<TTSResult> {
    // TTS via SkillBoss API Hub /v1/pilot — auto-routes to best TTS model
    const response = await fetch("https://api.heybossai.com/v1/pilot", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
Confidence
91% confidence
Finding
fetch("https://api.heybossai.com/v1/pilot", { method: "POST"

External Transmission

Medium
Category
Data Exfiltration
Content
async synthesize(text: string): Promise<TTSResult> {
    // TTS via SkillBoss API Hub /v1/pilot — auto-routes to best TTS model
    const response = await fetch("https://api.heybossai.com/v1/pilot", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
Confidence
91% confidence
Finding
fetch("https://api.heybossai.com/v1/pilot", { method: "POST"

External Transmission

Medium
Category
Data Exfiltration
Content
async synthesize(text: string): Promise<TTSResult> {
    // TTS via SkillBoss API Hub /v1/pilot — auto-routes to best TTS model
    const response = await fetch("https://api.heybossai.com/v1/pilot", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
Confidence
91% confidence
Finding
https://api.heybossai.com/

External Transmission

Medium
Category
Data Exfiltration
Content
async synthesize(text: string): Promise<TTSResult> {
    // TTS via SkillBoss API Hub /v1/pilot — auto-routes to best TTS model
    const response = await fetch("https://api.heybossai.com/v1/pilot", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
Confidence
91% confidence
Finding
https://api.heybossai.com/

External Transmission

Medium
Category
Data Exfiltration
Content
async synthesize(text: string): Promise<TTSResult> {
    // TTS via SkillBoss API Hub /v1/pilot — auto-routes to best TTS model
    const response = await fetch("https://api.heybossai.com/v1/pilot", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
Confidence
91% confidence
Finding
https://api.heybossai.com/

VirusTotal

61/61 vendors flagged this skill as clean.

Static analysis

Detected: suspicious.env_credential_access, suspicious.exposed_secret_literal, suspicious.insecure_tls_verification

Environment variable access combined with network send.

Critical
Code
suspicious.env_credential_access
Location
index.ts:156

Environment variable access combined with network send.

Critical
Code
suspicious.env_credential_access
Location
src/streaming-tts.ts:45

Environment variable access combined with network send.

Critical
Code
suspicious.env_credential_access
Location
src/stt.ts:37

Environment variable access combined with network send.

Critical
Code
suspicious.env_credential_access
Location
src/tts.ts:48

File appears to expose a hardcoded API secret or token.

Critical
Code
suspicious.exposed_secret_literal
Location
src/config.ts:269

File appears to expose a hardcoded API secret or token.

Critical
Code
suspicious.exposed_secret_literal
Location
src/streaming-tts.ts:110

File appears to expose a hardcoded API secret or token.

Critical
Code
suspicious.exposed_secret_literal
Location
src/tts.ts:96
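
If any of the flagged literals turn out to be real credentials, the usual remedy is to load them from the environment at startup and fail when they are missing. The audit does not show the flagged lines, so the sketch below is generic: `SecretEnv`, `requireSecret`, `redactSecret`, and the variable name in the usage note are hypothetical.

```typescript
// Minimal environment shape, so the helper works with process.env or a test double.
type SecretEnv = { [name: string]: string | undefined };

// Return the named secret, or throw so the process fails fast instead of
// falling back to a value embedded in source code.
function requireSecret(env: SecretEnv, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim().length === 0) {
    throw new Error("Missing required secret: " + name);
  }
  return value;
}

// Show at most the last four characters when a secret must appear in logs.
function redactSecret(secret: string): string {
  return secret.length <= 4 ? "****" : "****" + secret.slice(-4);
}
```

In a Node.js process this could be called as `requireSecret(process.env, "VOICE_TTS_API_KEY")`, where the variable name is an example rather than one the skill is known to use.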

HTTPS certificate verification is disabled.

Warn
Code
suspicious.insecure_tls_verification
Location
index.ts:158
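
The audit does not state how verification is disabled at this location. In Node.js the two usual mechanisms are the `NODE_TLS_REJECT_UNAUTHORIZED=0` environment variable and the `rejectUnauthorized: false` connection option, so a startup check for both is one possible safeguard. The helper below is a hedged sketch; `TlsEnv`, `TlsOptionsLike`, and `findTlsBypasses` are hypothetical names.

```typescript
type TlsEnv = { [name: string]: string | undefined };

// Subset of the Node.js TLS/HTTPS request options relevant to this check.
interface TlsOptionsLike {
  rejectUnauthorized?: boolean;
}

// Report every setting that would turn off certificate verification.
function findTlsBypasses(env: TlsEnv, options: TlsOptionsLike): string[] {
  const problems: string[] = [];
  if (env["NODE_TLS_REJECT_UNAUTHORIZED"] === "0") {
    problems.push("NODE_TLS_REJECT_UNAUTHORIZED=0 disables verification process-wide");
  }
  if (options.rejectUnauthorized === false) {
    problems.push("rejectUnauthorized: false disables verification for this connection");
  }
  return problems;
}
```

An empty result means neither bypass is present; it does not prove that certificates are validated by every code path.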