Security audit

Kai Realtime Voice

Security checks across malware telemetry and agentic risk

Overview

This skill is a small MiniMax text-to-speech helper whose description inaccurately claims real-time WebSocket streaming. Its network use, API key use, and file output are nonetheless consistent with its stated voice-generation purpose.

Install only if you are comfortable sending the text you provide to MiniMax under your API key. Treat this as a REST-based MP3 generation helper, not true real-time WebSocket streaming, and avoid using it for secrets or sensitive private content unless MiniMax's data handling is acceptable for that material.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (6)

Lp3

Medium
Category
MCP Least Privilege
Confidence
90% confidence
Finding
The skill exposes shell execution capability through documented bash invocations but does not declare corresponding permissions. Undeclared execution capability reduces transparency and can bypass user or platform expectations about what the skill is allowed to do, increasing the risk of unintended command execution or misuse of inherited environment secrets.

Tp4

High
Category
MCP Tool Poisoning
Confidence
95% confidence
Finding
The skill advertises real-time WebSocket voice streaming, but the detected behavior indicates a materially different implementation using a REST endpoint, local file output, and no bidirectional streaming. This mismatch is security-relevant because users may grant trust, provide sensitive audio/text, or make deployment decisions based on false assumptions about data flow, latency, storage, and interaction model.

Missing User Warnings

Low
Confidence
84% confidence
Finding
The skill omits a clear warning that generated audio may be written to a local file or streamed to stdout. In a voice-processing context, undisclosed output handling can expose sensitive synthesized content through logs, pipes, shared terminals, or workspace files, especially in multi-user or automated environments.
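
One way to address this finding is sketched below. It assumes a hypothetical helper, save_audio_private, which is not part of the audited skill: it reads audio bytes on stdin, stores them in a file only the current user can read, and announces the location on stderr rather than writing silently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: save_audio_private is illustrative, not part of the skill.
# Reads audio bytes on stdin, writes them to a new owner-only file,
# and prints the file path on stdout.
save_audio_private() {
    local out
    # mktemp creates a uniquely named file with mode 0600 (owner read/write).
    out=$(mktemp "${TMPDIR:-/tmp}/tts-XXXXXX") || return 1
    # Copy stdin into the file; clean up if the write fails.
    cat > "$out" || { rm -f "$out"; return 1; }
    # Tell the user where the audio went instead of writing silently.
    printf 'Audio saved to %s (owner-only permissions)\n' "$out" >&2
    printf '%s\n' "$out"
}
```

Keeping the notice on stderr leaves stdout free for the path, so a caller can capture the path without the warning being lost in a pipe.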

Missing User Warnings

Medium
Confidence
92% confidence
Finding
User-supplied text is transmitted to an external third-party API without explicit warning at the point of use. In a voice skill, users may provide sensitive or regulated content, so silent transmission to a remote service creates a real privacy and data-handling risk even if the destination is the intended provider.
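
A point-of-use warning could look roughly like the sketch below. The function name confirm_send and the prompt wording are assumptions made for illustration; only the destination host is taken from the audited snippet. The prompt reports the text length rather than the text itself so that the content does not end up in terminal logs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: confirm_send is illustrative, not part of the skill.
# Asks for explicit consent before text leaves the machine.
# Returns 0 only if the user answers y or Y.
confirm_send() {
    local text="$1"
    local reply=""
    # Show the destination and size, not the text, to keep it out of logs.
    printf 'About to send %d characters to api.minimax.io. Continue? [y/N] ' "${#text}" >&2
    # Read one line from stdin; EOF leaves reply empty, which counts as "no".
    IFS= read -r reply || true
    case "$reply" in
        y|Y) return 0 ;;
        *)   return 1 ;;
    esac
}

# Usage sketch: guard the existing request behind the prompt.
# if confirm_send "$TEXT"; then
#     ... existing curl call ...
# fi
```

Defaulting to "no" on empty input means a non-interactive run (for example, under automation with no stdin) does not transmit anything unless consent is supplied explicitly.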

External Transmission

Medium
Category
Data Exfiltration
Content
# For real streaming, would use WebSocket here
        # For now, fallback to REST API
        RESP=$(curl -s -X POST "https://api.minimax.io/v1/t2a_v2" \
            -H "Authorization: Bearer ${API_KEY}" \
            -H "Content-Type: application/json" \
            -d "{\"model\":\"speech-02-turbo\",\"text\":\"$TEXT\",\"stream\":false,\"output_format\":\"hex\",\"voice_setting\":{\"voice_id\":\"$VOICE_ID\",\"speed\":1},\"audio_setting\":{\"sample_rate\":32000}}")
Confidence
94% confidence
Finding
curl -s -X POST "https://api.minimax.io/v1/t2a_v2" \ -H "Authorization: Bearer ${API_KEY}" \ -H "Content-Type: application/json" \ -d

External Transmission

Medium
Category
Data Exfiltration
Content
# For real streaming, would use WebSocket here
        # For now, fallback to REST API
        RESP=$(curl -s -X POST "https://api.minimax.io/v1/t2a_v2" \
            -H "Authorization: Bearer ${API_KEY}" \
            -H "Content-Type: application/json" \
            -d "{\"model\":\"speech-02-turbo\",\"text\":\"$TEXT\",\"stream\":false,\"output_format\":\"hex\",\"voice_setting\":{\"voice_id\":\"$VOICE_ID\",\"speed\":1},\"audio_setting\":{\"sample_rate\":32000}}")
Confidence
94% confidence
Finding
https://api.minimax.io/

VirusTotal

66/66 vendors flagged this skill as clean.