MiniMax TTS

Security checks across malware telemetry and agentic risk

Overview

This skill is internally inconsistent: it advertises Zhipu web search for current information, but its executable script sends user text to MiniMax text-to-speech instead.

Review before installing. Do not provide a production API key or route sensitive prompts through this skill until the publisher makes the name, documentation, examples, environment variables, provider, endpoint, and script behavior consistent and clearly states which service receives user text.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (8)

Description-Behavior Mismatch

Severity: Medium
Confidence: 87%
Finding
The manifest names the skill `minimax-tts`, but the body documents a Zhipu web search integration. Identity mismatches can cause operators or higher-level agents to invoke the skill under false assumptions, leading to unintended external data transmission and weakened review of what the skill actually does.

Intent-Code Divergence

Severity: Medium
Confidence: 93%
Finding
The document asserts security controls such as JSON escaping, input validation, TLS enforcement, timeout, and safe error handling, but the visible example does not implement those protections. Security claims without implementation create a false sense of assurance and can cause users or systems to trust an unsafe invocation pattern.
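
For comparison, the controls this finding lists could look roughly like the following in a script. This is a minimal sketch, not the skill's code: `build_request`, `send`, `MAX_TEXT_CHARS`, and the `{"text": ...}` payload shape are hypothetical names chosen for illustration, and the endpoint is whatever the caller supplies.

```python
import json
import urllib.request
from urllib.parse import urlparse

MAX_TEXT_CHARS = 2000  # hypothetical limit, not taken from the skill


def build_request(text: str, endpoint: str, api_key: str) -> urllib.request.Request:
    """Validate input and build a JSON POST request without sending it."""
    # Input validation: reject empty or oversized text before any network use.
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text must be a non-empty string")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError("text exceeds the configured length limit")
    # TLS enforcement: refuse plain-http endpoints.
    if urlparse(endpoint).scheme != "https":
        raise ValueError("endpoint must use https")
    # JSON escaping: json.dumps escapes quotes, backslashes, and control
    # characters, unlike string interpolation into a hand-written body.
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout_s: float = 10.0) -> dict:
    """Send with a timeout; report failures without echoing the key or body."""
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError) as exc:
        # Safe error handling: surface only the error type, not request details.
        raise RuntimeError(f"request failed: {type(exc).__name__}") from None
```

A reviewer can check each claimed control against a specific line of this kind; a claim with no corresponding line in the script is the divergence this finding describes.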

Intent-Code Divergence

Severity: Medium
Confidence: 88%
Finding
The skill documentation is internally contradictory: metadata and headings refer to both web search and TTS, while the example invokes a chat completions endpoint with a `web_tts` tool. This ambiguity can cause the agent or operator to invoke the skill in the wrong context, leading to unintended external requests and mishandling of user data or expectations.

Description-Behavior Mismatch

Severity: High
Confidence: 99%
Finding
The skill metadata says this capability performs Zhipu web search, but the implementation actually sends user input to MiniMax's text-to-audio API and returns an audio URL. This kind of capability mismatch is dangerous because it defeats user and platform trust boundaries: text intended for search can be silently exfiltrated to an unrelated third party and the skill behaves materially differently than advertised.
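
A mismatch of this kind can often be spotted mechanically by comparing the hosts a script contacts with the hosts the manifest declares. The sketch below illustrates that idea only; it is not how SkillSpector works. `undeclared_hosts` and the `.example` hostnames are hypothetical, and a simple regex will miss URLs that are assembled at runtime.

```python
import re
from urllib.parse import urlparse

# Matches literal http(s) URLs in source text; stops at whitespace and quotes.
URL_PATTERN = re.compile(r"https?://[^\s'\"<>)]+")


def undeclared_hosts(script_source: str, declared_hosts: set[str]) -> set[str]:
    """Return hosts referenced in the script that the manifest does not declare."""
    found = set()
    for url in URL_PATTERN.findall(script_source):
        host = urlparse(url).hostname
        if host:
            found.add(host.lower())
    return found - {h.lower() for h in declared_hosts}


# Hypothetical example: the manifest declares a search provider,
# but the script posts to a different vendor's speech endpoint.
script = 'curl -X POST "https://tts.vendor-b.example/v1/audio" -d "$BODY"'
declared = {"search.vendor-a.example"}
print(undeclared_hosts(script, declared))
```

Any non-empty result means user text can reach a service the documentation never names, which is the trust-boundary problem described above.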

Intent-Code Divergence

Severity: Medium
Confidence: 95%
Finding
The in-file comment explicitly identifies the script as a MiniMax TTS generator, directly contradicting the manifest's claimed web-search behavior. In context, this strengthens evidence of intentional deception or at minimum serious misrepresentation, which can cause users and orchestrators to route sensitive search queries into an unrelated external service.

Vague Triggers

Severity: Medium
Confidence: 84%
Finding
Broad trigger phrases like 'search for' or 'find information about' can cause over-invocation of the skill for ordinary conversational requests. In this context, that increases the chance of unnecessary external transmission of user prompts to a third-party API and may bypass more appropriate local handling.

Vague Triggers

Severity: Medium
Confidence: 84%
Finding
The trigger phrases are overly broad and map common language like requests for latest news or finding information to this skill, even though the skill is described as TTS-oriented. Overbroad invocation criteria increase the chance of accidental activation, causing user prompts or sensitive context to be sent to an external service without clear necessity.
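
One rough way to catch triggers that do not match a skill's stated capability is to require each phrase to contain at least one capability term. The sketch below assumes a TTS-oriented skill; `off_topic_triggers` and the term list are hypothetical, and a keyword test like this is only a first-pass filter.

```python
def off_topic_triggers(triggers: list[str], capability_terms: set[str]) -> list[str]:
    """Return trigger phrases that contain none of the capability terms."""
    flagged = []
    for phrase in triggers:
        words = set(phrase.lower().split())
        if not words & capability_terms:
            flagged.append(phrase)
    return flagged


# Hypothetical capability vocabulary for a text-to-speech skill.
tts_terms = {"speech", "aloud", "audio", "voice", "tts", "speak"}
triggers = ["search for", "find information about", "read this aloud"]
print(off_topic_triggers(triggers, tts_terms))
# prints ['search for', 'find information about']
```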

Missing User Warnings

Severity: Medium
Confidence: 90%
Finding
The script transmits raw user-provided text to an external TTS provider without any visible disclosure, consent, or data-minimization step. In this skill's context, that is more dangerous because users believe they are invoking a web-search capability, not sending arbitrary prompts or potentially sensitive text to a speech-generation vendor.
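
A disclosure and consent step of the kind this finding says is missing could be as small as the sketch below. It is an illustration under assumptions, not the publisher's code: `confirm_transmission` is a hypothetical helper, and the `ask` parameter exists so the prompt can be replaced in non-interactive settings.

```python
def confirm_transmission(text: str, provider: str, ask=input) -> bool:
    """Name the receiving service and require an explicit yes before sending."""
    # Data minimization for the prompt itself: show a short preview only.
    preview = text if len(text) <= 80 else text[:77] + "..."
    answer = ask(
        f"Send {len(text)} characters to {provider}? "
        f"Preview: {preview!r} [y/N] "
    )
    # Anything other than an explicit yes is treated as a refusal.
    return answer.strip().lower() in {"y", "yes"}
```

A caller would invoke this before building the request and skip the network call entirely when it returns `False`.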

VirusTotal

64 of 64 vendors reported this skill as clean.