Alicloud Ai Audio Tts

Security checks across malware telemetry and agentic risk

Overview

This is a mostly coherent Alibaba Cloud text-to-speech skill, but its helper script can send the API key to a request-supplied endpoint and then download an unvalidated returned URL.

Review before installing if you will use real DashScope credentials. Only use official DashScope endpoints, do not pass untrusted request JSON containing `base_url`, keep outputs out of shared directories, and avoid sending or storing sensitive text or audio unless your Alibaba Cloud data-handling requirements allow it.
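
The advice to use only official DashScope endpoints can also be enforced in code rather than left to the caller. The following is a minimal sketch, not the skill's actual implementation: `resolve_base_url` and `ALLOWED_API_HOSTS` are hypothetical names, and the two hostnames are assumptions that should be confirmed against current Alibaba Cloud documentation before use.

```python
from urllib.parse import urlsplit

# Assumed allowlist of official API hosts; verify before relying on it.
ALLOWED_API_HOSTS = frozenset({
    "dashscope.aliyuncs.com",
    "dashscope-intl.aliyuncs.com",
})


def resolve_base_url(request: dict, default: str = "https://dashscope.aliyuncs.com") -> str:
    """Return a base URL that may receive the API key, or raise ValueError."""
    candidate = request.get("base_url") or default
    parts = urlsplit(candidate)
    if parts.scheme != "https":
        raise ValueError(f"refusing non-HTTPS endpoint: {candidate!r}")
    # Compare the parsed hostname, not a substring of the raw URL, so that
    # "https://dashscope.aliyuncs.com@evil.example/" is rejected.
    if parts.hostname not in ALLOWED_API_HOSTS:
        raise ValueError(f"refusing endpoint outside the allowlist: {candidate!r}")
    return candidate
```

Calling this before any request is built means a `base_url` smuggled into untrusted request JSON fails loudly instead of receiving the credential.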

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (4)

Tainted flow: 'audio_url' from os.getenv (line 109, credential/environment) → urllib.request.urlopen (network output)

Critical

Category: Data Flow
Content:
    def download_audio(audio_url: str, output_path: Path) -> None:
        output_path.parent.mkdir(parents=True, exist_ok=True)
        with urllib.request.urlopen(audio_url) as response:
            output_path.write_bytes(response.read())
Confidence: 92%
Finding: with urllib.request.urlopen(audio_url) as response:
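
One way to narrow this flow is to validate the returned URL and cap the download size before anything is written to disk. The sketch below is an illustration under stated assumptions, not the skill's code: `validate_audio_url`, `ALLOWED_AUDIO_HOST_SUFFIX`, and `MAX_AUDIO_BYTES` are hypothetical, and the host suffix is a guess at where generated audio is served from that must be checked against real responses.

```python
import urllib.request
from pathlib import Path
from urllib.parse import urlsplit

ALLOWED_AUDIO_HOST_SUFFIX = ".aliyuncs.com"  # assumption, verify against real URLs
MAX_AUDIO_BYTES = 50 * 1024 * 1024           # assumption, arbitrary 50 MiB cap


def validate_audio_url(audio_url: str) -> str:
    """Return the URL unchanged if it is HTTPS on an expected host, else raise."""
    parts = urlsplit(audio_url)
    host = parts.hostname or ""
    if parts.scheme != "https":
        raise ValueError(f"refusing non-HTTPS audio URL: {audio_url!r}")
    if not host.endswith(ALLOWED_AUDIO_HOST_SUFFIX):
        raise ValueError(f"refusing audio URL on unexpected host: {host!r}")
    return audio_url


def download_audio(audio_url: str, output_path: Path) -> None:
    """Download validated audio, aborting if the body exceeds MAX_AUDIO_BYTES."""
    validate_audio_url(audio_url)
    chunks, total = [], 0
    with urllib.request.urlopen(audio_url, timeout=30) as response:
        while True:
            chunk = response.read(64 * 1024)
            if not chunk:
                break
            total += len(chunk)
            if total > MAX_AUDIO_BYTES:
                raise ValueError("audio response exceeds the size limit")
            chunks.append(chunk)
    # Write only after the whole body passed the size check.
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_bytes(b"".join(chunks))
```

Note that `urllib.request.urlopen` follows redirects by default, so this check covers only the initial URL; a stricter version would also inspect the final URL or disable redirects.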

Lp3

Medium

Category: MCP Least Privilege
Confidence: 93%
Finding: The skill clearly instructs use of environment variables, local file reads/writes, and outbound network access, yet it does not declare any permissions or trust boundaries. This can mislead operators and policy engines about what the skill actually does, reducing review visibility and increasing the chance of overbroad or unmonitored execution.

Missing User Warnings

Medium

Confidence: 90%
Finding: The documentation directs users to save generated audio links, sample audio, and request payloads, but gives no warning that TTS inputs may contain sensitive text and outputs may embed personal, proprietary, or regulated content. Persisting these artifacts without minimization or protection can create unintended data retention and disclosure risks.
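
If such artifacts must be kept at all, restricting who can read them reduces the disclosure risk described here. A minimal sketch, assuming a POSIX filesystem; `write_private` is a hypothetical helper, not part of the skill.

```python
import os
from pathlib import Path


def write_private(path: Path, data: bytes) -> None:
    """Write data so that a newly created file is readable by its owner only."""
    # The directory mode applies only to directories created by this call.
    path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    # The 0o600 mode applies only when the file is created; an existing file
    # keeps its current permissions. On Windows the mode is largely ignored.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
```

This does not address retention: deleting request payloads and audio once they are no longer needed is still the simpler safeguard.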

Missing User Warnings

Low

Confidence: 82%
Finding: The skill explains where to place the API key, but does not warn against hardcoding, logging, committing credentials, or using weak filesystem permissions on credential files. While common setup guidance is not inherently unsafe, omitting basic secret-handling precautions increases the chance of accidental credential exposure.
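
The precautions this finding asks for can be partly built into the helper itself. The sketch below reads the key from the environment instead of hardcoding it and masks it before it reaches any log line. The variable name `DASHSCOPE_API_KEY` is an assumption about how the skill is configured, and `load_api_key` and `redact` are hypothetical helpers.

```python
import os
from typing import Mapping

API_KEY_ENV_VAR = "DASHSCOPE_API_KEY"  # assumed variable name


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Read the API key from the environment, failing early if it is missing."""
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"{API_KEY_ENV_VAR} is not set")
    return key


def redact(secret: str, visible: int = 4) -> str:
    """Return a log-safe form that shows at most the last few characters."""
    if len(secret) <= visible:
        return "****"
    return "****" + secret[-visible:]
```

Logging `redact(key)` rather than the key itself keeps enough of the value to tell credentials apart without exposing it.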

VirusTotal

64/64 vendors flagged this skill as clean.
