Alicloud Ai Audio Tts Realtime

Security checks across malware telemetry and agentic risk

Overview

This is a coherent Alibaba Cloud text-to-speech helper, with expected cloud API and credential use plus some documentation and input-handling caveats.

Install this only if you intend to use Alibaba Cloud DashScope TTS. Use a dedicated API key where possible, keep the default Alibaba endpoint unless you have verified an alternative, avoid sending confidential or regulated text unless your Alibaba Cloud terms and policies allow it, and review output paths before running fallback audio generation.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (5)

Tainted flow: 'audio_url' from os.getenv (line 186, credential/environment) → urllib.request.urlopen (network output)

Critical
Category
Data Flow
Content
def _download_audio(audio_url: str, output_path: Path) -> None:
    output_path.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(audio_url) as response:
        output_path.write_bytes(response.read())
Confidence
88% confidence
Finding
with urllib.request.urlopen(audio_url) as response:
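
A common mitigation for this kind of flow is to validate the URL before fetching it. The sketch below is not taken from the scanned skill: `is_trusted_audio_url` and `ALLOWED_HOST_SUFFIXES` are hypothetical names, and the assumption that generated audio is served from a host under `aliyuncs.com` should be checked against the actual service responses.

```python
from urllib.parse import urlparse

# Assumption: audio files are served from Alibaba Cloud hosts under this suffix.
# Adjust the allowlist to whatever hosts the service really returns.
ALLOWED_HOST_SUFFIXES = (".aliyuncs.com",)


def is_trusted_audio_url(audio_url: str, allowed_suffixes=ALLOWED_HOST_SUFFIXES) -> bool:
    """Return True only for HTTPS URLs whose host ends with an allowed suffix."""
    parsed = urlparse(audio_url)
    # Reject file://, ftp://, plain http:// and anything else that is not HTTPS.
    if parsed.scheme != "https":
        return False
    host = (parsed.hostname or "").lower()
    # The leading dot in each suffix prevents matches such as "notaliyuncs.com".
    return any(host.endswith(suffix) for suffix in allowed_suffixes)
```

A caller could run this check at the top of `_download_audio` and raise an error when it returns False, so an environment-supplied value cannot redirect the download to an arbitrary scheme or host.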

Lp3

Medium
Category
MCP Least Privilege
Confidence
96% confidence
Finding
The skill clearly requires environment access for credentials, filesystem access for reading references and writing outputs, and network access to call Alibaba Cloud services, yet no explicit permissions are declared. This creates a transparency and governance gap: callers may authorize or execute the skill without understanding its actual capability surface, which increases the risk of unintended data access, outbound transmission, and policy bypass in agent environments.
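
To illustrate the gap, the sketch below compares the capabilities a skill declares with the ones this finding says it uses. The capability strings and the `undeclared_capabilities` helper are hypothetical; they do not correspond to any specific manifest format.

```python
# Hypothetical capability labels derived from the finding text:
# environment access, filesystem read and write, and outbound network access.
REQUIRED_CAPABILITIES = {"env:read", "fs:read", "fs:write", "network:outbound"}


def undeclared_capabilities(declared, required=REQUIRED_CAPABILITIES):
    """Return the required capabilities that are missing from the declaration."""
    return sorted(set(required) - set(declared))


# A skill that declares nothing leaves every required capability undeclared.
missing = undeclared_capabilities([])
```

An empty result would mean the declaration covers everything the skill is observed to need; any remaining entries are capabilities a caller would be granting without being told.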

Description-Behavior Mismatch

Medium
Confidence
89% confidence
Finding
The workflow instructs the agent to confirm the region, identifiers, and mutability, to run read-only queries, and to execute bounded cloud operations. This is generic cloud-operations guidance unrelated to a narrow TTS synthesis skill. The mismatch can cause an agent to overgeneralize the skill's authority and perform broader provider-side actions than intended, especially in autonomous settings where documentation is used as operational instruction.

Intent-Code Divergence

Medium
Confidence
91% confidence
Finding
Referring to operations as 'read-only or mutating' contradicts the documented interface, which only describes realtime text-to-speech synthesis. That inconsistency can mislead orchestration systems or users into treating the skill as capable of broader state-changing actions, increasing the chance of misuse or unsafe delegation beyond the intended TTS scope.

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The skill instructs users to configure credentials and use a networked Alibaba Cloud TTS service but does not clearly disclose that input text, optional instructions, voice selections, and related metadata may be transmitted to a third-party cloud provider. In contexts where prompts contain sensitive or regulated information, this omission can lead to unintentional data exposure and noncompliant processing.
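
One way to close this gap is to show the user what will leave the machine before any request is made. The following is a minimal sketch under assumptions: `describe_outbound_payload` and `require_acknowledgement` are hypothetical helpers, not functions of the scanned skill or of the DashScope SDK.

```python
from typing import Optional


def describe_outbound_payload(text: str, voice: str, instructions: Optional[str] = None) -> str:
    """Summarize which fields would be transmitted to the third-party service."""
    fields = [
        f"input text ({len(text)} characters)",
        f"voice selection ({voice})",
    ]
    if instructions:
        fields.append(f"optional instructions ({len(instructions)} characters)")
    return "The following will be sent to Alibaba Cloud DashScope: " + "; ".join(fields)


def require_acknowledgement(acknowledged: bool) -> None:
    """Refuse to continue unless the user has explicitly accepted the notice."""
    if not acknowledged:
        raise PermissionError("Transmission to the cloud provider was not acknowledged.")
```

Printing the summary and calling `require_acknowledgement` before synthesis gives users a chance to stop when the text contains sensitive or regulated information.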

VirusTotal

66/66 vendors flagged this skill as clean.
