Security audit

qwen-audio-lab

Security checks across malware telemetry and agentic risk

Overview

This skill does what it claims: it creates speech and voice-cloning outputs using local macOS tools and Aliyun Qwen, with sensitive behavior mostly disclosed.

Install only if you are comfortable using Aliyun DashScope for cloud TTS and voice cloning. Use mac-say for local-only speech, avoid cloning voices without consent, restrict the scope of the DashScope API key where possible, and review or delete remembered voices if you do not want reusable voice state kept locally.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Findings (3)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
        else:
            concat = per_slide_dir / f"slide-{slide_no:02d}-concat.txt"
            concat.write_text("".join([f"file '{f.as_posix()}'\n" for f in chunk_files]), encoding="utf-8")
            subprocess.run(
                ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", str(concat), "-c", "copy", str(final)],
                check=True,
                stdout=subprocess.DEVNULL,
Confidence
88% confidence
Finding
subprocess.run( ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", str(concat), "-c", "copy", str(final)], check=True, stdout=subprocess.D
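
The excerpt above writes each chunk path into an ffmpeg concat list inside single quotes and runs ffmpeg with "-safe 0". A path that itself contains a single quote would break that list. The following is a minimal sketch of one way to escape such paths, not the skill's actual code; the helper names concat_line and write_concat_list are hypothetical.

```python
from pathlib import Path
from typing import Iterable


def concat_line(path: Path) -> str:
    # Inside single quotes ffmpeg treats every character literally, so a
    # literal single quote is written by closing the quote, emitting an
    # escaped quote, and reopening: '\''
    escaped = path.as_posix().replace("'", "'\\''")
    return f"file '{escaped}'\n"


def write_concat_list(list_path: Path, chunk_files: Iterable[Path]) -> None:
    # Hypothetical helper: writes one `file` directive per chunk, UTF-8 encoded.
    list_path.write_text(
        "".join(concat_line(f) for f in chunk_files), encoding="utf-8"
    )
```

Because the command is passed to subprocess.run as an argument list rather than a shell string, the remaining exposure in this sketch is limited to the contents of the concat list itself.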

Missing User Warnings

Medium
Confidence
90% confidence
Finding
The clone-voice flow base64-encodes a local reference recording and transmits it to Aliyun's remote customization endpoint, but the command itself provides no explicit runtime warning or consent checkpoint. Because voice samples are biometric and potentially highly sensitive, silent upload increases privacy and compliance risk if users invoke the skill without realizing data leaves the machine.
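
A consent checkpoint of the kind this finding asks for could look roughly like the sketch below. It assumes the reference recording is read from a local file and base64-encoded before upload, as the finding describes; the function names and the prompt wording are hypothetical and do not come from the skill.

```python
import base64
import sys
from pathlib import Path
from typing import Callable, Optional


def confirm_voice_upload(sample: Path, ask: Callable[[str], str] = input) -> bool:
    # Tell the user what is about to leave the machine, then require an
    # explicit "yes". Any other answer counts as a refusal.
    print(
        f"About to upload voice sample '{sample.name}' to Aliyun for voice cloning.\n"
        "Voice recordings are biometric data. Continue only with the speaker's consent.",
        file=sys.stderr,
    )
    return ask("Type 'yes' to upload: ").strip().lower() == "yes"


def encode_sample_if_consented(
    sample: Path, ask: Callable[[str], str] = input
) -> Optional[str]:
    # Hypothetical gate: the recording is only read and encoded after consent.
    if not confirm_voice_upload(sample, ask):
        return None
    return base64.b64encode(sample.read_bytes()).decode("ascii")
```

Passing the prompt function as a parameter keeps the gate testable and lets a non-interactive caller supply its own confirmation policy.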

Missing User Warnings

Medium
Confidence
89% confidence
Finding
This command sends user-supplied text to a third-party cloud TTS service without an explicit notice at execution time. In a narration skill, text may contain confidential documents, speaker notes, or proprietary content, so undisclosed remote transfer creates a real data-exposure risk even though it is functionally necessary for the feature.
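
An execution-time notice for remote synthesis might be sketched as follows. The only engine name taken from this report is mac-say, which the overview describes as local-only; treating every other engine as remote, and the example name "qwen-tts", are assumptions made for illustration.

```python
import sys
from typing import Optional, TextIO

# Assumption: mac-say is the only engine that keeps text on the machine.
LOCAL_ENGINES = {"mac-say"}


def remote_notice(engine: str, text: str) -> Optional[str]:
    # Return a notice string when the text would be sent off-device,
    # or None when the engine is local.
    if engine in LOCAL_ENGINES:
        return None
    return (
        f"Notice: sending {len(text)} characters of text to a cloud TTS "
        f"service ({engine}). Use mac-say to keep text on this machine."
    )


def announce(engine: str, text: str, stream: TextIO = sys.stderr) -> bool:
    # Print the notice, if any, and report whether the call is remote.
    notice = remote_notice(engine, text)
    if notice is not None:
        print(notice, file=stream)
    return notice is not None
```

For example, announce("qwen-tts", "hello") prints a notice mentioning 5 characters and returns True, while announce("mac-say", "hello") prints nothing and returns False.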

VirusTotal

66/66 vendors flagged this skill as clean.