CosyVoice3 macOS

Security checks across malware telemetry and agentic risk

Overview

This skill is a disclosed local text-to-speech setup, but it installs a sizeable external ML stack and should be reviewed before running.

Before installing, read scripts/install.sh and be comfortable with it downloading Miniconda, cloning CosyVoice, installing Python packages, and downloading large model files. Run it only in an environment where you are willing to trust those upstream sources. For voice cloning, use only reference audio you are authorized to use and avoid impersonation or deceptive sharing.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (6)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 89%
Finding: The skill documentation instructs users to execute shell commands and scripts, but the skill does not declare corresponding permissions. This creates a transparency and policy-enforcement gap: an agent or user may treat the skill as lower risk than it really is, despite its ability to run installers, activate environments, and invoke Python from the shell.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 94%
Finding: The stated purpose is local/offline TTS, but the behavior described includes environment bootstrapping, repository setup, and downloading models over the network, including multiple model variants beyond the primary one. This mismatch is dangerous because it expands the trust boundary and attack surface: users may approve a seemingly simple local TTS skill without realizing it performs network retrieval and software installation from external sources.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The installer fetches a remote Miniconda shell script and immediately executes it with bash, creating a classic supply-chain risk. If the download source, transport, or hosted artifact is compromised, arbitrary code will run on the user's machine during installation with the user's privileges.
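
One common mitigation for this kind of risk is to save the installer to disk and compare its SHA-256 digest against a pinned value before running it. The report does not show the installer's code, so the following is only a minimal sketch of that idea, assuming the publisher lists a SHA-256 digest for the exact installer file. The function names `sha256_of` and `is_trusted_installer` are hypothetical and do not come from the skill.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted_installer(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file's digest matches the pinned value.

    expected_sha256 is assumed to be copied from the publisher's own
    release listing, not fetched from the same place as the installer.
    """
    pinned = expected_sha256.strip().lower()
    return hmac.compare_digest(sha256_of(path), pinned)
```

An install script following this pattern would download the file first, call `is_trusted_installer` on it, and execute it only when the check returns True. A digest check guards against a tampered or corrupted download; it does not help if the pinned value itself comes from a compromised source.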

Missing User Warnings

Medium

Confidence: 92%
Finding: The skill prominently enables zero-shot voice cloning from short reference audio but provides no warning about consent, impersonation, privacy, or misuse risks. In this context, the omission matters because voice cloning is inherently sensitive and can facilitate social engineering, fraud, or unauthorized biometric-style voice replication if used on third-party audio.

Missing User Warnings

Medium

Confidence: 95%
Finding: The script silently downloads and runs the Miniconda installer without an explicit warning or user approval, which materially increases the risk of unexpected code execution. Even if the source is legitimate, users are not given a chance to assess trust, review the action, or opt out before a substantial bootstrap step occurs.
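
The missing approval step described in this finding could take the form of a simple confirmation gate that defaults to "no". The sketch below is illustrative only: `confirm_step` is a hypothetical helper, and the prompt wording is an assumption rather than anything taken from the skill.

```python
from typing import Callable


def confirm_step(description: str, ask: Callable[[str], str] = input) -> bool:
    """Describe a pending action and require an explicit yes to continue.

    Any answer other than "y" or "yes" (case-insensitive) declines,
    so pressing Enter on an empty line is treated as a refusal.
    """
    answer = ask(f"{description}\nProceed? [y/N] ")
    return answer.strip().lower() in {"y", "yes"}
```

A bootstrap script could call `confirm_step("About to download and run the Miniconda installer.")` and exit without changes when it returns False. The `ask` parameter exists so the prompt can be replaced in tests or non-interactive runs.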

Missing User Warnings

Medium

Confidence: 89%
Finding: The script performs repository cloning and writes into a hard-coded workspace path under /Users/lhz/.openclaw/workspace, which can unexpectedly modify the filesystem in a user-specific location. This is risky because it assumes a particular username/path layout and can overwrite or interfere with existing contents without warning.
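
A typical way to avoid a user-specific hard-coded path is to derive the workspace from the current user's home directory and to refuse to clone into a directory that already has contents. The following is a minimal sketch under those assumptions. The environment variable `SKILL_WORKSPACE` and both function names are hypothetical; only the `.openclaw/workspace` layout is taken from the path quoted in the finding.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_workspace(env: Optional[Mapping[str, str]] = None) -> Path:
    """Pick a workspace path without assuming a particular username.

    SKILL_WORKSPACE is a hypothetical override variable; when it is
    unset, fall back to ~/.openclaw/workspace for the current user.
    """
    env = os.environ if env is None else env
    override = env.get("SKILL_WORKSPACE")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".openclaw" / "workspace"


def safe_to_clone_into(target: Path) -> bool:
    """A clone target is safe if it is missing or an empty directory."""
    if not target.exists():
        return True
    return target.is_dir() and not any(target.iterdir())
```

With helpers like these, an installer would compute the target once, check `safe_to_clone_into` before cloning, and stop with a message instead of writing over existing contents.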

VirusTotal

66/66 vendors flagged this skill as clean.
