voice-huayan

Security checks across malware telemetry and agentic risk

Overview

This appears to be an incomplete local Chinese TTS skill, but it under-discloses network downloads and package installs and does not include the advertised Windows playback script.

Review this before installing. Treat it as a local TTS/model-preparation package, not a ready Windows playback skill: the documented runner is missing, and the included shell script can install Python packages and download remote model assets. Run it only in a contained environment, verify sources manually, and avoid elevated privileges.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (6)

Lp3

Medium
Category
MCP Least Privilege
Confidence
89% confidence
Finding
The skill advertises a simple local TTS capability, yet the analysis detected shell, environment, and file read/write capabilities without any declared permissions. That creates a trust gap: users and reviewers cannot accurately assess what the skill may access or modify, and hidden execution/file capabilities can be abused for unintended local actions.
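The trust gap in this finding amounts to a set difference between what a skill declares and what analysis observes. The following is a minimal sketch of that comparison, not the scanner's actual logic: the function name and the capability labels are hypothetical, chosen only to mirror the capabilities named in the finding.

```python
def undeclared_capabilities(declared, observed):
    """Return observed capabilities that were never declared, sorted for stable output."""
    return sorted(set(observed) - set(declared))


# Hypothetical labels mirroring the finding: shell, environment, file read/write.
observed = {"shell", "environment", "file_read", "file_write"}
declared = set()  # the finding reports no declared permissions

gap = undeclared_capabilities(declared, observed)
print(gap)
```

An empty result would mean every observed capability was declared. Here all four are reported as undeclared.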

Tp4

High
Category
MCP Tool Poisoning
Confidence
98% confidence
Finding
This is a serious description-behavior mismatch: the skill claims offline Windows TTS playback with a specific voice and fallback, but the analyzed behavior includes network downloads, package installation, archive extraction, model mutation, and support for arbitrary environment-controlled values while not actually implementing the claimed playback path. Such deception is dangerous because it can conceal supply-chain risk, unexpected code execution, and unauthorized system changes behind an innocuous TTS description.

Description-Behavior Mismatch

High
Confidence
95% confidence
Finding
The script fetches model files and related assets from external hosts at runtime based on environment-controlled variables, which expands the skill from local playback into network retrieval and supply-chain exposure. This is dangerous because execution now depends on remote content integrity, network trust, and mutable upstream artifacts, none of which are disclosed by the skill description.
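One common way to reduce dependence on mutable upstream artifacts is to pin a digest for each asset and refuse any file that does not match it. The sketch below illustrates that idea under stated assumptions. The function names are hypothetical, and the expected digest would have to come from a source the reviewer trusts, because the skill itself does not publish one.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_asset(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file's SHA-256 matches the pinned digest."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

A caller would download the asset to a temporary location, call `verify_asset` with the pinned digest, and discard the file on a mismatch rather than loading it.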

Context-Inappropriate Capability

Medium
Confidence
90% confidence
Finding
Installing Python packages at runtime introduces unnecessary code-fetching and execution from package repositories for a skill described as local TTS playback. That creates supply-chain and environment-modification risk, especially if the skill runs with elevated privileges or in a shared host environment.

Context-Inappropriate Capability

Medium
Confidence
93% confidence
Finding
The script installs additional runtime packages for phonemization and ONNX execution just before running the Python entrypoint, broadening the skill's behavior beyond simple audio playback. This is risky because it pulls executable code into the environment on demand and can lead to nondeterministic behavior or compromise through dependency poisoning.
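A reviewer can gauge the nondeterminism risk by checking whether the dependencies are pinned to exact versions. The sketch below is a rough check that flags requirement lines lacking an `==` pin. The function name is hypothetical, the package names and versions in the usage are placeholders rather than the skill's real dependencies, and the parsing is a deliberate simplification of pip's requirements format.

```python
import re


def unpinned_requirements(lines):
    """Return requirement specifiers that are not pinned with '=='.

    Simplified on purpose: strips comments, skips blank lines and
    option lines (those starting with '-'), and ignores markers and URLs.
    """
    flagged = []
    for raw in lines:
        # A comment starts at '#' at line start or after whitespace.
        line = re.sub(r"(^|\s+)#.*$", "", raw).strip()
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            flagged.append(line)
    return flagged


# Placeholder requirement lines for illustration only.
sample = ["example-onnx-lib==1.0.0", "example-phonemizer", "# comment", "", "example-numeric>=1.2  # loose"]
flagged = unpinned_requirements(sample)
print(flagged)
```

Exact pins narrow the window for dependency poisoning but do not close it. pip's `--require-hashes` mode goes further by also checking each downloaded package against a recorded hash.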

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The script performs package installation and multiple external downloads without any explicit user warning, consent flow, or manifest-level disclosure, which undermines transparency and informed use. In the context of a supposedly local Windows TTS skill, undisclosed network and environment-changing actions are more concerning because users would reasonably expect offline, low-risk behavior.
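The missing consent flow could be as small as a prompt that lists the planned actions and defaults to refusal. The sketch below is one possible shape and is not taken from the skill. The function name, the `ask` parameter, and the action strings are hypothetical.

```python
def confirm_actions(actions, ask=input):
    """List the planned actions and return True only on an explicit 'yes'."""
    print("This script is about to:")
    for action in actions:
        print(f"  - {action}")
    answer = ask("Proceed? Type 'yes' to continue: ")
    return answer.strip().lower() == "yes"


# Hypothetical action descriptions based on the behaviors named in the finding.
PLANNED = [
    "install Python packages from a package repository",
    "download model assets from external hosts",
]
```

A script would call `confirm_actions(PLANNED)` before any install or download step and exit if it returns False. Passing `ask` as a parameter keeps the prompt testable without a terminal.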

VirusTotal

65/65 vendors reported this skill as clean.
