voice-huayan

Security checks across malware telemetry and agentic risk

Overview

This appears to be an incomplete local Chinese TTS skill, but it under-discloses network downloads and package installs and does not include the advertised Windows playback script.

Review this before installing. Treat it as a local TTS/model-preparation package, not a ready Windows playback skill: the documented runner is missing, and the included shell script can install Python packages and download remote model assets. Run it only in a contained environment, verify sources manually, and avoid elevated privileges.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (6)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 89%
Finding: The skill advertises a simple local TTS capability, yet the analysis detected shell, environment, and file read/write capabilities without any declared permissions. That creates a trust gap: users and reviewers cannot accurately assess what the skill may access or modify, and hidden execution/file capabilities can be abused for unintended local actions.
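
The trust gap in this finding can be read as a set difference between the capabilities a skill declares and the capabilities analysis detects. The following is a minimal sketch of that idea, not SkillSpector's actual logic: the helper name `undeclared_capabilities` and the capability labels are hypothetical and chosen only to mirror the wording of the finding.

```python
def undeclared_capabilities(declared, detected):
    """Return detected capabilities missing from the declaration, sorted for stable output."""
    return sorted(set(detected) - set(declared))


# Hypothetical labels for illustration; the finding reports no declared permissions.
declared = set()
detected = {"shell", "env", "file_read", "file_write"}

print(undeclared_capabilities(declared, detected))
```

An empty result would mean every detected capability was declared; anything else is a candidate for reviewer attention.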

Tp4

High

Category: MCP Tool Poisoning
Confidence: 98%
Finding: This is a serious description-behavior mismatch: the skill claims offline Windows TTS playback with a specific voice and fallback, but the analyzed behavior includes network downloads, package installation, archive extraction, model mutation, and support for arbitrary environment-controlled values while not actually implementing the claimed playback path. Such deception is dangerous because it can conceal supply-chain risk, unexpected code execution, and unauthorized system changes behind an innocuous TTS description.

Description-Behavior Mismatch

High

Confidence: 95%
Finding: The script fetches model files and related assets from external hosts at runtime based on environment-controlled variables, which expands the skill from local playback into network retrieval and supply-chain exposure. This is dangerous because execution now depends on remote content integrity, network trust, and mutable upstream artifacts, none of which are disclosed by the skill description.
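
One way a reviewer might reduce the exposure described in this finding is to accept downloads only from an allowlisted HTTPS host and to compare each file against a pinned SHA-256 digest before use. The sketch below is an assumption-laden illustration, not code from the skill: the host `models.example.com`, the constant `ALLOWED_HOSTS`, and the helper names are all hypothetical.

```python
import hashlib
from urllib.parse import urlparse

# Hypothetical allowlist; replace with hosts you have verified manually.
ALLOWED_HOSTS = {"models.example.com"}


def is_allowed_url(url):
    """Accept only HTTPS URLs whose host is on the allowlist."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_HOSTS


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_asset(path, expected_sha256):
    """Return True only if the file's SHA-256 digest matches the pinned value."""
    return sha256_of(path) == expected_sha256.lower()
```

Pinning digests means a changed upstream artifact fails verification instead of being used silently, and checking the URL before download keeps an environment variable from redirecting the fetch to an arbitrary host.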

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: Installing Python packages at runtime introduces unnecessary code-fetching and execution from package repositories for a skill described as local TTS playback. That creates supply-chain and environment-modification risk, especially if the skill runs with elevated privileges or in a shared host environment.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The script installs additional runtime packages for phonemization and ONNX execution just before running the Python entrypoint, broadening the skill's behavior beyond simple audio playback. This is risky because it pulls executable code into the environment on demand and can lead to nondeterministic behavior or compromise through dependency poisoning.
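
An alternative to installing packages on demand is to fail fast when pinned dependencies are not already present, leaving installation to a separate, reviewed step. The sketch below uses the standard-library `importlib.metadata` module; the helper names and any package names or version pins passed to them are hypothetical and are not taken from the skill.

```python
from importlib import metadata


def unmet_requirements(pinned, get_version=metadata.version):
    """Return {name: problem} for pinned packages that are missing or at another version."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if installed != wanted:
            problems[name] = f"installed {installed}, expected {wanted}"
    return problems


def ensure_requirements(pinned):
    """Raise instead of installing anything when the environment does not match the pins."""
    problems = unmet_requirements(pinned)
    if problems:
        details = "; ".join(f"{name}: {issue}" for name, issue in problems.items())
        raise RuntimeError(f"Refusing to run with unmet requirements ({details})")
```

Checking instead of installing keeps the entrypoint deterministic: the same pinned environment either runs or is rejected, and no new code is fetched at run time.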

Missing User Warnings

Medium

Confidence: 92%
Finding: The script performs package installation and multiple external downloads without any explicit user warning, consent flow, or manifest-level disclosure, which undermines transparency and informed use. In the context of a supposedly local Windows TTS skill, undisclosed network and environment-changing actions are more concerning because users would reasonably expect offline, low-risk behavior.
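
A consent flow of the kind this finding says is missing could be as small as listing the planned actions and proceeding only on an explicit confirmation. The following is a minimal sketch under that assumption; the function name `confirm_actions` and the prompt wording are hypothetical.

```python
def confirm_actions(actions, ask=input):
    """Show each planned action and return True only on an explicit 'yes'."""
    print("This step will:")
    for action in actions:
        print(f"  - {action}")
    answer = ask("Proceed? Type 'yes' to continue: ")
    return answer.strip().lower() == "yes"
```

Defaulting to refusal on anything other than "yes" means an empty or mistyped answer never triggers a download or install. The `ask` parameter exists so the prompt can be replaced in automated tests.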

VirusTotal

65 of 65 vendors reported this skill as clean.
