baiyin-voice-generate-skill

Security checks across malware telemetry and agentic risk

Overview

The voice-generation workflow mostly matches its purpose, but the skill also requires a silent pre-task self-update check that can change the installed skill without clear user approval.

Review before installing. The Baiyin API behavior itself is expected, but the publisher should either remove the silent self-update and version-check behavior or make it optional and subject to user approval. Use a revocable Baiyin API key, and upload only voice samples that you have permission to send to Baiyin and that you accept may be exposed through a returned URL.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (3)

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The skill requires an unrelated remote SkillHub version check and possible self-update before servicing any user request, which creates unnecessary network activity and expands the trust boundary beyond the voice-generation function. This can enable supply-chain risk, leak metadata about local skill deployment, and delay or alter behavior based on an external service that the user did not ask to contact.

Missing User Warnings

Medium

Confidence: 98%
Finding: The skill explicitly instructs silent remote version queries and to continue quietly even on failure, preventing user awareness of external communication and update behavior. Hidden preflight network activity undermines transparency and informed consent, and could be abused to beacon environment metadata or introduce updated instructions before handling the requested task.
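
One way a publisher could address this finding is to gate the version check behind explicit user approval and to report failures instead of continuing quietly. The sketch below is illustrative only: `gated_update_check`, `UpdateDecision`, and the `ask_user`, `fetch_latest`, and `apply_update` callbacks are hypothetical names, not part of the skill or of any SkillHub API, and the network and install steps are injected rather than implemented.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class UpdateDecision:
    checked: bool   # whether the remote version lookup was performed
    updated: bool   # whether an update was applied
    message: str    # what the user is told about the outcome


def gated_update_check(
    installed_version: str,
    ask_user: Callable[[str], bool],            # hypothetical: prompts the user, returns True on approval
    fetch_latest: Callable[[], Optional[str]],  # hypothetical: returns latest version, or None on failure
    apply_update: Callable[[str], None],        # hypothetical: installs the given version
) -> UpdateDecision:
    """Run a version check only after the user approves each step."""
    if not ask_user("Contact the skill registry to check for a newer version?"):
        # No network request is made without approval.
        return UpdateDecision(False, False, "Version check skipped at user request.")

    latest = fetch_latest()
    if latest is None:
        # Surface the failure instead of continuing quietly.
        return UpdateDecision(True, False, "Version check failed; continuing with the installed version.")

    if latest == installed_version:
        return UpdateDecision(True, False, f"Already on {installed_version}.")

    if not ask_user(f"Update from {installed_version} to {latest}?"):
        return UpdateDecision(True, False, f"Update to {latest} declined.")

    apply_update(latest)
    return UpdateDecision(True, True, f"Updated to {latest}.")
```

Passing the prompt and the network call in as callbacks keeps the consent logic separate from any particular registry client, so the same gate works whether the check is removed, made optional, or kept with approval.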

Missing User Warnings

High

Confidence: 94%
Finding: The skill directs users to upload reference audio to obtain a public URL without warning that the file may become externally accessible or shared with a third-party service. Because voice samples are biometric and often sensitive, nudging users into public exposure without an explicit privacy notice can lead to unintended disclosure, reuse, or unauthorized cloning of personal voice data.
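
The missing warning could be supplied by showing a privacy notice and requiring confirmation before any reference audio leaves the machine. The following is a minimal sketch under stated assumptions: `upload_with_notice`, `PRIVACY_NOTICE`, and the `confirm` and `upload` callbacks are hypothetical, and the actual Baiyin upload call is not shown because the report does not describe it.

```python
from typing import Callable, Optional

# Hypothetical notice text; a publisher would adapt the wording.
PRIVACY_NOTICE = (
    "This reference audio will be sent to a third-party service and may be "
    "reachable through a returned URL. Voice samples are biometric data. "
    "Upload only recordings you have permission to share."
)


def upload_with_notice(
    audio_path: str,
    confirm: Callable[[str], bool],  # hypothetical: shows the prompt, returns True on approval
    upload: Callable[[str], str],    # hypothetical: performs the upload, returns the resulting URL
) -> Optional[str]:
    """Show the privacy notice, then upload only if the user confirms."""
    prompt = f"{PRIVACY_NOTICE}\nUpload {audio_path}?"
    if not confirm(prompt):
        return None  # nothing is uploaded without confirmation
    return upload(audio_path)
```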

VirusTotal

64/64 vendors flagged this skill as clean.
