Iflytek Voiceclone Tts

Security checks across malware telemetry and agentic risk

Overview

This voice-cloning skill does what it claims, but it handles sensitive voice data and credentials with under-disclosed privacy/consent controls and insecure transport choices.

Review carefully before installing. Use this only for voices you own or have explicit permission to clone, avoid sensitive recordings or private text, and understand that training audio and synthesis requests go to third-party iFlytek services. The insecure TLS and plain-HTTP training endpoints mean credentials and biometric voice data may be exposed on the network; this should be fixed before handling real or sensitive voices.
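One way to act on the transport concern is to refuse any endpoint that is not HTTPS or WSS before credentials or audio leave the machine. The following is a minimal sketch, not the skill's actual code: the function name `require_secure_url` and the `example.com` URLs are hypothetical, and the real iFlytek endpoints would need to be checked against the provider's documentation.

```python
from urllib.parse import urlparse

# Schemes that provide transport encryption (assumption: the skill only
# needs HTTP-style and WebSocket-style endpoints).
SECURE_SCHEMES = {"https", "wss"}


def require_secure_url(url: str) -> str:
    """Return the URL unchanged if it uses an encrypted scheme, else raise."""
    parsed = urlparse(url)
    if parsed.scheme not in SECURE_SCHEMES:
        raise ValueError(f"refusing insecure endpoint scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("endpoint URL has no hostname")
    return url


if __name__ == "__main__":
    # Hypothetical endpoints, for illustration only.
    print(require_secure_url("https://example.com/train"))
    try:
        require_secure_url("http://example.com/train")
    except ValueError as err:
        print(err)
```

A check like this only rejects plain-text schemes; it does not replace certificate validation on the encrypted connection itself.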

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (6)

Intent-Code Divergence

High

Confidence: 99%
Finding: The WebSocket client explicitly disables TLS certificate validation by setting check_hostname=False and verify_mode=ssl.CERT_NONE for wss connections. This allows man-in-the-middle attackers to intercept or modify cloned-voice audio, text, res_id values, and API-authenticated traffic despite using WSS, which is especially sensitive in a voice-cloning skill handling biometric and speech data.
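For contrast with the `check_hostname=False` / `ssl.CERT_NONE` configuration described in this finding, a verified client context can be built from the standard library defaults. This is a sketch under the assumption that the skill is written in Python and builds its own `ssl.SSLContext`; the helper name `make_verified_tls_context` is hypothetical.

```python
import ssl


def make_verified_tls_context() -> ssl.SSLContext:
    """Build a client TLS context that validates certificates and hostnames."""
    # create_default_context() loads the system CA bundle and already enables
    # hostname checking and certificate verification for client use.
    context = ssl.create_default_context()
    # Set both explicitly so a later regression to CERT_NONE is easy to spot.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context


if __name__ == "__main__":
    ctx = make_verified_tls_context()
    print(ctx.check_hostname, ctx.verify_mode)
```

The resulting context would then be handed to whichever WebSocket client the skill uses, in place of the context that disables verification.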

Missing User Warnings

Medium

Confidence: 90%
Finding: The README instructs users to upload voice recordings to a third-party cloud voice-cloning service but does not warn that biometric voice data and input text will be transmitted off-device and may be retained or processed under the provider's policies. Because voiceprints are sensitive personal data and cloning can enable impersonation or fraud, omission of privacy/consent guidance materially increases the risk of unsafe or noncompliant use.

Missing User Warnings

High

Confidence: 96%
Finding: This skill enables voice cloning from uploaded audio or remote audio URLs but does not warn users about consent, impersonation, biometric privacy, or authorization requirements. Because voice data is highly sensitive and cloning can facilitate fraud or non-consensual impersonation, omission of these warnings materially increases the risk of misuse in this specific skill context.

Missing User Warnings

Medium

Confidence: 95%
Finding: The setup and workflow instruct users to provide API credentials and upload audio/text to external iFlytek endpoints, but they do not clearly disclose that these materials are transmitted to third-party services. This creates a privacy and security transparency gap: users may expose secrets and sensitive recordings without understanding the data flow, retention, or third-party processing implications.

Missing User Warnings

Medium

Confidence: 85%
Finding: The training workflow uploads local voice samples to a remote service without any explicit user-facing notice that biometric voice data is being transmitted off-host. Because voiceprints are sensitive personal data, silent transmission creates privacy, consent, and compliance risks, and the voice-cloning context makes this more serious than ordinary file upload behavior.
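A user-facing notice of the kind this finding asks for could be as small as an explicit confirmation step before any upload. The sketch below is illustrative only: `confirm_upload`, its parameters, and the notice wording are hypothetical and are not taken from the skill.

```python
from pathlib import Path
from typing import Callable


def confirm_upload(
    sample: Path,
    provider: str,
    ask: Callable[[str], str] = input,
) -> bool:
    """Show a disclosure notice and return True only on an explicit 'yes'."""
    prompt = (
        f"'{sample.name}' contains biometric voice data and will be sent "
        f"off this machine to {provider}, a third-party service.\n"
        "Only continue if you own this voice or have explicit permission "
        "to clone it.\n"
        "Type 'yes' to upload: "
    )
    # Anything other than a literal 'yes' (ignoring case and padding) declines.
    return ask(prompt).strip().lower() == "yes"


if __name__ == "__main__":
    # Simulated answers stand in for interactive input here.
    print(confirm_upload(Path("sample.wav"), "iFlytek", ask=lambda _: "yes"))
    print(confirm_upload(Path("sample.wav"), "iFlytek", ask=lambda _: ""))
```

Defaulting to refusal on empty or ambiguous input keeps the upload opt-in rather than opt-out.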

Missing User Warnings

Low

Confidence: 74%
Finding: The synthesis path sends user-provided text and a voice resource identifier to a remote WebSocket service without a clear privacy or network disclosure. While less sensitive than raw training audio, the text may contain private content and the res_id ties activity to a cloned voice asset, so undisclosed transmission still poses confidentiality and expectation-of-privacy concerns.

VirusTotal

64/64 vendors reported this skill as clean.
