Dlazy Kling Audio Clone

Security checks across malware telemetry and agentic risk

Overview

This voice-cloning skill mostly discloses what it does, but it needs human review because it can upload voice data and generate cloned speech without clear consent safeguards, and parts of its own instructions describe the wrong media workflow.

Review before installing. Use only voices and audio you have clear rights and consent to use, prefer npx over a global install if you only need one-off use, verify exactly which local audio file will be uploaded, and do not rely on the documented examples until the publisher fixes the output schema and command examples.

SkillSpector

By NVIDIA

Vulnerability Patterns

Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (5)

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The documented output for an audio-clone skill shows image results (`type: image`, PNG URL), which is inconsistent with the skill’s stated purpose. This can mislead agents or users into handling outputs incorrectly, causing unsafe automation decisions, broken validation logic, or accidental processing of unexpected content from a remote service.
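
One way to guard against this mismatch is to check each returned result before passing it on. The sketch below assumes the result arrives as a dictionary with `type` and `url` fields, as the documented example implies, and that a correct result would carry `type: audio`. The function name, the `audio` value, and the extension list are illustrative assumptions, not part of the skill.

```python
import os
from urllib.parse import urlparse

# Illustrative allowlist (assumption): the formats the service returns are not documented.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac"}


def is_expected_audio_result(result: dict) -> bool:
    """Return True only if the result claims to be audio and its URL path has an audio extension."""
    # "audio" is an assumed type value; the documented example shows "image".
    if result.get("type") != "audio":
        return False
    # Inspect only the URL path so query strings do not hide the extension.
    path = urlparse(str(result.get("url", ""))).path
    extension = os.path.splitext(path)[1].lower()
    return extension in AUDIO_EXTENSIONS
```

A caller would reject any result for which this returns False instead of downloading or forwarding it.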

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The command examples and error text describe a prompt-driven image/video workflow rather than voice cloning, including references to `--prompt` and image/video file handling. This mismatch can cause an agent to invoke the wrong parameters or upload unintended local files, increasing the chance of data exposure and unsafe tool use.
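
Because the risk here is uploading the wrong local file, a pre-upload check can make the selection explicit. The following is a minimal sketch, not the skill's own behavior: it resolves the path, refuses anything that is not a regular file with an audio extension, and returns the resolved path, size, and SHA-256 digest so a person can confirm the file before it leaves the machine. The function name and the extension list are assumptions.

```python
import hashlib
from pathlib import Path

# Illustrative allowlist (assumption): the skill's accepted input formats are not documented here.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac"}


def describe_upload_candidate(path_text: str) -> dict:
    """Resolve a candidate file and return details to confirm before any upload."""
    # Resolve symlinks and relative segments so the reported path is the real target.
    path = Path(path_text).expanduser().resolve()
    if not path.is_file():
        raise ValueError(f"not a regular file: {path}")
    if path.suffix.lower() not in AUDIO_EXTENSIONS:
        raise ValueError(f"not an audio file by extension: {path}")
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return {"path": str(path), "bytes": path.stat().st_size, "sha256": digest}
```

The extension check is only a coarse filter; it does not prove the file contains audio.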

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The documented output schema and command examples do not match the skill’s stated purpose or listed CLI options: an audio-cloning skill claims image outputs and uses an undocumented `--prompt` parameter instead of the actual audio/name inputs. This can mislead an agent into invoking the tool incorrectly, mishandling returned data, or trusting malformed outputs, which is especially risky because the skill uploads local files and interacts with external APIs.

Missing User Warnings

High

Confidence: 95%
Finding: A voice-cloning skill without clear warnings about consent, impersonation risk, and authorization requirements creates a meaningful abuse path. In this context, the skill enables generation that could imitate real people, so missing guardrails materially increases the risk of fraud, harassment, or non-consensual biometric misuse.
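
A wrapper around the skill could enforce the missing guardrail by refusing to proceed without an explicit consent record. The sketch below is one possible shape for such a gate; `VoiceConsent` and `require_consent` are hypothetical names, and nothing here reflects how the skill itself is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceConsent:
    """Hypothetical record of who is being cloned and whether that is permitted."""
    speaker_name: str
    speaker_consented: bool
    requester_authorized: bool


def require_consent(consent: VoiceConsent) -> None:
    """Raise PermissionError unless the speaker is named, has consented, and the requester is authorized."""
    if not consent.speaker_name.strip():
        raise PermissionError("the speaker must be identified before cloning")
    if not consent.speaker_consented:
        raise PermissionError(f"no recorded consent from {consent.speaker_name}")
    if not consent.requester_authorized:
        raise PermissionError("the requester is not authorized to use this voice")
```

Calling `require_consent` before any upload turns a missing warning into a hard stop, although it still relies on the record being filled in honestly.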

Vague Triggers

Medium

Confidence: 86%
Finding: The trigger keywords include broad phrases such as 'clone voice' and 'custom speech' without clear scoping, making accidental or overly eager invocation more likely in unrelated conversations. In a skill that uploads reference audio and generates voice-cloned content through a remote service, this increases the chance of unintended activation and processing of sensitive media.
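
Tighter scoping could require both an explicit cloning request and a reference audio file before the skill activates. The sketch below illustrates that idea under stated assumptions: the pattern, the function name, and the `has_reference_audio` flag are hypothetical and are not taken from the skill's trigger configuration.

```python
import re

# Assumed intent pattern: the words "clone"/"cloning" and "voice" must both appear, in either order.
CLONE_INTENT = re.compile(
    r"\b(clone|cloning)\b.*\bvoice\b|\bvoice\b.*\b(clone|cloning)\b",
    re.IGNORECASE,
)


def should_activate(message: str, has_reference_audio: bool) -> bool:
    """Activate only when the user asks for voice cloning and has supplied reference audio."""
    return has_reference_audio and bool(CLONE_INTENT.search(message))
```

Under this rule a passing mention of "custom speech" does not activate the skill, and neither does a cloning request with no audio attached.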

VirusTotal

VirusTotal findings are pending for this skill version.
