Clonev

Security checks across malware telemetry and agentic risk

Overview

Clonev's purpose as a voice-cloning tool is clear, but its instructions tell the agent to send cloned audio to Telegram, and it keeps copied voice samples on local disk with no documented cleanup.

Before installing, confirm you are comfortable running a Docker-based voice-cloning tool, use it only with voices you have permission to clone, do not let it send audio unless you explicitly approve the channel and recipient, and periodically delete retained samples from the configured voice-samples directory.

VirusTotal

64/64 vendors flagged this skill as clean.


Risk analysis

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

ASI09: Human-Agent Trust Exploitation
Medium
What this means

A user or recipient could be tricked by audio that sounds like a real person who did not consent.

Why it was flagged

The skill explicitly advertises cloning voices of people other than the user; without consent or labeling safeguards in the main workflow, generated audio can be used to mislead listeners.

Skill content
Works with: Any voice! Yours, a celebrity, a character, etc.
Recommendation

Use only voices you own or have explicit permission to clone, and clearly label generated audio as synthetic.

ASI02: Tool Misuse and Exploitation
Medium
What this means

The agent could send cloned-voice audio to a messaging channel before the user has confirmed the recipient, channel, or consent.

Why it was flagged

The quick reference turns a generic generation request into a Telegram send action, which is a third-party sharing action not explicitly requested in that example.

Skill content
USER: "Clone my voice and say 'hello'" ... → Send: message action=send channel=telegram asVoice=true filePath="$VOICE"
Recommendation

Require an explicit user request and final confirmation before sending or publishing any cloned voice audio.
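
One way to apply this recommendation is a confirmation gate that runs before any send step. The following is a minimal sketch, not the skill's actual code: the function name confirm_send and its three arguments are hypothetical, and it assumes the agent can prompt the user on standard input. It returns success only when the user types the full word "yes".

```shell
# Hypothetical helper: require an explicit "yes" before sending cloned audio.
# Arguments (all illustrative): $1 = channel, $2 = recipient, $3 = audio file.
confirm_send() {
  # Print the prompt on stderr so stdout stays free for other output.
  printf 'Send cloned audio "%s" to %s on %s? Type "yes" to confirm: ' \
    "$3" "$2" "$1" >&2
  # A closed or empty stdin counts as a refusal.
  IFS= read -r confirm_reply || return 1
  [ "$confirm_reply" = "yes" ]
}

# Intended use (commented out so the sketch has no side effects):
#   if confirm_send telegram "$RECIPIENT" "$VOICE"; then
#     ... run the send action quoted above ...
#   fi
```

Defaulting to refusal on anything other than "yes", including end of input, keeps an unattended run from sending audio by accident.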

ASI06: Memory and Context Poisoning
Medium
What this means

A sensitive voice sample may remain stored locally after use, which could expose or reuse a person's voice without the user realizing it.

Why it was flagged

The script copies the user-provided voice sample into a persistent local samples directory and only deletes the generated WAV, leaving the original voice sample retained for future access or reuse.

Skill content
cp "$VOICE_SAMPLE" "${COQUI_DIR}/voice-samples/" ... rm -f "$OUTPUT_WAV"
Recommendation

Disclose the retained sample location, ask before retaining voice samples, use per-run filenames, and provide a cleanup option that deletes copied samples.
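
The per-run filename and cleanup parts of this recommendation could look roughly like the sketch below. COQUI_DIR and the voice-samples directory come from the quoted script; the function names stage_voice_sample and cleanup_voice_sample are hypothetical. The sketch also assumes the tool does not depend on the sample keeping its original filename or extension, which the quoted snippet does not confirm.

```shell
# Hypothetical helpers: stage a voice sample under a unique per-run name
# and remove it afterwards. Assumes COQUI_DIR is set as in the quoted script.
stage_voice_sample() {
  samples_dir="${COQUI_DIR}/voice-samples"
  mkdir -p "$samples_dir" || return 1
  # mktemp creates an empty file with a unique name inside samples_dir.
  staged="$(mktemp "${samples_dir}/sample.XXXXXX")" || return 1
  # Copy the user's sample over the placeholder; undo on failure.
  cp -- "$1" "$staged" || { rm -f -- "$staged"; return 1; }
  # Print the staged path so the caller can disclose and later delete it.
  printf '%s\n' "$staged"
}

cleanup_voice_sample() {
  # Delete only the file staged for this run; the original is untouched.
  rm -f -- "$1"
}

# Intended use (commented out so the sketch has no side effects):
#   STAGED="$(stage_voice_sample "$VOICE_SAMPLE")" || exit 1
#   trap 'cleanup_voice_sample "$STAGED"' EXIT
```

Registering the cleanup with trap on EXIT means the copied sample is removed even if the generation step fails partway through.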

ASI04: Agentic Supply Chain Vulnerabilities
Low
What this means

The tool may download and run third-party container code that can change between runs, and it may fail or behave differently if Docker or ffmpeg is unavailable.

Why it was flagged

The skill runs Docker and ffmpeg and uses an unpinned 'latest' container image, while the registry metadata declares no required binaries or install spec.

Skill content
docker run --rm --entrypoint "" ... ghcr.io/coqui-ai/tts:latest ... ffmpeg -y -i "$OUTPUT_WAV"
Recommendation

Declare Docker and ffmpeg requirements, pin the container image by version or digest, and document the expected download and runtime behavior.
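
As an illustration of the first two points, a preflight step could check that the required binaries exist and refuse image references that float. This is a sketch under stated assumptions: require_tools and is_pinned_image are hypothetical names, and the pinning check is a simple string test that treats a digest or any tag other than latest as pinned. It does not verify that the tag or digest actually exists in the registry.

```shell
# Hypothetical preflight helpers for the Docker and ffmpeg dependencies.
require_tools() {
  # Fail on the first tool that is not on PATH.
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf 'missing required tool: %s\n' "$tool" >&2
      return 1
    fi
  done
}

is_pinned_image() {
  # Look only at the last path component so a registry port
  # (for example localhost:5000/image) is not mistaken for a tag.
  image_name="${1##*/}"
  case "$image_name" in
    *@sha256:*) return 0 ;;  # pinned by digest
    *:latest)   return 1 ;;  # floating tag
    *:*)        return 0 ;;  # explicit version tag
    *)          return 1 ;;  # no tag, which resolves to latest
  esac
}

# Intended use (commented out so the sketch has no side effects):
#   require_tools docker ffmpeg || exit 1
#   is_pinned_image "$IMAGE" || { echo "pin the image first" >&2; exit 1; }
```

A digest pin is the stricter of the two options, since a version tag can still be repointed by the image publisher.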