Characteristic Voice

Security checks across malware telemetry and agentic risk

Overview

This voice-generation skill is mostly transparent, but it needs review because it can upload text and voice samples to Noiz and includes guidance for cloning character or third-party voices from online media.

Install only if you are comfortable with a skill that can clone voices and send text or reference audio to Noiz. Prefer the Kokoro backend for private or offline speech, avoid uploading private or unlicensed voice recordings, and remove ~/.noiz_api_key if you no longer want the Noiz key stored locally.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (5)

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The skill goes beyond speech styling and documents creation and use of third-party reference audio for voice cloning, including character imitation. In context, this is more dangerous because the skill is framed as a friendly TTS enhancer, which may normalize or hide impersonation risk, copyright/privacy violations, and unauthorized transmission of someone else's voice to a third-party service.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The documentation includes downloading online media, scraping subtitles, and extracting audio segments with `yt-dlp`, `rg`, and `ffmpeg`, which materially broadens capability beyond expressive TTS. These instructions increase legal, privacy, and abuse risk by enabling acquisition of third-party voice samples for cloning, especially when paired with the upload path to the Noiz service.

Vague Triggers

Medium

Confidence: 80%
Finding: The trigger phrases are broad and common enough that the skill may activate on ordinary conversation, causing unintended use of shell-backed TTS workflows or remote services. In this context that matters because accidental activation could lead to unnecessary data transmission, file creation, or voice-cloning-related behavior without clear user intent.
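To illustrate why broad triggers matter, here is a minimal sketch comparing keyword matching with explicit-phrase matching. The keyword and phrase lists are hypothetical and are not taken from the skill; the scan does not show how the skill's activation logic is actually implemented.

```python
import re

# Hypothetical trigger lists, for illustration only (not taken from the skill).
BROAD_KEYWORDS = ["voice", "say", "speak"]
EXPLICIT_PHRASES = ["generate speech", "text to speech", "read this aloud"]


def matches_broad(message: str) -> bool:
    """True if any single common keyword appears as a whole word."""
    lowered = message.lower()
    return any(re.search(rf"\b{re.escape(k)}\b", lowered) for k in BROAD_KEYWORDS)


def matches_explicit(message: str) -> bool:
    """True only if the message contains a full, intent-bearing phrase."""
    lowered = message.lower()
    return any(phrase in lowered for phrase in EXPLICIT_PHRASES)


if __name__ == "__main__":
    casual = "Can you say more about that?"
    print(matches_broad(casual), matches_explicit(casual))
```

Under these assumed lists, an ordinary sentence such as "Can you say more about that?" fires the keyword matcher but not the phrase matcher, which is the kind of accidental activation the finding describes.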

Missing User Warnings

Medium

Confidence: 93%
Finding: The script sends user-provided text and optional reference audio to the remote Noiz API, but it does not present an explicit warning or confirmation at the point of transmission. In a voice-cloning and expressive speech skill, reference audio may contain biometric voice data and the text may contain sensitive content, so silent cloud upload creates a real privacy and data-handling risk.
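One possible mitigation is a confirmation gate that runs before any upload. The following is a sketch only: `describe_upload` and `confirm_upload` are hypothetical helper names, not functions from the skill, and the actual upload call is deliberately left out.

```python
from typing import Callable, Optional


def describe_upload(text: str, reference_audio: Optional[str]) -> str:
    """Summarize what would leave the machine, without echoing the text itself."""
    parts = [f"{len(text)} characters of text"]
    if reference_audio is not None:
        parts.append(f"reference audio file '{reference_audio}'")
    return " and ".join(parts)


def confirm_upload(
    text: str,
    reference_audio: Optional[str] = None,
    ask: Callable[[str], str] = input,
) -> bool:
    """Return True only if the user explicitly answers 'yes'."""
    prompt = (
        f"About to send {describe_upload(text, reference_audio)} "
        "to the remote Noiz API. Type 'yes' to continue: "
    )
    return ask(prompt).strip().lower() == "yes"
```

A caller would run `confirm_upload(...)` immediately before the network request and stop if it returns `False`. The `ask` parameter exists so the prompt can be replaced in non-interactive runs or tests.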

Missing User Warnings

Low

Confidence: 87%
Finding: The skill stores the API key in a file under the user's home directory (~/.noiz_api_key) without a strong warning that credentials will be persisted locally. Although the file's permissions are restricted to 600, users may still unintentionally leave long-lived secrets on disk, which is a security hygiene issue rather than an immediate compromise.
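For reference, a minimal sketch of how a key file can be written with owner-only permissions, inspected, and removed on a POSIX system. The helper names are hypothetical, and this is not necessarily how the skill itself writes the file.

```python
import os
import stat
from pathlib import Path


def save_api_key(path: Path, key: str) -> None:
    """Write the key to a file that only the owner can read or write (0600)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(key)
    os.chmod(path, 0o600)  # also tightens a file that already existed


def key_file_mode(path: Path) -> int:
    """Return the permission bits of the file, e.g. 0o600."""
    return stat.S_IMODE(path.stat().st_mode)


def remove_api_key(path: Path) -> bool:
    """Delete the stored key; return True if a file was actually removed."""
    try:
        path.unlink()
        return True
    except FileNotFoundError:
        return False
```

Calling `remove_api_key(Path.home() / ".noiz_api_key")` would delete the stored key. Restrictive permissions limit who can read the file, but the key remains on disk until it is removed.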

VirusTotal

64/64 vendors flagged this skill as clean.
