Her Voice

Security checks across malware telemetry and agentic risk

Overview

Her Voice is a coherent local text-to-speech skill with disclosed setup downloads, local configuration, and an optional background daemon.

Install only if you are comfortable with setup downloading third-party TTS packages and models, creating ~/.her-voice, and optionally running a local daemon that uses significant RAM. Do not paste passwords or tokens into the visualizer, and review the uninstall command before running it because it deletes all Her Voice local data.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Tool Misuse: Tool Parameter Abuse, Chaining Abuse, Unsafe Defaults
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration

Findings (11)

subprocess module call

Medium

Category: Dangerous Code Execution
Content:
    pip = os.path.join(MLX_VENV_DIR, "bin", "pip")
    subprocess.run([pip, "install", "--upgrade", "pip"], check=True, capture_output=True)
    print(" Installing mlx-audio + numpy (this may take a few minutes)...")
    subprocess.run([pip, "install", "mlx-audio", "numpy"], check=True)
    print_ok(f"Installed mlx-audio at: {MLX_VENV_DIR}")
    return MLX_VENV_DIR
Confidence: 91%
Finding: subprocess.run([pip, "install", "mlx-audio", "numpy"], check=True)

subprocess module call

Medium

Category: Dangerous Code Execution
Content:
    pip = os.path.join(PYTORCH_VENV_DIR, "bin", "pip")
    subprocess.run([pip, "install", "--upgrade", "pip"], check=True, capture_output=True)
    print(" Installing kokoro + soundfile + numpy (this may take a few minutes)...")
    subprocess.run([pip, "install", "kokoro>=0.8", "soundfile", "numpy"], check=True)
    print_ok(f"Installed kokoro at: {PYTORCH_VENV_DIR}")
    return PYTORCH_VENV_DIR
Confidence: 92%
Finding: subprocess.run([pip, "install", "kokoro>=0.8", "soundfile", "numpy"], check=True)

subprocess module call

Medium

Category: Dangerous Code Execution
Content:
    """Download the Kokoro model by doing a test load (MLX)."""
    print_step(f"Downloading model: {model_name}...")
    python = os.path.join(venv_path, "bin", "python3")
    result = subprocess.run([
        python, "-c",
        f"from mlx_audio.tts.utils import load_model; m = load_model(model_path='{model_name}'); print('Model loaded, sample_rate:', getattr(m, 'sample_rate', 24000))"
    ], capture_output=True, text=True, timeout=300)
Confidence: 90%
Finding: result = subprocess.run([ python, "-c", f"from mlx_audio.tts.utils import load_model; m = load_model(model_path='{model_name}'); print('Model loaded, sample_rate:', getattr(m, 'sam
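The flagged call formats `model_name` directly into the source string passed to `python -c`, so a model name containing a quote could change the code that runs. One common alternative is to pass the value as a separate argument and read it from `sys.argv` in the child. The sketch below is illustrative only: `CHILD_SOURCE`, `build_command`, and `run_check` are hypothetical names, and the child merely echoes the argument, where a real adaptation would presumably call `load_model(model_path=sys.argv[1])` instead.

```python
import subprocess
import sys

# Hypothetical child program: it reads the model name from sys.argv[1]
# rather than having the name formatted into its source text.
CHILD_SOURCE = "import sys; print('model name received:', sys.argv[1])"


def build_command(python_path, model_name):
    """Return an argv list that carries model_name as data, not as code."""
    return [python_path, "-c", CHILD_SOURCE, model_name]


def run_check(python_path, model_name, timeout=300):
    """Run the child and return its stdout with surrounding whitespace stripped."""
    result = subprocess.run(
        build_command(python_path, model_name),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout.strip()
```

Because the name travels as its own argv element, quotes or semicolons inside it are never parsed as Python by the child interpreter.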

Lp3

Medium

Category: MCP Least Privilege
Confidence: 91%
Finding: The skill clearly instructs the agent to execute shell commands, write configuration under ~/.her-voice, and manage local files, yet no permissions are declared in metadata. This creates a trust gap: a host or reviewer may underestimate the skill's operational capabilities, increasing the chance of unintended command execution or file modification.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The code implements an extra data flow beyond simple voice playback: on Cmd+V it reads clipboard text, sanitizes it, and sends it over a Unix socket to a local TTS daemon. Even though the daemon is local, clipboard contents often contain sensitive data, and this transfer is not clearly disclosed or constrained in this file, creating an unintended exfiltration surface to another process.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: Clipboard access is triggered by Cmd+V and is not implied by the stated purpose of merely giving the agent a voice. Because clipboard data can contain passwords, tokens, or private messages, adding this capability without prominent disclosure expands the skill's access to sensitive user data in a way users may not expect.

Missing User Warnings

Medium

Confidence: 84%
Finding: The uninstall section includes a destructive deletion command that removes the entire ~/.her-voice directory, but the warning is mild and not explicit about irreversible data loss. Users or agents may execute it without understanding that configs, binaries, models, logs, and daemon state will be permanently deleted.

Missing User Warnings

Medium

Confidence: 93%
Finding: The implementation sends clipboard-derived text to a local TTS daemon with no visible warning, confirmation, or disclosure in this file. This creates a silent cross-process transfer of potentially sensitive user data, which is especially risky because users may treat paste as a local UI action rather than consent to transmit data to another component.
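A confirmation step before the cross-process transfer is one way to address this finding. The following is a minimal sketch, not the skill's actual code: `gate_clipboard_text`, the `confirm` callback, and the `max_chars` cap are all assumptions introduced for illustration. It returns the text to forward only when the user explicitly agrees, and shows a character count rather than the clipboard contents.

```python
def gate_clipboard_text(text, confirm, max_chars=2000):
    """Return the text to forward to the local daemon, or None if nothing should be sent.

    confirm is a hypothetical callback that receives a prompt string and
    returns True only when the user explicitly agrees to the transfer.
    """
    cleaned = text.strip()
    if not cleaned:
        return None  # nothing worth sending
    clipped = cleaned[:max_chars]  # cap how much clipboard data can leave the UI process
    prompt = f"Send {len(clipped)} characters from the clipboard to the local TTS daemon?"
    if not confirm(prompt):
        return None
    return clipped
```

A caller would pass a UI dialog or terminal prompt as `confirm`, and write to the Unix socket only when the return value is not `None`.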

Missing User Warnings

Medium

Confidence: 94%
Finding: The setup flow performs package installs, model downloads, binary compilation, config writes, and even in-place patching of installed library files without an explicit up-front warning or consent checkpoint. In the context of an agent skill installer, hidden side effects increase the chance users will run code they do not fully understand and reduce informed consent for security-sensitive changes.
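An up-front consent checkpoint of the kind this finding asks for could look like the sketch below. The action list paraphrases the side effects named in the finding; `SETUP_ACTIONS`, `format_disclosure`, and `confirm_setup` are hypothetical names, not part of the skill.

```python
# Side effects taken from the finding above; the wording is illustrative.
SETUP_ACTIONS = [
    "install third-party TTS packages with pip",
    "download TTS models",
    "compile a local binary",
    "write configuration under ~/.her-voice",
    "patch installed library files in place",
]


def format_disclosure(actions):
    """Build a numbered summary of everything setup is about to do."""
    lines = ["Setup will perform the following actions:"]
    lines += [f"  {i}. {action}" for i, action in enumerate(actions, start=1)]
    return "\n".join(lines)


def confirm_setup(actions, ask=input, out=print):
    """Show the disclosure and proceed only on an explicit 'yes'."""
    out(format_disclosure(actions))
    answer = ask("Type 'yes' to continue: ")
    return answer.strip().lower() == "yes"
```

Requiring the full word `yes`, rather than accepting an empty line or `y`, makes it harder for a user or an agent to approve the side effects by accident.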

Tool Parameter Abuse

High

Category: Tool Misuse
Content:
    Remove all Her Voice data (config, venvs, compiled binary, daemon state):
    ```bash
    python3 SKILL_DIR/scripts/daemon.py stop
    rm -rf ~/.her-voice
    ```
    ## How It Works
Confidence: 96%
Finding: rm -rf ~

Tool Parameter Abuse

High

Category: Tool Misuse
Content:
    Remove all Her Voice data (config, venvs, compiled binary, daemon state):
    ```bash
    python3 SKILL_DIR/scripts/daemon.py stop
    rm -rf ~/.her-voice
    ```
    ## How It Works
Confidence: 96%
Finding: rm -rf ~/
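The two findings above match on the `rm -rf ~` prefix of the documented command, which actually targets `~/.her-voice`. The risk they point at is a mistyped or truncated path deleting the whole home directory. A guarded removal helper is one way to reduce that risk. The sketch below is an assumption-laden illustration: `remove_data_dir` is a hypothetical function, and it refuses to delete anything that does not resolve to a direct child of the given home directory with the expected name.

```python
import shutil
from pathlib import Path


def remove_data_dir(home, dirname=".her-voice"):
    """Delete home/dirname only if it resolves to a direct child of home.

    Returns True if a directory was removed, False if none existed.
    Raises ValueError when the resolved target is anything other than
    the expected data directory (for example, the home directory itself).
    """
    home = Path(home).resolve()
    target = (home / dirname).resolve()
    if target.parent != home or target.name != dirname:
        raise ValueError(f"refusing to delete unexpected path: {target}")
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

A caller would typically invoke it as `remove_data_dir(Path.home())` after stopping the daemon. Resolving the path first means a symlink pointing outside the home directory is rejected rather than followed.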

VirusTotal

63/63 vendors flagged this skill as clean.
