Her Voice

Security checks across malware telemetry and agentic risk

Overview

Her Voice is a coherent local text-to-speech skill with disclosed setup downloads, local configuration, and an optional background daemon.

Install only if you are comfortable with setup downloading third-party TTS packages and models, creating ~/.her-voice, and optionally running a local daemon that uses significant RAM. Do not paste passwords or tokens into the visualizer, and review the uninstall command before running it because it deletes all Her Voice local data.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Tool Misuse: Tool Parameter Abuse, Chaining Abuse, Unsafe Defaults
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
Findings (11)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
pip = os.path.join(MLX_VENV_DIR, "bin", "pip")
    subprocess.run([pip, "install", "--upgrade", "pip"], check=True, capture_output=True)
    print("   Installing mlx-audio + numpy (this may take a few minutes)...")
    subprocess.run([pip, "install", "mlx-audio", "numpy"], check=True)
    print_ok(f"Installed mlx-audio at: {MLX_VENV_DIR}")
    return MLX_VENV_DIR
Confidence
91% confidence
Finding
subprocess.run([pip, "install", "mlx-audio", "numpy"], check=True)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
pip = os.path.join(PYTORCH_VENV_DIR, "bin", "pip")
    subprocess.run([pip, "install", "--upgrade", "pip"], check=True, capture_output=True)
    print("   Installing kokoro + soundfile + numpy (this may take a few minutes)...")
    subprocess.run([pip, "install", "kokoro>=0.8", "soundfile", "numpy"], check=True)
    print_ok(f"Installed kokoro at: {PYTORCH_VENV_DIR}")
    return PYTORCH_VENV_DIR
Confidence
92% confidence
Finding
subprocess.run([pip, "install", "kokoro>=0.8", "soundfile", "numpy"], check=True)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
"""Download the Kokoro model by doing a test load (MLX)."""
    print_step(f"Downloading model: {model_name}...")
    python = os.path.join(venv_path, "bin", "python3")
    result = subprocess.run([
        python, "-c",
        f"from mlx_audio.tts.utils import load_model; m = load_model(model_path='{model_name}'); print('Model loaded, sample_rate:', getattr(m, 'sample_rate', 24000))"
    ], capture_output=True, text=True, timeout=300)
Confidence
90% confidence
Finding
result = subprocess.run([ python, "-c", f"from mlx_audio.tts.utils import load_model; m = load_model(model_path='{model_name}'); print('Model loaded, sample_rate:', getattr(m, 'sam
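
In the flagged call, model_name is formatted directly into the Python source passed to -c, so a name containing a quote would change the code that runs. A minimal sketch of one alternative, assuming the child only needs the model name: keep the child program a constant string and pass the name as a command-line argument. CHILD_CODE and build_model_check_command are hypothetical names, not part of the skill.

```python
import subprocess
import sys

# Constant child program: the model name is read from sys.argv[1]
# instead of being interpolated into the source text.
CHILD_CODE = (
    "import sys\n"
    "from mlx_audio.tts.utils import load_model\n"
    "m = load_model(model_path=sys.argv[1])\n"
    "print('Model loaded, sample_rate:', getattr(m, 'sample_rate', 24000))\n"
)


def build_model_check_command(python, model_name):
    """Return an argv list; model_name travels as data, never as code."""
    return [python, "-c", CHILD_CODE, model_name]


if __name__ == "__main__":
    # Prints the argv list only; running it requires mlx-audio in the venv.
    print(build_model_check_command(sys.executable, "example-model-name"))
```

With `python -c CODE ARG`, the interpreter sets sys.argv[0] to "-c" and sys.argv[1] to ARG, so quotes in the name stay inside a string value.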

Lp3

Medium
Category
MCP Least Privilege
Confidence
91% confidence
Finding
The skill clearly instructs the agent to execute shell commands, write configuration under ~/.her-voice, and manage local files, yet no permissions are declared in metadata. This creates a trust gap: a host or reviewer may underestimate the skill's operational capabilities, increasing the chance of unintended command execution or file modification.

Description-Behavior Mismatch

Medium
Confidence
91% confidence
Finding
The code implements an extra data flow beyond simple voice playback: on Cmd+V it reads clipboard text, sanitizes it, and sends it over a Unix socket to a local TTS daemon. Even though the daemon is local, clipboard contents often contain sensitive data, and this transfer is not clearly disclosed or constrained in this file, creating an unintended exfiltration surface to another process.

Context-Inappropriate Capability

Medium
Confidence
94% confidence
Finding
Clipboard access is triggered by Cmd+V and is not implied by the stated purpose of merely giving the agent a voice. Because clipboard data can contain passwords, tokens, or private messages, adding this capability without prominent disclosure expands the skill's access to sensitive user data in a way users may not expect.

Missing User Warnings

Medium
Confidence
84% confidence
Finding
The uninstall section includes a destructive deletion command that removes the entire ~/.her-voice directory, but the warning is mild and not explicit about irreversible data loss. Users or agents may execute it without understanding that configs, binaries, models, logs, and daemon state will be permanently deleted.
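
One way to address this finding is to replace the bare rm -rf with a guarded removal that states the consequence and asks for confirmation. The sketch below is illustrative only: remove_data_dir is a hypothetical helper, not part of the skill, and it assumes the data directory is always named .her-voice.

```python
import shutil
from pathlib import Path


def remove_data_dir(target, expected_name=".her-voice", confirm=input):
    """Delete the data directory only after a name check and explicit consent.

    Returns True if the directory was deleted, False otherwise.
    """
    path = Path(target).expanduser().resolve()
    # Refuse anything that is not the expected directory name,
    # which also rules out "/" and a bare home directory.
    if path.name != expected_name:
        raise ValueError(f"refusing to delete unexpected path: {path}")
    if not path.is_dir():
        return False
    answer = confirm(
        f"Permanently delete {path} (configs, venvs, models, logs)? Type 'yes': "
    )
    if answer.strip().lower() != "yes":
        return False
    shutil.rmtree(path)
    return True
```

Passing confirm as a parameter lets an agent host route the question to the user instead of answering on the user's behalf.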

Missing User Warnings

Medium
Confidence
93% confidence
Finding
The implementation sends clipboard-derived text to a local TTS daemon with no visible warning, confirmation, or disclosure in this file. This creates a silent cross-process transfer of potentially sensitive user data, which is especially risky because users may treat paste as a local UI action rather than consent to transmit data to another component.
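
The clipboard findings could be mitigated with a check that runs before any pasted text is handed to the daemon. The following is a rough sketch, not the skill's code: SECRET_PATTERNS, looks_sensitive, and text_to_speak are hypothetical names, and the patterns are deliberately simple heuristics that will miss many kinds of secrets.

```python
import re

# Illustrative heuristics only; real secrets take many more forms.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"\b(?:password|passwd|secret|token|api[_-]?key)\b\s*[:=]",
               re.IGNORECASE),
    re.compile(r"\b[A-Za-z0-9_\-]{32,}\b"),  # long unbroken token-like run
]


def looks_sensitive(text):
    """Return True if the text matches any of the secret-like patterns."""
    return any(p.search(text) for p in SECRET_PATTERNS)


def text_to_speak(clipboard_text, confirm):
    """Return the text to forward to the daemon, or None if withheld.

    confirm is a callable that asks the user and returns True or False;
    it is only consulted when the text looks sensitive.
    """
    if looks_sensitive(clipboard_text) and not confirm(clipboard_text):
        return None
    return clipboard_text
```

A check like this does not replace disclosure; it only adds a confirmation step for the most obvious cases.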

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The setup flow performs package installs, model downloads, binary compilation, config writes, and even in-place patching of installed library files without an explicit up-front warning or consent checkpoint. In the context of an agent skill installer, hidden side effects increase the chance users will run code they do not fully understand and reduce informed consent for security-sensitive changes.
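
A consent checkpoint of the kind this finding asks for can be small. The sketch below lists the side effects named in the finding and proceeds only on an explicit yes. PLANNED_ACTIONS and confirm_setup are hypothetical names, and the wording of each action is an assumption based on this report, not taken from the skill's setup script.

```python
# Side effects named in the finding, listed up front for the user.
PLANNED_ACTIONS = [
    "install third-party TTS packages with pip",
    "download TTS models",
    "compile a binary",
    "write configuration under ~/.her-voice",
    "patch installed library files in place",
]


def confirm_setup(actions, ask=input, out=print):
    """Show every planned change, then return True only on 'y' or 'yes'."""
    out("Setup will make the following changes:")
    for number, action in enumerate(actions, start=1):
        out(f"  {number}. {action}")
    return ask("Proceed? [y/N] ").strip().lower() in ("y", "yes")
```

Defaulting to "no" on an empty answer means an unattended run does nothing rather than silently applying every change.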

Tool Parameter Abuse

High
Category
Tool Misuse
Content
Remove all Her Voice data (config, venvs, compiled binary, daemon state):
```bash
python3 SKILL_DIR/scripts/daemon.py stop
rm -rf ~/.her-voice
```

## How It Works
Confidence
96% confidence
Finding
rm -rf ~

Tool Parameter Abuse

High
Category
Tool Misuse
Content
Remove all Her Voice data (config, venvs, compiled binary, daemon state):
```bash
python3 SKILL_DIR/scripts/daemon.py stop
rm -rf ~/.her-voice
```

## How It Works
Confidence
96% confidence
Finding
rm -rf ~/
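
Both High findings quote rm -rf ~ or rm -rf ~/ as the matched text, while the quoted content shows the full command rm -rf ~/.her-voice. One possible reading, which this report does not confirm, is that the pattern matched a prefix of the command. The sketch below shows how a reviewer might tell the two cases apart by comparing the whole target path. rm_targets_home_itself is a hypothetical helper, the home path is a placeholder, and the parser ignores many shell features such as variables and globs.

```python
import shlex
from pathlib import PurePosixPath


def rm_targets_home_itself(command, home="/home/user"):
    """Return True only if an rm target resolves to the home directory itself."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "rm":
        return False
    # Treat every argument that is not an option flag as a deletion target.
    targets = [t for t in tokens[1:] if not t.startswith("-")]
    for target in targets:
        if target == "~" or target.startswith("~/"):
            target = home + target[1:]  # expand the leading tilde
        if PurePosixPath(target) == PurePosixPath(home):
            return True
    return False
```

Under this check, deleting ~/.her-voice is still destructive for the skill's own data, but it is not the same as deleting the home directory.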

VirusTotal

63/63 vendors flagged this skill as clean.
