Voice (Edge TTS)

Security checks across malware telemetry and agentic risk

Overview

This text-to-speech skill appears purpose-aligned, but its implementation and install flow create host security risks that warrant review before use.

Install only after reviewing or patching the subprocess code, especially the tts and speak paths. Prefer a version that uses spawn or equivalent argument arrays everywhere, pins dependencies, limits playback to files the skill generated, and clearly discloses that text is sent to an external TTS service. Use a sandbox or non-sensitive machine if you cannot patch it.
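The advice to use argument arrays can be illustrated with a minimal Python sketch. The skill's own subprocess code is not shown in this report, so `build_tts_argv` and `echo_single_argument` are hypothetical helpers, and the `edge-tts` flags are assumed from that tool's command-line interface and should be checked against the installed version. The sketch only shows that text passed as a list element reaches the child process as one literal argument, with no shell parsing in between.

```python
import subprocess
import sys


def build_tts_argv(text: str, out_path: str, voice: str = "en-US-AriaNeural") -> list[str]:
    """Hypothetical helper: build the TTS command as a list, never as a shell string.

    The flag names are assumptions about the edge-tts CLI; verify them locally.
    """
    return ["edge-tts", "--voice", voice, "--text", text, "--write-media", out_path]


def echo_single_argument(value: str) -> str:
    """Pass `value` to a child Python process as one argv element and return what it saw."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,  # raise if the child exits with a non-zero status
    )
    return result.stdout.rstrip("\n")


# Text that would break out of a naively quoted shell command string.
hostile = 'hello"; rm -rf ~; echo "'
argv = build_tts_argv(hostile, "speech.mp3")
```

Because no shell is involved, the quotes and semicolons in `hostile` are never interpreted; `echo_single_argument(hostile)` hands the string to the child unchanged.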

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (4)

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The skill exposes an install action that executes `pip3 install edge-tts` via a subprocess at runtime. This expands the skill's authority beyond text-to-speech into system modification, and in an agent context can trigger unreviewed package installation, network access, and environment changes without explicit user approval.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The `play` action accepts an arbitrary `filePath` and passes it to local media players, allowing the skill to access and act on local files unrelated to generated speech. For a voice/TTS skill, this is unnecessary capability expansion: it could be abused to open sensitive or unexpected local resources, or to trigger unwanted audio playback on the host.
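One way to narrow this capability is to accept only paths that resolve inside the directory where the skill writes its own audio. The sketch below is an assumption about how such a check could look, not the skill's code: `resolve_playable`, the `output_dir` argument, and the allowed suffix set are all hypothetical.

```python
from pathlib import Path

# Hypothetical allow-list of audio formats the skill is assumed to generate.
ALLOWED_SUFFIXES = {".mp3", ".wav"}


def resolve_playable(file_path: str, output_dir: Path) -> Path:
    """Return a safe absolute path for playback, or raise if the path is out of scope."""
    base = output_dir.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check;
    # an absolute file_path replaces base entirely and then fails the check below.
    candidate = (base / file_path).resolve()
    if not candidate.is_relative_to(base):  # Path.is_relative_to needs Python 3.9+
        raise PermissionError(f"{file_path!r} is outside the skill's output directory")
    if candidate.suffix.lower() not in ALLOWED_SUFFIXES:
        raise PermissionError(f"{file_path!r} is not an allowed audio type")
    if not candidate.is_file():
        raise FileNotFoundError(file_path)
    return candidate
```

The returned path would then be passed to the media player as a single list argument. Inputs such as `../notes.mp3` or an absolute system path are rejected before any player is started.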

Missing User Warnings

Medium

Confidence: 95%
Finding: Installing dependencies through a subprocess without a prior user-facing warning or confirmation creates a silent side effect with system-wide consequences. In agent environments, this can lead to unauthorized package installation, unexpected network activity, and supply-chain exposure if the package source or version is not tightly controlled.
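A minimal sketch of an install step that addresses this finding might refuse unpinned requirements and show the exact command before running it. Everything here is illustrative: `install_with_consent`, the `confirm` callback, and the injectable `run` parameter are hypothetical names, and the version in the usage note is a placeholder, not a recommended release.

```python
import re
import subprocess
import sys
from typing import Callable

# Accept only exact pins of the form "name==version".
PINNED = re.compile(r"^[A-Za-z0-9._-]+==[A-Za-z0-9.]+$")


def install_with_consent(
    spec: str,
    confirm: Callable[[str], bool],
    run: Callable[..., object] = subprocess.run,
) -> bool:
    """Install `spec` only if it is pinned and the user approves the exact command."""
    if not PINNED.fullmatch(spec):
        raise ValueError(f"refusing unpinned requirement: {spec!r}")
    # Argument list, no shell; uses the interpreter's own pip rather than a bare pip3.
    argv = [sys.executable, "-m", "pip", "install", spec]
    if not confirm("About to run: " + " ".join(argv)):
        return False  # user declined, nothing was executed
    run(argv, check=True)
    return True
```

An interactive caller could pass something like `lambda msg: input(msg + " [y/N] ").strip().lower() == "y"` as `confirm`, with a spec such as `"edge-tts==0.0.0"` (placeholder version). A stricter variant could add pip's hash-checking mode with a requirements file.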

Missing User Warnings

Medium

Confidence: 88%
Finding: The script sends supplied text to the external TTS service used by edge-tts, which can expose sensitive or personal content to a network service without any explicit notice, consent flow, or privacy controls. In a skill context, users may assume synthesis happens locally and may unknowingly transmit secrets, internal data, or regulated information off-host.
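A disclosure and consent check in front of the network call could look roughly like the sketch below. It is an assumption for illustration only: `approve_for_remote_tts`, `ConsentRequired`, and the two patterns are hypothetical, and the pattern list is a crude heuristic that will miss many kinds of sensitive content.

```python
import re

DISCLOSURE = (
    "This text will be sent to an external text-to-speech service to synthesize audio."
)

# Illustrative heuristics only; not a complete secret detector.
SENSITIVE_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"(?i)\b(?:password|api[_-]?key|secret|token)\b\s*[:=]"),
]


class ConsentRequired(Exception):
    """Raised when text would leave the host without recorded user consent."""


def approve_for_remote_tts(text: str, consented: bool, allow_sensitive: bool = False) -> str:
    """Return `text` unchanged if it may be sent off-host, otherwise raise."""
    if not consented:
        # The exception message carries the notice the caller should show the user.
        raise ConsentRequired(DISCLOSURE)
    if not allow_sensitive and any(p.search(text) for p in SENSITIVE_PATTERNS):
        raise ValueError("text looks like it contains a credential; refusing to send")
    return text
```

The calling code would show `DISCLOSURE` once, record the user's answer, and call the synthesis step only with the value this function returns.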

VirusTotal

66 of 66 vendors reported this skill as clean.
