Web TTS Speaker

Security checks across malware telemetry and agentic risk

Overview

The skill appears to do the advertised text-to-speech job, but it can automatically fetch URLs and send page or text content to external TTS and chat channels without enough confirmation or privacy disclosure.

Review before installing if users may paste private text, internal URLs, customer data, or confidential documents. The skill should ideally require an explicit read-aloud request, warn that text may be sent to Edge TTS and audio files written locally, and avoid auto-processing bare URLs.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Behavioral AST: exec() Call, eval() Call, Dynamic Import
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (5)

subprocess module call

Medium

Category: Dangerous Code Execution
Content:
    f.write(f"file '{wav_path}'\n")
    merged_wav = os.path.join(out_dir, f"_merged_{os.getpid()}.wav")
    subprocess.run(
        [FFMPEG, "-y", "-f", "concat", "-safe", "0", "-i", concat_list, "-c", "copy", merged_wav],
        capture_output=True, text=True, check=True,
    )
Confidence: 76%
Finding: subprocess.run([FFMPEG, "-y", "-f", "concat", "-safe", "0", "-i", concat_list, "-c", "copy", merged_wav], capture_output=True, text=True, check=True)
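
For context on this finding: the flagged call passes an argument list rather than a shell string, so the paths are not parsed by a shell. The remaining concern is the concat list file, where a single quote in wav_path would break the quoted `file '...'` line. A minimal sketch of one way to harden that step, assuming ffmpeg's documented quoting rule (close the quote, emit an escaped quote, reopen). The helper names concat_entry and build_concat_command are hypothetical and not taken from the skill, and FFMPEG is assumed to resolve from PATH:

```python
FFMPEG = "ffmpeg"  # assumption: the binary is found on PATH


def concat_entry(wav_path: str) -> str:
    """Return one concat-demuxer line, escaping single quotes in the path."""
    # Inside '...' a quote is written as: close quote, \' , reopen quote.
    escaped = wav_path.replace("'", "'\\''")
    return f"file '{escaped}'\n"


def build_concat_command(concat_list: str, merged_wav: str) -> list[str]:
    """Build the ffmpeg argument list; intended for subprocess.run without shell=True."""
    return [FFMPEG, "-y", "-f", "concat", "-safe", "0",
            "-i", concat_list, "-c", "copy", merged_wav]
```

The command list would then be passed to subprocess.run(..., check=True) as in the flagged snippet. This sketch only addresses quoting; it does not validate where the input paths come from.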

Missing User Warnings

Medium

Confidence: 83%
Finding: The README encourages fetching arbitrary URLs and converting content through Edge TTS without disclosing that page contents may be transmitted to external services and written to local files. In an agent setting, this can lead users to process sensitive internal pages or private text under false assumptions about locality, causing unintended data exposure or persistence.

Vague Triggers

Medium

Confidence: 94%
Finding: The trigger logic is overly permissive because it auto-activates on any pasted URL or very common phrases like 'read this' without a stronger confirmation step. In this skill, activation leads to fetching remote content and sending generated audio through external TTS and messaging components, so unintended invocation can cause unconsented processing of user content and accidental outbound actions.
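
One way a stricter trigger could look, as a rough sketch rather than the skill's actual logic: activate only when the message contains an explicit read-aloud request, and never on a bare URL or on a generic phrase such as 'read this'. The function name should_activate and the accepted phrases are assumptions chosen for illustration:

```python
import re

URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)

# Hypothetical set of explicit requests; a real skill would tune this list.
EXPLICIT_RE = re.compile(
    r"\b(read (this |it )?aloud|text[- ]to[- ]speech|tts)\b",
    re.IGNORECASE,
)


def should_activate(message: str) -> bool:
    """Return True only for an explicit read-aloud request."""
    # Remove URLs so that words inside a link cannot satisfy the trigger.
    remainder = URL_RE.sub("", message).strip()
    if not remainder:
        return False  # bare URL (or empty message): do not auto-process
    return bool(EXPLICIT_RE.search(remainder))
```

With this gate, "https://example.com/article" and "read this" are both ignored, while "Please read this aloud: https://example.com/article" activates the skill.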

Missing User Warnings

Medium

Confidence: 90%
Finding: The skill does not clearly disclose that supplied text or URLs may be fetched, processed by TTS/web components, and then transmitted back through third-party messaging channels. This creates a privacy and consent risk, especially when users may paste sensitive text or internal URLs expecting local-only handling.
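
A disclosure-and-consent step of the kind this finding asks for might be sketched as follows. The wording of the notice, the function names build_disclosure and is_confirmed, and the set of accepted replies are all assumptions for illustration, not part of the skill:

```python
def build_disclosure(source: str, service: str = "Edge TTS") -> str:
    """Compose a notice describing where the content will go before any processing."""
    return (
        f"About to process: {source}\n"
        f"The text will be sent to {service}, an audio file will be written "
        "locally, and the audio will be posted back to this chat channel.\n"
        "Reply 'yes' to continue."
    )


def is_confirmed(reply: str) -> bool:
    """Accept only an unambiguous affirmative; anything else counts as a refusal."""
    return reply.strip().lower() in {"yes", "y"}
```

Treating every reply other than a clear 'yes' as a refusal keeps an ambiguous answer from triggering an outbound transfer.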

Missing User Warnings

Medium

Confidence: 88%
Finding: The skill fetches arbitrary URLs and then sends the extracted text to Edge TTS, which means user-supplied content is transmitted to external services without any explicit privacy warning or consent flow. In an agent setting, this can expose internal URLs, sensitive article contents, or private text to third parties unexpectedly.
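
To reduce the chance of internal URLs being fetched and forwarded, a pre-fetch check along these lines could be applied. This is a minimal sketch using only the Python standard library; the function name is_obviously_internal and the suffix list are assumptions. It does not resolve DNS, so a public hostname that points at a private address would still pass:

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical suffixes that commonly indicate intranet hosts.
INTERNAL_SUFFIXES = (".local", ".internal", ".localhost")


def is_obviously_internal(url: str) -> bool:
    """Return True when the URL clearly targets a local or private host."""
    host = urlsplit(url).hostname
    if host is None:
        return True  # no parseable host: treat as unsafe
    host = host.lower()
    if host == "localhost" or host.endswith(INTERNAL_SUFFIXES):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal; single-label names usually resolve only on an intranet.
        return "." not in host
    return ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved
```

A skill could refuse such URLs outright or require an extra confirmation before fetching them.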

VirusTotal

66/66 vendors flagged this skill as clean.
