Web TTS Speaker

Security checks across malware telemetry and agentic risk

Overview

The skill appears to do the advertised text-to-speech job, but it can automatically fetch URLs and send page or text content to an external TTS service and to chat channels without asking for confirmation or disclosing the privacy implications.

Review before installing if users may paste private text, internal URLs, customer data, or confidential documents. The skill should ideally require an explicit read-aloud request, warn that text may be sent to Edge TTS and audio files written locally, and avoid auto-processing bare URLs.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (5)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
f.write(f"file '{wav_path}'\n")

        merged_wav = os.path.join(out_dir, f"_merged_{os.getpid()}.wav")
        subprocess.run(
            [FFMPEG, "-y", "-f", "concat", "-safe", "0", "-i", concat_list, "-c", "copy", merged_wav],
            capture_output=True, text=True, check=True,
        )
Confidence
76% confidence
Finding
subprocess.run( [FFMPEG, "-y", "-f", "concat", "-safe", "0", "-i", concat_list, "-c", "copy", merged_wav], capture_output=True, text=True, check=True, )
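
The flagged call passes an argument list rather than a shell string, so the remaining exposure is mostly in the concat list file, which `-safe 0` allows to reference any path. The following is a minimal sketch, not the skill's actual code, of how each list entry could be validated and quoted before it is written. The helper name `concat_entry` is hypothetical, and the quoting rule (a literal single quote written as `'\''`) is assumed from ffmpeg's general quoting conventions.

```python
import os


def concat_entry(wav_path: str, out_dir: str) -> str:
    """Return one ffmpeg concat-list line for wav_path (hypothetical helper)."""
    real_out = os.path.realpath(out_dir)
    real_wav = os.path.realpath(wav_path)
    # Refuse files that resolve outside the expected output directory.
    if os.path.commonpath([real_out, real_wav]) != real_out:
        raise ValueError(f"path escapes output directory: {wav_path!r}")
    # Inside single quotes, a literal quote is written as '\'' (assumed rule).
    escaped = real_wav.replace("'", "'\\''")
    return f"file '{escaped}'\n"
```

Keeping the check next to the write means a crafted file name cannot close the quoted path early or point the merge step at a file outside the working directory.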

Missing User Warnings

Medium
Confidence
83% confidence
Finding
The README encourages fetching arbitrary URLs and converting content through Edge TTS without disclosing that page contents may be transmitted to external services and written to local files. In an agent setting, this can lead users to process sensitive internal pages or private text under false assumptions about locality, causing unintended data exposure or persistence.

Vague Triggers

Medium
Confidence
94% confidence
Finding
The trigger logic is overly permissive because it auto-activates on any pasted URL or on very common phrases like 'read this' without a stronger confirmation step. In this skill, activation leads to fetching remote content, sending the text to an external TTS service, and delivering the generated audio through messaging components, so unintended invocation can cause unconsented processing of user content and accidental outbound actions.
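
One way to narrow the trigger is sketched below, under stated assumptions: the phrase list and the names `READ_ALOUD`, `URL_ONLY`, and `should_activate` are illustrative and do not come from the skill. The idea is that a bare URL never activates the skill and that a generic phrase such as 'read this' is not enough on its own.

```python
import re

# Hypothetical phrase list; a real skill would tune this to its users.
READ_ALOUD = re.compile(
    r"\b(read (this |it )?(aloud|out loud)|text[- ]to[- ]speech|tts)\b",
    re.IGNORECASE,
)
# A message consisting of nothing but a single URL.
URL_ONLY = re.compile(r"^\s*https?://\S+\s*$", re.IGNORECASE)


def should_activate(message: str) -> bool:
    """Activate only on an explicit read-aloud request, never on a bare URL."""
    if URL_ONLY.match(message):
        return False
    return READ_ALOUD.search(message) is not None
```

With this sketch, "Please read this aloud: https://example.com" activates the skill, while a pasted link or the phrase "read this" by itself does not.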

Missing User Warnings

Medium
Confidence
90% confidence
Finding
The skill does not clearly disclose that supplied text or URLs may be fetched, processed by TTS/web components, and then transmitted back through third-party messaging channels. This creates a privacy and consent risk, especially when users may paste sensitive text or internal URLs expecting local-only handling.
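
A disclosure and consent step could look roughly like the sketch below. The `SpeakRequest` type, the function names, and the wording of the notice are all hypothetical; the only details taken from this report are that content may be fetched, sent to Edge TTS, written to a local audio file, and returned through a messaging channel.

```python
from dataclasses import dataclass


@dataclass
class SpeakRequest:
    source: str         # pasted text or a URL
    is_url: bool        # True when source should be fetched first
    reply_channel: str  # where the generated audio would be sent


def disclosure_notice(req: SpeakRequest) -> str:
    """Build the message shown to the user before anything leaves the machine."""
    steps = []
    if req.is_url:
        steps.append(f"fetch {req.source}")
    steps.append("send the text to Edge TTS (an external service)")
    steps.append("write an audio file to local disk")
    steps.append(f"send the audio to {req.reply_channel}")
    return "This will: " + "; ".join(steps) + ". Continue? (yes/no)"


def proceed(user_reply: str) -> bool:
    """Treat only an explicit yes as consent."""
    return user_reply.strip().lower() in {"yes", "y"}
```

Anything other than an explicit yes is treated as a refusal, so an ambiguous reply never results in an outbound request.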

Missing User Warnings

Medium
Confidence
88% confidence
Finding
The skill fetches arbitrary URLs and then sends the extracted text to Edge TTS, which means user-supplied content is transmitted to external services without any explicit privacy warning or consent flow. In an agent setting, this can expose internal URLs, sensitive article contents, or private text to third parties unexpectedly.
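
To reduce the chance of internal pages being fetched and forwarded, a skill could refuse obviously internal targets before making any request. The sketch below is an illustration only: `looks_internal` and the suffix list are assumptions, and the check inspects the URL text without resolving DNS, so a public-looking name that resolves to a private address would still pass.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical list of host suffixes treated as internal.
INTERNAL_SUFFIXES = (".local", ".internal", ".localhost")


def looks_internal(url: str) -> bool:
    """Return True when the URL clearly targets a local or private host."""
    host = urlsplit(url).hostname
    if host is None:
        return True  # not a usable absolute URL; refuse by default
    host = host.lower()
    if host == "localhost" or host.endswith(INTERNAL_SUFFIXES):
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A regular host name; DNS could still point somewhere private.
        return False
    return addr.is_private or addr.is_loopback or addr.is_link_local
```

A check like this complements a consent prompt rather than replacing it, since public pages can also contain content the user would not want sent to a third party.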

VirusTotal

66/66 vendors reported this skill as clean.
