Smart Speak Multilingual TTS

Security checks across malware telemetry and agentic risk

Overview

This appears to be a purpose-aligned text-to-speech skill, but users should understand that it runs local commands and may send supplied text to the Edge TTS service.

Install only if you are comfortable running local edge-tts and ffmpeg on text you provide. Verify the hardcoded paths for your environment, keep outputs inside your intended workspace, and avoid using the skill for secrets or sensitive personal content unless you accept the external TTS data handling.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (3)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 92%
Finding: The skill instructs the agent to invoke a Python script via the shell and write an MP3 to a fixed workspace path, but it declares no permissions for shell execution or file writing. This creates a trust and containment gap: reviewers and policy engines may not realize the skill can execute commands and create files, increasing the chance of unsafe invocation or inadequate sandboxing.
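One way to narrow the containment gap described above is to resolve every output path against the intended workspace before anything is written. The following is a minimal sketch, not the skill's actual code; the function name `resolve_output` and the workspace argument are hypothetical.

```python
from pathlib import Path


def resolve_output(workspace: str, filename: str) -> Path:
    """Return an absolute output path, refusing anything outside `workspace`.

    Hypothetical helper: the skill's real script and paths are not shown
    in this report.
    """
    root = Path(workspace).resolve()
    target = (root / filename).resolve()
    try:
        # relative_to raises ValueError when target is not under root.
        target.relative_to(root)
    except ValueError:
        raise ValueError(f"output path escapes workspace: {filename}")
    return target
```

A call such as `resolve_output("/tmp/ws", "speech.mp3")` returns a path inside the workspace, while `"../speech.mp3"` or an absolute path elsewhere raises `ValueError`. This does not replace a permission declaration; it only limits where the MP3 can land.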

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: The skill promises intelligent Pinyin-to-Hanzi conversion, language detection, and safe voice selection, but the described execution path relies on caller-supplied segments and voices without implementing those controls. This mismatch is dangerous because users and orchestrators may trust the skill to normalize mixed-language content safely and accurately, when in reality untrusted input may flow directly into shell-invoked processing and produce misleading or unsafe behavior.
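The missing controls can be illustrated with a small validation step that runs before any process is started. This is a sketch under stated assumptions: the allowlist contents, the function name `build_tts_command`, and the idea that the skill calls the `edge-tts` command line directly are all assumptions, since the report does not show the script. The command is built as an argument list so that no shell ever parses the caller-supplied text.

```python
# Hypothetical allowlist; a real deployment would choose its own voices.
ALLOWED_VOICES = {"en-US-AriaNeural", "zh-CN-XiaoxiaoNeural"}


def build_tts_command(text: str, voice: str, out_path: str) -> list:
    """Validate one segment and return an argv list for edge-tts.

    The list is meant for subprocess.run(cmd) without shell=True, so the
    text is passed as data and never interpreted by a shell.
    """
    if voice not in ALLOWED_VOICES:
        raise ValueError(f"voice not allowed: {voice!r}")
    if not text.strip():
        raise ValueError("empty text segment")
    # The --opt=value form keeps text that starts with '-' from being
    # read as another option.
    return [
        "edge-tts",
        f"--voice={voice}",
        f"--text={text}",
        f"--write-media={out_path}",
    ]
```

Rejecting unknown voices and empty segments here means the caller cannot steer the invocation beyond the three values shown, which is the kind of control the skill description implies but the finding says is absent.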

Missing User Warnings

Medium

Confidence: 86%
Finding: The script forwards user-provided text to an external TTS program, which typically implies transmission to a remote service, without any notice, consent flow, or data-classification guardrails. In a skill context, users may provide sensitive text assuming it is processed locally, so the risk here is one of privacy and data handling, not code execution.
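A consent step of the kind this finding calls for could be as small as the sketch below. It is illustrative only: `confirm_external_send` is a hypothetical name, and the prompt wording and default-deny behavior are assumptions rather than anything the skill implements.

```python
def confirm_external_send(text: str, ask=input) -> bool:
    """Ask before text leaves the machine; anything but y/yes means no.

    `ask` defaults to the built-in input() and can be swapped out by an
    orchestrator or a test.
    """
    preview = text if len(text) <= 60 else text[:57] + "..."
    answer = ask(
        f'Send {len(text)} characters ("{preview}") '
        "to the external TTS service? [y/N] "
    )
    return answer.strip().lower() in {"y", "yes"}
```

Calling this before synthesis and aborting on `False` gives the user the notice the finding says is missing. It does not classify the data; it only makes the external transmission explicit.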

VirusTotal

62/62 vendors flagged this skill as clean.
