Smart Speak Multilingual TTS

Security checks across malware telemetry and agentic risk

Overview

This appears to be a purpose-aligned text-to-speech skill, but users should understand that it runs local commands and may send supplied text to the Edge TTS service.

Install only if you are comfortable running edge-tts and ffmpeg locally on text you provide. Verify the hardcoded paths for your environment, keep outputs inside your intended workspace, and avoid using the skill for secrets or sensitive personal content unless you accept how the external TTS service handles that data.
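
One way to keep outputs inside the intended workspace is to resolve every requested output path against the workspace root before writing. The sketch below is illustrative only and is not taken from the skill: the function name `resolve_output_path` and its parameters are hypothetical, and it assumes Python 3.9 or later.

```python
from pathlib import Path


def resolve_output_path(workspace: str, requested: str) -> Path:
    """Return an absolute path for `requested`, or raise if it leaves `workspace`.

    Hypothetical helper for illustration; not part of the reviewed skill.
    """
    root = Path(workspace).resolve()
    # Joining and then resolving collapses ".." segments and follows symlinks.
    # An absolute `requested` replaces `root` entirely, so it is checked too.
    candidate = (root / requested).resolve()
    if root not in candidate.parents:
        raise ValueError(f"output path escapes workspace: {requested!r}")
    return candidate
```

A caller would pass the returned path to the TTS step and treat `ValueError` as a refused request rather than falling back to a default location.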

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (3)

Lp3

Medium
Category
MCP Least Privilege
Confidence
92% confidence
Finding
The skill instructs the agent to invoke a Python script via the shell and write an MP3 to a fixed workspace path, but it declares no permissions for shell execution or file writing. This creates a trust and containment gap: reviewers and policy engines may not realize the skill can execute commands and create files, increasing the chance of unsafe invocation or inadequate sandboxing.

Tp4

High
Category
MCP Tool Poisoning
Confidence
95% confidence
Finding
The skill promises intelligent Pinyin-to-Hanzi conversion, language detection, and safe voice selection, but the described execution path relies on caller-supplied segments and voices without implementing those controls. This mismatch is dangerous because users and orchestrators may trust the skill to normalize mixed-language content safely and accurately, when in reality untrusted input may flow directly into shell-invoked processing and produce misleading or unsafe behavior.
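
The gap described in this finding can be narrowed by validating the voice against an allowlist and passing the text as a single argument without a shell. The following is a minimal sketch under stated assumptions, not the skill's actual code: `ALLOWED_VOICES`, `build_tts_command`, and `synthesize` are hypothetical names, the two voices are only examples, and it assumes the `edge-tts` command-line tool with its `--text`, `--voice`, and `--write-media` options is installed.

```python
import subprocess

# Hypothetical allowlist for illustration; a real deployment would choose its own voices.
ALLOWED_VOICES = frozenset({"en-US-AriaNeural", "zh-CN-XiaoxiaoNeural"})


def build_tts_command(text: str, voice: str, out_path: str) -> list[str]:
    """Build an argv list for edge-tts, rejecting unknown voices and empty text."""
    if voice not in ALLOWED_VOICES:
        raise ValueError(f"voice not allowed: {voice!r}")
    if not text.strip():
        raise ValueError("text must not be empty")
    # The "--opt=value" form keeps text that begins with "-" from being
    # parsed as a separate option.
    return [
        "edge-tts",
        f"--voice={voice}",
        f"--text={text}",
        f"--write-media={out_path}",
    ]


def synthesize(text: str, voice: str, out_path: str) -> None:
    """Run edge-tts without a shell, so metacharacters in `text` stay literal."""
    cmd = build_tts_command(text, voice, out_path)
    subprocess.run(cmd, check=True, timeout=60)
```

Because the command is a list and no shell is involved, input such as `hello; rm -rf /` reaches the tool as ordinary text to be spoken. This does not address the accuracy of language detection or Pinyin conversion, which the finding also raises.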

Missing User Warnings

Medium
Confidence
86% confidence
Finding
The script forwards user-provided text to an external TTS program, which typically implies transmission to a remote service, without any notice, consent flow, or data-classification guardrails. In a skill context, users may provide sensitive text on the assumption that it is processed locally, so this is a privacy and data-handling risk rather than a code-execution risk.
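
A consent gate of the kind this finding says is missing could look roughly like the sketch below. It is an assumption-laden illustration rather than a prescribed fix: `ConsentRequired`, `REMOTE_TTS_NOTICE`, and `confirm_remote_tts` are hypothetical names, and how consent is actually collected from the user is left to the calling agent.

```python
class ConsentRequired(Exception):
    """Raised when text would leave the machine without user acknowledgement."""


# Hypothetical notice text for illustration.
REMOTE_TTS_NOTICE = (
    "This text will be sent to an external text-to-speech service. "
    "Do not include secrets or sensitive personal content."
)


def confirm_remote_tts(text: str, user_consented: bool) -> str:
    """Return `text` unchanged only if the user has acknowledged the notice."""
    if not user_consented:
        raise ConsentRequired(REMOTE_TTS_NOTICE)
    return text
```

The caller would show `REMOTE_TTS_NOTICE`, record the user's answer, and only then pass the returned text to the synthesis step.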

VirusTotal

62/62 vendors flagged this skill as clean.
