Text To Speech

Security checks across malware telemetry and agentic risk

Overview

This text-to-speech skill is coherent and purpose-aligned, with normal cautions around installing a remote CLI and sending text or media inputs to a hosted inference service.

Install only if you trust inference.sh and the infsh CLI. Prefer manual install with checksum verification if you want stronger assurance, and avoid submitting confidential text, regulated data, private URLs, or cloned voices without permission and an acceptable data-handling arrangement.
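The manual checksum verification recommended above could look roughly like the following sketch. It assumes you have already downloaded the installer or binary yourself and obtained a SHA-256 digest from a source you trust. The function names are hypothetical, and nothing here reflects how inference.sh actually publishes its checksums.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read fixed-size chunks so large downloads are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with a published digest (hypothetical helper)."""
    actual = sha256_of_file(path)
    # Normalize whitespace and case in the published value before comparing.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

Run the downloaded file only if `checksum_matches(...)` returns `True`. A mismatch means the file differs from the one the digest was published for.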

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Findings (2)

Vague Triggers

Medium
Confidence: 84%
Finding
The trigger list contains broad phrases such as 'natural voice', 'voice ai', and 'generate speech' that may match ordinary user requests too aggressively. Over-broad invocation can cause the wrong skill to activate, increasing the chance that user content is sent to external services unexpectedly or that the agent performs unintended remote actions.
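To illustrate why such phrases are risky, here is a minimal sketch that assumes a naive, case-insensitive substring matcher. The report does not describe how the agent actually matches triggers, so `matched_triggers` is a hypothetical stand-in rather than the real mechanism. Only the three trigger phrases are taken from the finding.

```python
# Trigger phrases quoted in the finding above.
TRIGGERS = ["natural voice", "voice ai", "generate speech"]


def matched_triggers(request: str, triggers=TRIGGERS) -> list[str]:
    """Return triggers found as substrings of the request (hypothetical matcher)."""
    text = request.lower()
    return [t for t in triggers if t in text]


# Requests that have nothing to do with text-to-speech can still match.
print(matched_triggers("Describe the narrator's natural voice in this novel"))
print(matched_triggers("Summarize this article about voice AI regulation"))
print(matched_triggers("What's the weather today?"))
```

Under this assumption the first two requests match even though neither asks for speech synthesis, which is the kind of unintended activation the finding describes.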

Missing User Warnings

Medium
Confidence: 91%
Finding
The examples encourage users to submit text and, in later examples, media URLs to inference.sh-hosted apps, but the skill does not prominently warn that this data leaves the local environment and is processed by a remote service. In a TTS context, inputs may contain sensitive scripts, PII, proprietary copy, or private media links, so the lack of disclosure raises privacy and data-handling risk.

VirusTotal

64/64 vendors flagged this skill as clean.
