Text To Speech

Security checks across malware telemetry and agentic risk

Overview

This text-to-speech skill is coherent and purpose-aligned, with normal cautions around installing a remote CLI and sending text or media inputs to a hosted inference service.

Install only if you trust inference.sh and the infsh CLI. Prefer manual install with checksum verification if you want stronger assurance, and avoid submitting confidential text, regulated data, private URLs, or cloned voices without permission and an acceptable data-handling arrangement.
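As an illustration of the checksum step, the following is a minimal sketch, assuming the publisher provides a SHA-256 digest for the downloaded installer. The function names and the idea of a locally saved installer file are hypothetical and not taken from the inference.sh documentation.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected_hex):
    """Compare the file's digest with a published value (hypothetical input).

    The expected value is normalized for whitespace and letter case
    before a constant-time comparison.
    """
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

Run the installer only when `matches_published_checksum` returns `True`. A mismatch means the download differs from what the publisher hashed.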

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (2)

Vague Triggers

Medium

Confidence: 84%
Finding: The trigger list contains broad phrases such as 'natural voice', 'voice ai', and 'generate speech' that may match ordinary user requests too aggressively. Over-broad invocation can cause the wrong skill to activate, increasing the chance that user content is sent to external services unexpectedly or that the agent performs unintended remote actions.

Missing User Warnings

Medium

Confidence: 91%
Finding: The examples encourage users to submit text and, in later examples, media URLs to inference.sh-hosted apps, but the skill does not prominently warn that this data leaves the local environment and is processed by a remote service. In a TTS context, inputs may contain sensitive scripts, PII, proprietary copy, or private media links, so the lack of disclosure raises privacy and data-handling risk.
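One way to reduce this risk on the caller's side is to screen input locally before it is submitted. The following is a minimal sketch, assuming a simple pattern-based check. The function names, regular expressions, and host suffixes are illustrative assumptions, not part of the skill or the infsh CLI, and such a check will miss many kinds of sensitive content.

```python
import ipaddress
import re
from urllib.parse import urlparse

# Deliberately simple patterns; real PII detection needs far more than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"https?://[^\s]+")


def is_private_url(url):
    """Return True if the URL's host looks internal (assumed heuristics)."""
    host = urlparse(url).hostname or ""
    if host == "localhost" or host.endswith((".local", ".internal")):
        return True
    try:
        return ipaddress.ip_address(host).is_private
    except ValueError:
        # Host is a public-looking name, not an IP literal.
        return False


def screen_before_upload(text):
    """Return a list of warnings for content that may not be safe to send."""
    warnings = []
    if EMAIL_RE.search(text):
        warnings.append("email address")
    for url in URL_RE.findall(text):
        if is_private_url(url):
            warnings.append("private URL: " + url)
    return warnings
```

An empty list means only that none of these patterns matched. The user still has to decide whether the text is appropriate to send to a remote service.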

VirusTotal

64/64 vendors flagged this skill as clean.
