qui-edge-tts

Security checks across malware telemetry and agentic risk

Overview

This is a disclosed text-to-speech skill that uses an API key and sends requested text to an external TTS service, with no evidence of hidden or unrelated behavior.

Install only if you trust the SkillBoss/HeyBoss TTS service with the text you ask it to speak. Use a scoped API key, keep it out of committed files and shared shell history where possible, rotate it if exposed, and clean up generated audio or subtitle files when they may contain private content.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (4)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 91%
Finding: The skill uses sensitive capabilities beyond what it explicitly declares: it requires an environment secret (`SKILLBOSS_API_KEY`) and performs network access to an external TTS service, but those capabilities are not formally declared as permissions. This creates a transparency and policy-enforcement gap, making it easier for the skill to access secrets or exfiltrate user-provided text without reviewers or runtime controls clearly understanding its privilege needs.

Missing User Warnings

Medium

Confidence: 88%
Finding: The document tells users to export a required API key but provides no guidance on secure handling, storage, or rotation of that credential. In practice, this can lead to accidental exposure through shell history, shared terminals, screenshots, checked-in config files, or misuse in multi-user environments, resulting in unauthorized API usage and billing or service abuse.
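One way to reduce the exposure described in this finding is to read the key only from the environment and to mask it anywhere it might be logged. The following is a minimal sketch, not the skill's actual code: the variable name `SKILLBOSS_API_KEY` comes from the findings, while `load_api_key` and `mask` are hypothetical helper names introduced for illustration.

```python
import os

ENV_VAR = "SKILLBOSS_API_KEY"  # name taken from the scan findings


def load_api_key(env=None):
    """Return the key from the environment, or raise without echoing any secret."""
    env = os.environ if env is None else env
    key = env.get(ENV_VAR, "").strip()
    if not key:
        # The message names the variable only; it never includes a key value.
        raise RuntimeError(f"{ENV_VAR} is not set; export it in a private shell session")
    return key


def mask(key, visible=4):
    """Return a log-safe form that shows only the last `visible` characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Keeping the key out of source files and printing only `mask(key)` in diagnostics limits what a screenshot, shared terminal, or committed log can reveal. Rotation still has to happen on the service side if the key is exposed.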

Vague Triggers

Medium

Confidence: 83%
Finding: The activation guidance is broad enough that the skill may trigger on generic mentions of audio, accessibility, multitasking, or the keyword "tts," causing text to be sent to an external speech service in contexts the user did not intend. In a skill that performs network-backed processing of potentially sensitive text, ambiguous triggering increases the risk of accidental disclosure and unexpected behavior.
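To illustrate what narrower activation could look like, the sketch below triggers only on explicit requests for speech output rather than on any mention of audio or "tts". The patterns and the `should_trigger` function are assumptions made for this example; the skill's real trigger mechanism is not shown in this report.

```python
import re

# Hypothetical explicit-intent patterns; tune these to the phrases users actually use.
EXPLICIT_PATTERNS = [
    re.compile(r"\bread (this|that|it) (aloud|out loud)\b", re.IGNORECASE),
    re.compile(r"\b(convert|turn) .+ (to|into) (speech|audio)\b", re.IGNORECASE),
    re.compile(r"^\s*/tts\b", re.IGNORECASE),  # explicit command form
]


def should_trigger(message):
    """Return True only when the message explicitly asks for speech output."""
    return any(pattern.search(message) for pattern in EXPLICIT_PATTERNS)
```

Under these assumed patterns, "Please read this aloud" and "/tts hello world" activate the skill, while "What does tts stand for?" and a passing mention of listening to audio do not, so no text leaves the machine in those cases.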

Missing User Warnings

Low

Confidence: 93%
Finding: The guide documents sending arbitrary user text to Microsoft Edge's online TTS service and saving generated media/subtitles locally, but it does not clearly warn that prompts may leave the local environment and may contain sensitive or regulated data. In a bot/agent context, this can lead to accidental disclosure of private user content, especially because subtitle JSON and audio files create additional persisted copies of the data.
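The persisted-copy risk in this finding can be reduced by writing generated audio and subtitle files to a temporary directory that is deleted once playback or delivery finishes. This is a minimal sketch using only the Python standard library; `private_output_dir` is a hypothetical helper, and the file name in the usage example is a placeholder rather than the skill's real output path.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def private_output_dir():
    """Yield a temporary directory and delete it, with its contents, on exit."""
    # TemporaryDirectory creates the directory via mkdtemp, so on POSIX systems
    # it is accessible only to the creating user.
    with tempfile.TemporaryDirectory(prefix="tts-") as tmp:
        yield Path(tmp)


if __name__ == "__main__":
    with private_output_dir() as out:
        audio = out / "speech.mp3"          # placeholder file name
        audio.write_bytes(b"placeholder")   # stands in for the generated audio
        # ... play or deliver the file here ...
    # At this point the directory and everything inside it have been removed.
```

This only addresses the local copies. The text itself is still sent to the external TTS service, so sensitive or regulated content should not be submitted in the first place.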

VirusTotal

65 of 65 vendors reported this skill as clean.
