Qwen3-TTS Voice Synthesis

Security checks across static analysis, malware telemetry, and agentic risk

Overview

This appears to be a real TTS skill, but it needs review because text can leave the machine through a default cloud fallback or a redirected ComfyUI endpoint.

Install only if you are comfortable with possible off-device text processing. Keep COMFYUI_URL set to localhost or another trusted ComfyUI server, avoid sensitive text unless you disable fallback with --fallback-edge false, and review or install the referenced tts-cosyvoice dependency before relying on fallback output.
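
The advice to keep COMFYUI_URL on a trusted host can be enforced before any request is made. The following is a minimal sketch, not the skill's own code: the helper names is_trusted_comfyui_url and load_comfyui_url, the TRUSTED_HOSTS allowlist, and the default URL are all assumptions chosen for illustration.

```python
import os
from urllib.parse import urlparse

# Assumption: only loopback hosts are trusted. Extend this set with any
# ComfyUI server you control and trust.
TRUSTED_HOSTS = {"localhost", "127.0.0.1", "::1"}

# Assumption: a local ComfyUI instance at this address. Adjust as needed.
DEFAULT_COMFYUI_URL = "http://127.0.0.1:8188"


def is_trusted_comfyui_url(url, trusted_hosts=TRUSTED_HOSTS):
    """Return True only for http(s) URLs whose host is in the allowlist."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    # hostname is lowercased and has IPv6 brackets and the port removed.
    return parsed.hostname in trusted_hosts


def load_comfyui_url(env=os.environ):
    """Read COMFYUI_URL and refuse to continue if it points elsewhere."""
    url = env.get("COMFYUI_URL", DEFAULT_COMFYUI_URL)
    if not is_trusted_comfyui_url(url):
        raise ValueError(f"Refusing untrusted COMFYUI_URL: {url!r}")
    return url.rstrip("/")
```

Comparing the parsed hostname against an exact allowlist, rather than checking whether the string starts with "http://localhost", avoids accepting look-alike hosts such as localhost.evil.example.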

SkillSpector (3)

By NVIDIA

Tainted flow: 'req' from os.environ.get (line 179, credential/environment) → urllib.request.urlopen (network output)

Severity: Critical
Category: Data Flow
Content:
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            result = json.loads(resp.read())
            return result.get("prompt_id")
    except (urllib.error.URLError, urllib.error.HTTPError) as e:
Confidence: 83%
Finding:
    with urllib.request.urlopen(req, timeout=30) as resp:

Tainted flow: 'url' from os.environ.get (line 198, credential/environment) → urllib.request.urlopen (network output)

Severity: Critical
Category: Data Flow
Content:
    while time.time() - start < timeout:
        try:
            url = f"{COMFYUI_URL}/history/{prompt_id}"
            with urllib.request.urlopen(url, timeout=10) as resp:
                history = json.loads(resp.read())
                if prompt_id in history:
                    status = history[prompt_id].get("status", {})
Confidence: 83%
Finding:
    with urllib.request.urlopen(url, timeout=10) as resp:

Missing User Warnings

Severity: Medium
Confidence: 97%
Finding: The skill says it is a local TTS solution but does not prominently warn that failures trigger fallback to Edge TTS, which may send user-provided text off-device to a cloud service. This creates a meaningful privacy risk, especially for sensitive prompts, dialogue scripts, or cloned-voice use cases where users may reasonably expect local-only processing.
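
One way the missing warning could be surfaced is to gate the fallback behind an explicit check that always tells the user what happened. This is a hedged sketch only: confirm_cloud_fallback and its parameters are hypothetical names, and it does not reflect how the skill actually implements its fallback or its --fallback-edge option.

```python
import sys


def confirm_cloud_fallback(fallback_enabled, reason, stream=sys.stderr):
    """Decide whether to fall back to a cloud TTS service, and say so.

    fallback_enabled: hypothetical boolean derived from the user's settings.
    reason: short description of why local synthesis failed.
    Returns True only when the caller may send text off-device.
    """
    if not fallback_enabled:
        print(
            f"Local synthesis failed ({reason}); cloud fallback is disabled, "
            "so no text was sent off-device.",
            file=stream,
        )
        return False
    print(
        f"WARNING: local synthesis failed ({reason}); the input text will be "
        "sent to a cloud TTS service.",
        file=stream,
    )
    return True
```

A caller would invoke this immediately before the fallback request and skip the request when it returns False, so the off-device path is never taken silently.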

Static analysis

No static analysis findings were reported for this release.

VirusTotal

65/65 vendors flagged this skill as clean.
