UGC Manual

Security checks across malware telemetry and agentic risk

Overview

This skill performs the advertised lip-sync video workflow, but users should understand that selected face images and audio are sent to ComfyDeploy.

Install only if you are comfortable sending the chosen image and audio to ComfyDeploy. Use a limited ComfyDeploy API key if possible, avoid highly sensitive or non-consensual face and voice media, install ffmpeg from a trusted source, and prefer local trusted files or trusted URLs because remote audio is downloaded and decoded locally.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (7)

Lp3

Medium
Category
MCP Least Privilege
Confidence
85% confidence
Finding
The skill documentation exposes capabilities that imply shell execution, network access, and possible environment usage, but it declares no permissions or equivalent user-visible disclosure. That mismatch weakens trust boundaries and can cause an agent or user to invoke the skill without understanding that local commands and remote calls may occur.

Intent-Code Divergence

Medium
Confidence
88% confidence
Finding
The manifest says the skill is for a user's own audio recording, but later sections broaden use to externally generated TTS and even arbitrary audio. This inconsistency can bypass routing or policy decisions that rely on the manifest, causing the skill to be used on inputs with different privacy, consent, or copyright risk than initially disclosed.

Description-Behavior Mismatch

Medium
Confidence
84% confidence
Finding
The skill metadata frames inputs as user-provided files, but the code also accepts arbitrary remote image and audio URLs. That mismatch expands the trust boundary and enables server-side fetching of attacker-chosen resources, which can be abused for SSRF-like access, unexpected data retrieval, or processing untrusted remote media without clear user consent.
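One way to narrow this mismatch is to treat remote URLs as an explicit opt-in rather than a silent alternative to local files. The sketch below is illustrative only and is not taken from the skill's code: the function name `classify_media_input` and the `allow_remote` flag are hypothetical.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_media_input(value: str, allow_remote: bool = False) -> str:
    """Return "local" or "remote"; raise ValueError for anything not permitted.

    Hypothetical helper: the real skill may structure its inputs differently.
    """
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https"):
        # Remote media is only accepted when the caller opted in explicitly.
        if not allow_remote:
            raise ValueError(f"remote URL not permitted without opt-in: {value}")
        return "remote"
    if len(parsed.scheme) > 1:
        # Any other scheme (file:, ftp:, data:, ...) is rejected outright.
        # Single-letter "schemes" are treated as Windows drive letters.
        raise ValueError(f"unsupported scheme: {parsed.scheme}")
    if not Path(value).is_file():
        raise ValueError(f"not an existing local file: {value}")
    return "local"
```

With a default of `allow_remote=False`, behavior matches metadata that describes inputs as user-provided files, and any remote fetch becomes a visible decision by the caller.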

Context-Inappropriate Capability

Medium
Confidence
90% confidence
Finding
The script downloads arbitrary remote audio URLs and then processes them locally with ffmpeg, adding a network-fetch capability beyond the stated purpose of using user-provided media. This increases exposure to SSRF, oversized download abuse, and malicious media parsing risks because attacker-controlled content is fetched and handed to a complex decoder.
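The SSRF and oversized-download risks named here are commonly reduced by checking where a URL resolves before fetching it and by capping the bytes read. The following is a minimal sketch under stated assumptions, not the skill's implementation: `is_safe_remote_url`, `resolve_addresses`, and `read_capped` are hypothetical names, and the HTTPS-only policy is an assumption. It does not handle redirects or DNS rebinding between the check and the fetch.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def resolve_addresses(host: str) -> list:
    """Resolve a hostname to its IP address strings using the system resolver."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_safe_remote_url(url: str, resolver=resolve_addresses) -> bool:
    """Accept only https URLs whose host resolves solely to global addresses."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.hostname:
        return False
    try:
        addresses = resolver(parsed.hostname)
    except OSError:
        return False
    if not addresses:
        return False
    for addr in addresses:
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            return False
        # Rejects loopback, private, link-local, and other non-global ranges.
        if not ip.is_global:
            return False
    return True


def read_capped(stream, max_bytes: int, chunk_size: int = 64 * 1024) -> bytes:
    """Read a file-like object in chunks, failing once max_bytes is exceeded."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise ValueError("download exceeds size limit")
        chunks.append(chunk)
    return b"".join(chunks)
```

The resolver is injectable so the check can be exercised without network access. Neither function addresses malicious media parsing, so the advice to install ffmpeg from a trusted source still applies.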

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The skill instructs users to provide images and audio that are sent to a third-party API, but it gives no user-facing warning that sensitive biometric-like face data and voice recordings leave the local environment. In this context, the omission is significant because the inputs are personally identifying media and may contain private or regulated content.
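A user-facing warning of the kind this finding asks for could be as small as a confirmation step before any upload. The sketch below assumes a command-line context. `build_upload_notice` and `confirm_upload` are hypothetical names, and the wording of the notice is illustrative rather than the skill's own text.

```python
def build_upload_notice(files, destination: str) -> str:
    """Compose a plain-text notice listing what will leave the local machine."""
    lines = [f"The following files will be uploaded to {destination}:"]
    lines += [f"  - {name}" for name in files]
    lines.append("Face images and voice recordings can identify a person.")
    return "\n".join(lines)


def confirm_upload(files, destination: str, ask=input) -> bool:
    """Show the notice and return True only on an explicit yes."""
    print(build_upload_notice(files, destination))
    answer = ask("Proceed with upload? [y/N] ")
    return answer.strip().lower() in ("y", "yes")
```

Defaulting to "no" on an empty answer means nothing is uploaded unless the user agrees explicitly.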

Missing User Warnings

Medium
Confidence
93% confidence
Finding
The skill uploads user image and audio content to a third-party service, but the provided skill description does not clearly warn about external transmission or data handling implications. Because the inputs may contain biometric voice data and personal imagery, undisclosed transfer to an external processor creates privacy and compliance risk even if the transmission is functionally necessary.

External Transmission

Medium
Category
Data Exfiltration
Content
## API Details

**Endpoint:** `https://api.comfydeploy.com/api/run/deployment/queue`
**Deployment ID:** `075ce7d3-81a6-4e3e-ab0e-7a25edf601b5`

## Required Inputs
Confidence
80% confidence
Finding
https://api.comfydeploy.com/

VirusTotal

All 66 vendors reported this skill as clean.
