Flyworks Avatar Video

Security checks across malware telemetry and agentic risk

Overview

The skill does what it says, but it handles face and voice media in ways that need careful review before use.

Review before installing. Use your own Flyworks/HiFly token, treat it as a secret, and assume any local image or audio path you provide may be uploaded for remote processing. Only upload portraits and voice samples you own or have explicit permission to use, and check the provider's retention, deletion, and acceptable-use terms before processing sensitive media.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (9)

Tainted flow: 'upload_url' from requests.post (line 137, network input) → requests.put (network output)

Medium

Category: Data Flow
Content:
    print(f"Uploading {file_path.name}...")
    with open(file_path, 'rb') as f:
        headers_put = {"Content-Type": content_type}
        resp_put = requests.put(upload_url, data=f, headers=headers_put)
        resp_put.raise_for_status()
    print(f"Upload successful. File ID: {file_id}")
Confidence: 95%
Finding: resp_put = requests.put(upload_url, data=f, headers=headers_put)
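
One way to narrow this flow is to validate the server-supplied upload URL before sending any file bytes to it. The sketch below is illustrative only: the function name `is_trusted_upload_url` and the allow-list entry `uploads.example.com` are assumptions, not part of the skill, and the real upload hosts would need to be confirmed with the provider.

```python
from urllib.parse import urlparse

# Hypothetical allow-list. The hosts the service actually uses for uploads
# are not stated in this report and must be confirmed before relying on this.
ALLOWED_UPLOAD_HOSTS = frozenset({"uploads.example.com"})


def is_trusted_upload_url(upload_url: str,
                          allowed_hosts=ALLOWED_UPLOAD_HOSTS) -> bool:
    """Return True only for https URLs whose host is on the allow-list."""
    parsed = urlparse(upload_url)
    # parsed.hostname ignores any userinfo ("user@host"), so a URL such as
    # https://uploads.example.com@other.host/ is judged by "other.host".
    return parsed.scheme == "https" and parsed.hostname in allowed_hosts
```

A caller would run this check on `upload_url` and skip the `requests.put` call when it returns False.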

Lp3

Medium

Category: MCP Least Privilege
Confidence: 92%
Finding: The skill documentation indicates it uses environment variables, local file reads/writes, and network access, but the skill does not declare permissions or capabilities explicitly. This can mislead agents and users about the trust boundary, causing them to expose secrets such as API tokens, access local files, or make outbound requests without informed consent or proper policy enforcement.

Context-Inappropriate Capability

Medium

Confidence: 99%
Finding: The script embeds a working default API token and silently uses it whenever the user has not configured their own credential. Hardcoded credentials weaken access control, normalize unaudited outbound network use, and may let anyone invoke the vendor API under a shared account without explicit user setup or attribution.
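
A safer pattern is to require a user-supplied token and fail loudly when none is configured. The following is a minimal sketch under assumptions: the environment variable name `HIFLY_API_TOKEN` and the helper `load_api_token` are hypothetical and are not taken from the skill's code.

```python
import os

# Hypothetical variable name; the name the skill actually reads is not
# given in this report.
TOKEN_ENV_VAR = "HIFLY_API_TOKEN"


def load_api_token(env=None) -> str:
    """Return the user's token, or raise instead of using a shared default."""
    env = os.environ if env is None else env
    token = (env.get(TOKEN_ENV_VAR) or "").strip()
    if not token:
        raise RuntimeError(
            f"{TOKEN_ENV_VAR} is not set; refusing to fall back to a "
            "built-in default token."
        )
    return token
```

Passing `env` explicitly keeps the function easy to test; in normal use it reads from `os.environ`.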

Missing User Warnings

Medium

Confidence: 88%
Finding: The README promotes voice cloning and talking-photo generation using personal audio and images without any warning about consent, privacy, impersonation, or biometric-data sensitivity. In a skill specifically designed to upload and transform likeness and voice data, omission of these safeguards increases the risk of non-consensual deepfake creation and unsafe handling of sensitive personal media.

Missing User Warnings

Low

Confidence: 76%
Finding: The README tells users to export an API token but does not mention that the token is a secret, should not be committed to source control, or should be stored securely. While this is common setup guidance, lack of handling warnings can lead to accidental credential exposure in shells, logs, screenshots, or repositories.
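
Beyond documentation warnings, scripts can reduce accidental exposure by never printing a token in full. This is a small illustrative helper, not something the skill ships; the name `mask_secret` is hypothetical.

```python
def mask_secret(value: str, visible: int = 4) -> str:
    """Return a log-safe form of a secret, showing at most the last few chars."""
    # Short values are fully masked so the suffix cannot reveal most of them.
    if len(value) <= visible * 2:
        return "****"
    return "****" + value[-visible:]
```

For example, a debug line could log `mask_secret(token)` instead of the token itself.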

Missing User Warnings

Medium

Confidence: 93%
Finding: The documentation encourages creating talking-photo avatars from arbitrary portrait images, including remote URLs, but provides no warning to obtain consent, respect biometric/privacy rights, or avoid uploading sensitive personal photos. In a digital-human/video-generation context, this omission materially increases the risk of unauthorized likeness use, impersonation, and privacy violations because users are directly guided to transform real people’s images into speaking avatars.

Missing User Warnings

Medium

Confidence: 91%
Finding: The documentation instructs users to upload pre-recorded audio to Flyworks/HiFly, an external service, but provides no warning that voice data and related content will be transmitted to a third party. Because audio may contain biometric voice data, personal information, or confidential material, this omission can cause users to unknowingly expose sensitive data to an external processor.

Missing User Warnings

Medium

Confidence: 91%
Finding: The skill provides operational instructions for cloning a voice from local files or remote URLs but omits any warning about consent, biometric privacy, ownership, or the fact that audio may be transmitted to a third-party service. Because voiceprints are sensitive biometric-like data and cloned voices can be misused for impersonation or fraud, presenting this workflow without safeguards increases the chance of unsafe or unauthorized use.

Missing User Warnings

Medium

Confidence: 93%
Finding: When a user supplies a local image or audio path, the script uploads that file to a remote service automatically, with no explicit warning or consent step at the point of transmission. In an agent-skill context, this is more dangerous because users may assume local-path handling is local-only, leading to unintended disclosure of sensitive media.
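
An explicit confirmation step at the point of transmission is one possible mitigation. The sketch below assumes an interactive session; `confirm_upload` and its parameters are hypothetical names used for illustration, and an agent runtime might need a different consent mechanism.

```python
from pathlib import Path
from typing import Callable


def confirm_upload(file_path: Path, destination: str,
                   ask: Callable[[str], str] = input) -> bool:
    """Ask before a local file leaves the machine; default answer is no."""
    prompt = (
        f"'{file_path.name}' will be uploaded to {destination} "
        "for remote processing. Continue? [y/N] "
    )
    # Anything other than an explicit yes is treated as a refusal.
    return ask(prompt).strip().lower() in {"y", "yes"}
```

The upload code would call this before opening the file and stop when it returns False. The `ask` parameter exists so the prompt can be replaced in tests or non-interactive runs.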

VirusTotal

66 of 66 vendors reported this skill as clean.
