digital-human-training

Security checks across malware telemetry and agentic risk

Overview

This is a documentation-only guide for building digital humans, but it involves sensitive voice and face data that users should handle carefully.

Install only if you are comfortable receiving guidance on voice and likeness cloning. Use recordings and videos only from yourself or people who gave clear informed consent, keep generated model files private, delete unused source media, review any cloud provider's privacy and retention terms, and disclose AI-generated avatars when others may interact with them.
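The advice to delete unused source media can be sketched as a small cleanup helper. This is an illustrative sketch only, not part of the skill or the scan: the names `MEDIA_EXTENSIONS`, `find_unused_media`, and `delete_unused_media`, the extension list, and the idea of a "keep" list of file names are all assumptions. It defaults to a dry run so nothing is removed until the list has been reviewed.

```python
from pathlib import Path

# Hypothetical set of extensions treated as source media; adjust to your own data.
MEDIA_EXTENSIONS = {".wav", ".mp3", ".mp4", ".mov"}


def find_unused_media(source_dir, keep):
    """Return media files under source_dir whose file names are not in keep."""
    keep_names = set(keep)
    return sorted(
        p for p in Path(source_dir).rglob("*")
        if p.is_file()
        and p.suffix.lower() in MEDIA_EXTENSIONS
        and p.name not in keep_names
    )


def delete_unused_media(source_dir, keep, dry_run=True):
    """List unused media files; delete them only when dry_run is False."""
    unused = find_unused_media(source_dir, keep)
    if not dry_run:
        for path in unused:
            # unlink() removes the directory entry; it is not a secure erase.
            path.unlink()
    return unused
```

Calling `delete_unused_media("recordings", ["final_take.wav"])` would only report candidates; passing `dry_run=False` removes them. Because `unlink()` does not overwrite data on disk, sensitive recordings may need additional handling such as encrypted storage.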

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep

Findings (3)

Missing User Warnings

Medium

Confidence: 93%
Finding: The skill explicitly guides users through voice cloning and digital-human training workflows but omits any warning about consent, identity misuse, or privacy handling of source audio/video. In this context, that omission is security-relevant because cloned voices and face/voice training data can directly enable impersonation, fraud, and misuse of biometric data.

Missing User Warnings

Medium

Confidence: 90%
Finding: The deployment section discusses cloud APIs and end-to-end training/deployment flows without warning that sensitive audio/video may be uploaded to third-party services. This is dangerous because users may unknowingly transmit biometric, personal, or confidential media to external providers without assessing retention, sharing, jurisdiction, or security practices.

Missing User Warnings

Medium

Confidence: 93%
Finding: The document instructs collection of a user's voice recording and face video to build a cloned voice and lip-synced digital human, but it omits any warning about consent, biometric data handling, retention, or misuse risks. In this specific skill context, the omission is more dangerous because the workflow directly enables realistic impersonation and processing of sensitive biometric identifiers, increasing the chance of privacy harm, fraud, or unauthorized identity simulation.

VirusTotal

66/66 vendors flagged this skill as clean.
