Generator From Audio

Security checks across malware telemetry and agentic risk

Overview

This audio-to-video skill is mostly coherent, but it can automatically create a third-party cloud session and route broad prompts or URL uploads to that service with limited user control.

Install only if you are comfortable with Nemovideo receiving your prompts, audio files, and any URLs you provide for processing. Avoid sensitive recordings, prefer setting your own NEMO_TOKEN, and confirm before uploads, URL imports, exports, or vague edit requests.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (4)

Context-Inappropriate Capability

Medium

Confidence: 84%
Finding: Allowing users to import media from arbitrary remote URLs expands the attack surface beyond uploaded local audio and is not clearly disclosed by the skill's stated purpose. This can enable the backend to fetch attacker-controlled URLs, creating SSRF-style risk, internal network probing, or ingestion of unexpected content through a feature users would not reasonably anticipate from an audio-to-video skill.
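One common way to narrow this kind of SSRF-style exposure is to validate a URL before it is handed to any fetcher. The sketch below is illustrative only and is not taken from the skill or from Nemovideo: the function names, the HTTPS-only policy, and the injectable `resolve` parameter are all assumptions. It rejects non-HTTPS schemes and any host that resolves to a non-public address.

```python
import ipaddress
import socket
from urllib.parse import urlparse

# Assumption: only HTTPS imports are acceptable; adjust to the real policy.
ALLOWED_SCHEMES = {"https"}


def is_public_address(ip_text: str) -> bool:
    """Return True only for globally routable addresses."""
    try:
        return ipaddress.ip_address(ip_text).is_global
    except ValueError:
        # Unparseable address text is treated as unsafe.
        return False


def check_import_url(url: str, resolve=socket.getaddrinfo) -> bool:
    """Hypothetical pre-flight check for a user-supplied media URL.

    `resolve` is injectable so the check can be exercised without network
    access; by default it is the standard library's socket.getaddrinfo.
    """
    try:
        parts = urlparse(url)
        port = parts.port or 443
    except ValueError:
        return False
    if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname:
        return False
    try:
        infos = resolve(parts.hostname, port, proto=socket.IPPROTO_TCP)
    except OSError:
        return False
    # Every resolved address must be public; one private answer is enough to refuse.
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and all(is_public_address(a) for a in addresses)
```

A check like this does not by itself prevent DNS rebinding: the fetch would also need to connect to the same address that was validated, and redirects would need the same treatment.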

Vague Triggers

Medium

Confidence: 88%
Finding: Routing 'Everything else' to the SSE action means nearly any unmatched user request can trigger backend processing, making accidental invocation and unintended data transmission more likely. In a skill that connects automatically to a cloud service and sends free-form prompts to a remote API, overly broad matching increases the chance that unrelated conversation or sensitive text is forwarded off-platform.
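As a contrast to a catch-all route, the following sketch shows explicit intent matching in which unmatched text is never forwarded. The intent names, keywords, and `Route` type are hypothetical and do not reflect the skill's actual routing table; the point is only that the default branch asks the user instead of calling the backend.

```python
from dataclasses import dataclass

# Hypothetical intents: action name -> (trigger keywords, confirm before acting).
KNOWN_INTENTS = {
    "upload": (("upload",), True),
    "export": (("export", "download"), True),
    "status": (("status", "progress"), False),
}


@dataclass(frozen=True)
class Route:
    action: str
    needs_confirmation: bool


def route_request(text: str) -> Route:
    """Map a user message to an action, refusing to guess on unmatched text."""
    lowered = text.lower()
    for action, (keywords, confirm) in KNOWN_INTENTS.items():
        if any(keyword in lowered for keyword in keywords):
            return Route(action, confirm)
    # Default: do not send anything off-platform; ask the user what they meant.
    return Route("ask_user", True)
```

With this shape, a message such as "What's the weather like?" resolves to `ask_user` rather than being streamed to a remote API.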

Vague Triggers

Medium

Confidence: 79%
Finding: The phrase 'generate my audio files' is relatively broad and could overlap with ordinary language, increasing the chance of accidental skill activation. While not severe on its own, this becomes more concerning because activation leads to automatic backend connection and potential upload/processing flows with a cloud provider.

Missing User Warnings

Medium

Confidence: 91%
Finding: The skill instructs the agent to automatically connect to a cloud backend and obtain/use tokens, but the user-facing description does not clearly warn that uploaded audio and prompts will be sent to an external service. This lack of disclosure can lead users to share sensitive recordings without informed consent, which is especially risky for podcasts, interviews, or other potentially confidential audio.
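A minimal sketch of what an informed-consent gate could look like follows. It assumes the agent reads a user-supplied `NEMO_TOKEN` from the environment (the variable name comes from the overview above) and refuses to send data unless the user has both set that token and explicitly agreed; the function names and the disclosure wording are hypothetical.

```python
import os

# Hypothetical disclosure text shown to the user before anything is sent.
DISCLOSURE = (
    "Your prompts, audio files, and any URLs you provide will be sent to "
    "Nemovideo, an external cloud service, for processing. Continue?"
)


def resolve_token(env=os.environ):
    """Return the user-supplied NEMO_TOKEN, or None if it is unset or blank."""
    token = env.get("NEMO_TOKEN", "").strip()
    return token or None


def may_send(user_consented: bool, env=os.environ) -> bool:
    """Allow transmission only with explicit consent and a user-owned token.

    This sketch never creates a session or obtains a token on its own.
    """
    return user_consented and resolve_token(env) is not None
```

In this arrangement the agent would show `DISCLOSURE`, record the answer, and call `may_send` before any upload or prompt is transmitted.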

VirusTotal

65/65 vendors flagged this skill as clean.
