Generator From Audio

Security checks across malware telemetry and agentic risk

Overview

This audio-to-video skill is mostly coherent, but it can automatically create a third-party cloud session and route broad prompts or URL uploads to that service with limited user control.

Install only if you are comfortable with Nemovideo receiving your prompts, audio files, and any URLs you provide for processing. Avoid sensitive recordings, prefer setting your own NEMO_TOKEN, and confirm before uploads, URL imports, exports, or vague edit requests.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (4)

Context-Inappropriate Capability

Medium
Confidence: 84%
Finding
Allowing users to import media from arbitrary remote URLs expands the attack surface beyond uploaded local audio and is not clearly disclosed by the skill's stated purpose. This can enable the backend to fetch attacker-controlled URLs, creating SSRF-style risk, internal network probing, or ingestion of unexpected content through a feature users would not reasonably anticipate from an audio-to-video skill.
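One common mitigation for this class of risk is to validate a URL before the backend fetches it. The following is a minimal sketch of that idea, not the skill's or Nemovideo's actual behavior: the function names, the HTTPS-only policy, and the injectable `resolver` parameter are all assumptions made for illustration.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Assumption: only HTTPS media URLs are accepted.
ALLOWED_SCHEMES = {"https"}


def resolve_host(host):
    """Return every IP address the hostname resolves to (default resolver)."""
    infos = socket.getaddrinfo(host, None)
    return [info[4][0] for info in infos]


def is_safe_media_url(url, resolver=resolve_host):
    """Return True only if the scheme is allowed and all resolved addresses are public."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname:
        return False
    try:
        addresses = resolver(parts.hostname)
    except OSError:
        return False
    if not addresses:
        return False
    for addr in addresses:
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            return False
        # Rejects private, loopback, link-local, and other non-public ranges.
        if not ip.is_global:
            return False
    return True
```

A check like this narrows the SSRF surface but does not close it: if the fetch resolves the hostname again after validation, a DNS rebinding attack can still redirect it to an internal address, so the fetcher should connect to the address that was validated.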

Vague Triggers

Medium
Confidence: 88%
Finding
Routing 'Everything else' to the SSE action means nearly any unmatched user request can trigger backend processing, making accidental invocation and unintended data transmission more likely. In a skill that connects automatically to a cloud service and sends free-form prompts to a remote API, overly broad matching increases the chance that unrelated conversation or sensitive text is forwarded off-platform.
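A narrower routing policy could look roughly like the sketch below. It is a hypothetical illustration only: the intent names, keyword lists, and the `confirm` callback are assumptions, and the skill's real routing table is not shown in this report. The point is that unmatched text reaches the SSE action only after the user explicitly agrees.

```python
import re

# Hypothetical explicit intents; the real skill's routing table is not known.
EXPLICIT_INTENTS = {
    "upload": {"upload", "attach"},
    "export": {"export", "download"},
}


def route_request(message, confirm):
    """Route a message to an explicit action, or to SSE only after confirmation."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    for action, keywords in EXPLICIT_INTENTS.items():
        if words & keywords:
            return action
    # Catch-all: ask before forwarding free-form text to the remote API.
    if confirm(message):
        return "sse"
    return "ignored"
```

With this shape, `route_request("what is the weather", confirm=lambda m: False)` returns `"ignored"` instead of forwarding unrelated conversation off-platform.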

Vague Triggers

Medium
Confidence: 79%
Finding
The phrase 'generate my audio files' is relatively broad and could overlap with ordinary language, increasing the chance of accidental skill activation. While not severe on its own, this becomes more concerning because activation leads to automatic backend connection and potential upload/processing flows with a cloud provider.

Missing User Warnings

Medium
Confidence: 91%
Finding
The skill instructs the agent to automatically connect to a cloud backend and obtain/use tokens, but the user-facing description does not clearly warn that uploaded audio and prompts will be sent to an external service. This lack of disclosure can lead users to share sensitive recordings without informed consent, which is especially risky for podcasts, interviews, or other potentially confidential audio.
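The missing disclosure could be addressed with a consent gate that runs before any upload. The sketch below assumes a hypothetical `prepare_upload` helper and an `ask` callback that shows a prompt and returns the user's answer. It also assumes the stricter policy of requiring a user-supplied `NEMO_TOKEN` rather than obtaining a token automatically, which is not how the skill is described as behaving today.

```python
import os

# Assumed wording; the skill does not currently show a notice like this.
DISCLOSURE = (
    "Your audio and prompts will be sent to an external cloud service "
    "(Nemovideo) for processing."
)


def prepare_upload(path, ask, env=os.environ):
    """Decide whether an upload may proceed, based on token presence and consent."""
    # Require a token the user set themselves instead of creating a session silently.
    if not env.get("NEMO_TOKEN"):
        return {"proceed": False, "reason": "NEMO_TOKEN is not set"}
    # Show the disclosure and require an explicit yes before anything leaves the device.
    if not ask(f"{DISCLOSURE} Upload {path}?"):
        return {"proceed": False, "reason": "user declined"}
    return {"proceed": True, "reason": "user consented"}
```

The token itself is deliberately left out of the returned dictionary so that it is not echoed into logs or the conversation.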

VirusTotal

65/65 vendors flagged this skill as clean.
