Text To Video Generation

Security checks across malware telemetry and agentic risk

Overview

This skill is a cloud text-to-video helper that sends user-provided prompts and files to NemoVideo for rendering, with some broad routing behavior users should understand.

Install only if you are comfortable sending prompts, uploaded documents, media, and edit instructions to NemoVideo's remote API. Avoid sensitive or proprietary material unless you trust that service's data handling, and be aware that the skill can create sessions, use credits, upload files, poll render state, and export videos through the backend.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (5)

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The manifest markets a narrow text-to-video workflow, but the body documents broader upload, state inspection, render, and media-editing capabilities. This mismatch can cause users or orchestration layers to grant trust, permissions, or data access under a narrower expectation than the skill actually exercises, increasing the risk of unintended data handling and overbroad backend actions.
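One way to make this kind of mismatch visible is to compare the capabilities a manifest declares against those its body documents. The sketch below is illustrative only: the capability labels are taken from the wording of the finding, and the skill's real manifest format is not shown in this report.

```python
# Illustrative labels based on the finding's wording; not the skill's real manifest schema.
DECLARED = {"text_to_video"}
DOCUMENTED = {"text_to_video", "upload", "state_inspection", "render", "media_editing"}


def undeclared_capabilities(declared: set[str], documented: set[str]) -> list[str]:
    """Return capabilities the body documents but the manifest never declares."""
    return sorted(documented - declared)


print(undeclared_capabilities(DECLARED, DOCUMENTED))
```

Any non-empty result is a prompt to widen the manifest description or narrow the documented behavior.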

Context-Inappropriate Capability

Low

Confidence: 87%
Finding: The skill derives `X-Skill-Platform` from local install paths and transmits that metadata to a remote service, which is not necessary for core text-to-video generation. Even if limited, this leaks environment characteristics that can support fingerprinting, telemetry collection, or future targeting of host-specific behavior.
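To illustrate the pattern, here is a minimal sketch of how a platform label could be inferred from an install path and attached as `X-Skill-Platform`, with the header made opt-in. The directory names, the mapping, and both function names are hypothetical; the report does not show the skill's actual detection logic.

```python
from pathlib import PurePosixPath

# Hypothetical directory-to-platform mapping; the real skill's rules are not shown in the report.
KNOWN_PLATFORM_DIRS = {
    ".agent-a": "agent-a",
    ".agent-b": "agent-b",
}


def platform_from_install_path(install_path: str) -> str:
    """Guess a platform label from directory names found in the install path."""
    for part in PurePosixPath(install_path).parts:
        if part in KNOWN_PLATFORM_DIRS:
            return KNOWN_PLATFORM_DIRS[part]
    return "unknown"


def build_headers(install_path: str, send_platform: bool = False) -> dict[str, str]:
    """Build request headers; platform metadata is only included when explicitly enabled."""
    headers = {"Content-Type": "application/json"}
    if send_platform:
        headers["X-Skill-Platform"] = platform_from_install_path(install_path)
    return headers
```

Defaulting `send_platform` to `False` keeps environment details local unless the user chooses to share them.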

Vague Triggers

Medium

Confidence: 84%
Finding: The invocation phrases are very generic, such as 'generate my text prompt' and 'export 1080p MP4', which can overlap with ordinary conversation and unrelated tasks. Overbroad triggers increase the chance of accidental activation, causing unintended remote API calls, session creation, token issuance, or file handling without clear user intent.
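The overlap problem can be shown with a small sketch that contrasts plain substring triggers with a stricter check that also requires the service to be named. The two phrases come from the finding; the matching functions and the `skill_name` parameter are assumptions made for illustration, not the skill's actual trigger mechanism.

```python
# Phrases quoted in the finding, lowercased for comparison.
GENERIC_TRIGGERS = ["generate my text prompt", "export 1080p mp4"]


def loose_match(message: str) -> bool:
    """Fire whenever any generic phrase appears anywhere in the message."""
    text = message.lower()
    return any(trigger in text for trigger in GENERIC_TRIGGERS)


def strict_match(message: str, skill_name: str = "nemovideo") -> bool:
    """Fire only when a trigger phrase appears and the service is named explicitly."""
    return skill_name in message.lower() and loose_match(message)
```

With this sketch, a question such as "How do I export 1080p MP4 with ffmpeg?" activates the loose matcher but not the strict one.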

Vague Triggers

Medium

Confidence: 89%
Finding: The routing table ends with a catch-all rule sending 'Everything else' to SSE, effectively treating most unmatched input as permission to contact the backend. In context, that means arbitrary user text could be transmitted to a remote service and interpreted as editing commands, which expands the attack surface and weakens consent boundaries.
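The difference between a catch-all route and an explicit default can be sketched as follows. The keyword table and target names are hypothetical stand-ins for the skill's routing table, which the report describes but does not reproduce; only the 'Everything else' to SSE behavior comes from the finding.

```python
from typing import Optional

# Hypothetical keyword-to-target table standing in for the skill's routing rules.
ROUTES = {
    "upload": "upload",
    "export": "export",
    "status": "state",
}


def route_with_catch_all(message: str) -> str:
    """Mirror the reported behavior: unmatched input falls through to SSE."""
    text = message.lower()
    for keyword, target in ROUTES.items():
        if keyword in text:
            return target
    return "sse"


def route_explicit(message: str) -> Optional[str]:
    """Return a target only on a match; None means ask the user before contacting the backend."""
    text = message.lower()
    for keyword, target in ROUTES.items():
        if keyword in text:
            return target
    return None
```

Returning `None` for unmatched input gives the caller a point at which to request confirmation instead of transmitting arbitrary text.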

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill asks users to upload text/files and create prompts but does not clearly warn that these inputs are sent to `mega-api-prod.nemovideo.ai` for processing. This is a real privacy and consent issue because users may provide sensitive documents assuming local handling, while the backend receives and stores the content remotely.
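A disclosure step before any upload would address this gap. The following is a minimal sketch, assuming a simple yes/no confirmation flow; the function names and prompt wording are illustrative, and only the hostname comes from the finding.

```python
# Hostname as named in the finding.
BACKEND_HOST = "mega-api-prod.nemovideo.ai"


def disclosure(file_names: list[str]) -> str:
    """Build a notice that names the remote host and what will be sent to it."""
    listed = ", ".join(file_names) if file_names else "your prompt text"
    return (
        f"The following will be sent to {BACKEND_HOST} for remote processing: "
        f"{listed}. Continue? [y/N]"
    )


def is_confirmed(answer: str) -> bool:
    """Treat only an explicit 'y' or 'yes' as consent; anything else declines."""
    return answer.strip().lower() in {"y", "yes"}
```

Defaulting to "no" on empty or unclear answers keeps the upload from proceeding without clear user intent.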

VirusTotal

52/52 vendors flagged this skill as clean.
