Nano Banana Video Generator Free

Security checks across malware telemetry and agentic risk

Overview

This cloud video skill is mostly purpose-aligned, but it auto-creates a third-party session and broadly routes vague user input to the remote backend with limited consent.

Install only if you are comfortable with NemoVideo receiving prompts, uploaded files or URLs, editing instructions, session state, and render metadata. Avoid sensitive media, keep NEMO_TOKEN private, and use explicit video-generation requests so unrelated text is not routed to the remote service.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (6)

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The manifest markets a narrow text-to-video generator, but the body of the skill grants much broader capabilities including uploads, editing, timeline/state inspection, and export orchestration. This scope mismatch can mislead users and hosts about what data and actions the skill can access, undermining informed consent and making abuse or over-collection harder to detect.

Context-Inappropriate Capability

Low

Confidence: 83%
Finding: Credit-balance retrieval is not necessary for the core advertised function of generating videos from text prompts, so it expands access to account-related information without clear need. Even though the data is low-sensitivity, querying or exposing account state beyond what users expect is a privacy and least-privilege problem.

Context-Inappropriate Capability

Low

Confidence: 87%
Finding: General session-state and timeline inspection exceeds the simple text-to-video promise and may expose uploaded content metadata, render state, or prior editing context. This increases the amount of user/project information the skill can access without that broader access being clearly disclosed.

Vague Triggers

Medium

Confidence: 89%
Finding: The suggested activation language is broad enough to match ordinary conversation, which can trigger the skill unintentionally. In this skill, accidental activation is more concerning because it can automatically connect to a backend, create tokens/sessions, and initiate network activity without meaningful user intent.

Vague Triggers

Medium

Confidence: 93%
Finding: The catch-all routing rule sends nearly any non-matching input into SSE-driven generation/edit behavior, making it easy for unrelated or ambiguous user text to trigger remote processing. Because the skill also auto-connects and maintains session state, ambiguous routing raises the risk of unintended uploads, edits, or disclosure of user content to the backend.
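One way to narrow a catch-all rule of this kind is to forward a message to the backend only when it contains an explicit video request, and to keep everything else local. The sketch below is illustrative only: the pattern, the verb and noun lists, and the route function are hypothetical and are not taken from the skill itself.

```python
import re

# Hypothetical pattern: an action verb followed, within a few words,
# by a video-related noun. Tune the word lists for a real deployment.
EXPLICIT_VIDEO_REQUEST = re.compile(
    r"\b(generate|create|make|render|edit|export)\b"
    r"(?:\W+\w+){0,5}?\W+(video|clip|animation)s?\b",
    re.IGNORECASE,
)


def route(message: str) -> str:
    """Return 'remote' only for explicit video requests, otherwise 'local'."""
    if EXPLICIT_VIDEO_REQUEST.search(message):
        return "remote"
    # Default deny: ambiguous or unrelated text never reaches the backend.
    return "local"


if __name__ == "__main__":
    for text in ("Generate a 10 second video of a banana",
                 "Thanks, that looks great"):
        print(f"{text!r} -> {route(text)}")
```

The design choice here is default deny: unmatched input stays local, which inverts the catch-all behavior described in the finding.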

Missing User Warnings

Medium

Confidence: 97%
Finding: The skill instructs automatic backend connection plus anonymous token and session creation on first open, but only says to keep setup communication brief. That means network activity, token issuance, and session establishment may happen before the user receives a meaningful warning or consents, which is especially risky in a skill that can later upload files and inspect session state.
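A consent gate is one possible mitigation: no token or session is created until the user has seen a disclosure and agreed. The following is a minimal sketch under stated assumptions. ConsentGate, DISCLOSURE, and the injected create_session callable are hypothetical names, and the callable stands in for whatever network call the skill actually makes.

```python
from typing import Callable, Optional

# Hypothetical disclosure text, based on the data categories listed in the overview.
DISCLOSURE = (
    "This skill sends prompts, uploaded files or URLs, and editing "
    "instructions to a third-party backend. Continue? (yes/no)"
)


class ConsentGate:
    """Defers session creation until the user has explicitly agreed."""

    def __init__(self, create_session: Callable[[], str]) -> None:
        # create_session is an assumed stand-in for the real network call.
        self._create_session = create_session
        self.consented = False
        self.session_id: Optional[str] = None

    def record_answer(self, answer: str) -> None:
        # Only an explicit "yes" or "y" counts as consent.
        self.consented = answer.strip().lower() in {"yes", "y"}

    def ensure_session(self) -> str:
        if not self.consented:
            # No network activity happens before consent is recorded.
            raise PermissionError(DISCLOSURE)
        if self.session_id is None:
            self.session_id = self._create_session()
        return self.session_id
```

A host would show DISCLOSURE on first use, pass the reply to record_answer, and call ensure_session only when a request actually needs the backend.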

VirusTotal

66/66 vendors flagged this skill as clean.
