Nano Banana Video Generator Free

Security checks across malware telemetry and agentic risk

Overview

This cloud video skill is mostly purpose-aligned, but it auto-creates a third-party session and routes vague or ambiguous user input to the remote backend with only limited user consent.

Install only if you are comfortable with NemoVideo receiving prompts, uploaded files or URLs, editing instructions, session state, and render metadata. Avoid sensitive media, keep NEMO_TOKEN private, and use explicit video-generation requests so unrelated text is not routed to the remote service.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (6)

Description-Behavior Mismatch

Severity: Medium
Confidence: 95%
Finding
The manifest markets a narrow text-to-video generator, but the body of the skill grants much broader capabilities including uploads, editing, timeline/state inspection, and export orchestration. This scope mismatch can mislead users and hosts about what data and actions the skill can access, undermining informed consent and making abuse or over-collection harder to detect.
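One way a host or reviewer might surface this kind of mismatch is to compare what the manifest declares against what the skill body actually does. The following is a minimal sketch only: the capability names and the two sets are hypothetical illustrations, since the skill's real manifest format is not reproduced in this report.

```python
# Hypothetical capability names; the skill's real manifest is not shown in this report.
DECLARED = {"text_to_video"}
OBSERVED = {"text_to_video", "upload", "edit", "timeline_inspect", "export"}


def undeclared_capabilities(declared: set[str], observed: set[str]) -> list[str]:
    """Return capabilities the skill body uses but the manifest never declares."""
    return sorted(observed - declared)


if __name__ == "__main__":
    for name in undeclared_capabilities(DECLARED, OBSERVED):
        print(f"undeclared capability: {name}")
```

A non-empty result would be a prompt to either narrow the skill body or widen the manifest description so users can see the full scope before installing.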

Context-Inappropriate Capability

Severity: Low
Confidence: 83%
Finding
Credit-balance retrieval is not necessary for the core advertised function of generating banana videos from text prompts, so it expands access to account-related information without clear need. Even if low sensitivity, exposing or querying account state beyond user expectations is a privacy and least-privilege problem.

Context-Inappropriate Capability

Severity: Low
Confidence: 87%
Finding
General session-state and timeline inspection exceeds the simple text-to-video promise and may expose uploaded content metadata, render state, or prior editing context. This increases the amount of user/project information the skill can access without that broader access being clearly disclosed.

Vague Triggers

Severity: Medium
Confidence: 89%
Finding
The suggested activation language is broad enough to match ordinary conversation, which can trigger the skill unintentionally. In this skill, accidental activation is more concerning because it can automatically connect to a backend, create tokens/sessions, and initiate network activity without meaningful user intent.

Vague Triggers

Severity: Medium
Confidence: 93%
Finding
The catch-all routing rule sends nearly any non-matching input into SSE-driven generation/edit behavior, making it easy for unrelated or ambiguous user text to trigger remote processing. Because the skill also auto-connects and maintains session state, ambiguous routing raises the risk of unintended uploads, edits, or disclosure of user content to the backend.
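A narrower alternative to a catch-all rule is to send input to the backend only when it matches an explicit video request and to keep everything else local. The sketch below assumes a hypothetical `route` function and an illustrative regular expression; neither is taken from the skill itself.

```python
import re

# Hypothetical pattern: the skill's real trigger list is not reproduced in this report.
EXPLICIT_VIDEO_INTENT = re.compile(
    r"\b(generate|create|make|render|edit|export)\b.*\bvideo\b",
    re.IGNORECASE,
)


def route(user_text: str) -> str:
    """Return 'remote' only for explicit video requests, otherwise 'local'."""
    if EXPLICIT_VIDEO_INTENT.search(user_text):
        return "remote"
    # No catch-all branch: unmatched text never reaches the backend.
    return "local"
```

With this shape, a message such as "I watched a video yesterday" stays local because it contains no action verb from the pattern, while "Generate a video of a banana surfing" is routed remotely.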

Missing User Warnings

Severity: Medium
Confidence: 97%
Finding
The skill instructs automatic backend connection plus anonymous token and session creation on first open, but only says to keep setup communication brief. That means network activity, token issuance, and session establishment may happen before the user receives a meaningful warning or consents, which is especially risky in a skill that can later upload files and inspect session state.
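The concern here could be addressed by gating session creation behind an explicit acknowledgement. The following sketch assumes a hypothetical `SessionGate` wrapper and a caller-supplied `connect` callable; it does not reflect how the skill or the NemoVideo backend is actually implemented.

```python
class ConsentRequired(Exception):
    """Raised when a network action is attempted before the user agrees."""


class SessionGate:
    def __init__(self, connect):
        self._connect = connect  # hypothetical callable that opens the backend session
        self._consented = False
        self.session = None

    def notice(self) -> str:
        return (
            "This skill sends prompts, uploads, and session state to a "
            "third-party backend. Continue? (yes/no)"
        )

    def record_answer(self, answer: str) -> None:
        # Only an explicit yes counts as consent.
        self._consented = answer.strip().lower() in {"y", "yes"}

    def open_session(self):
        if not self._consented:
            raise ConsentRequired(self.notice())
        if self.session is None:
            self.session = self._connect()  # first network call happens here
        return self.session
```

In this design no token or session is created on first open; the backend is contacted only after `record_answer` has received an explicit yes.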

VirusTotal

66/66 vendors flagged this skill as clean.
