Ai Image To Video Benchmark

Security checks across malware telemetry and agentic risk

Overview

This is a disclosed cloud image-to-video skill, but its benchmarking claim and backend scope should be read carefully.

Install only if you are comfortable sending selected images, URLs, prompts, and generated media to NemoVideo's cloud backend. Avoid sensitive or confidential media, treat the benchmarking claim as single-provider output unless the skill separately provides comparison data, and confirm before using remote URL uploads or ambiguous editing requests.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (5)

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The skill presents itself as a narrow image-to-video benchmarking tool, but the documented backend exposes a much broader media editing and export surface including timeline manipulation, audio, text, and multiple file formats. This mismatch can mislead users and host systems about what capabilities are actually being delegated to the remote service, weakening informed consent and policy enforcement.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The skill is marketed around local image uploads, but the upload API also accepts arbitrary remote URLs as media sources. URL-based ingestion materially changes the trust boundary because it can cause the backend to fetch third-party resources, enabling unexpected data transfer, internal resource access attempts, or processing of content the user did not directly upload.
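
One way a host could narrow this trust boundary is to screen a remote URL before passing it to the upload API. The sketch below is illustrative only: the function name is_safe_remote_url is hypothetical and not part of the skill or its backend, and the policy it applies (HTTPS only, no localhost, no non-global IP literals) is an assumption. It does not resolve DNS, so a hostname that points at an internal address would still pass.

```python
import ipaddress
from urllib.parse import urlsplit


def is_safe_remote_url(url: str) -> bool:
    """Hypothetical pre-upload check; returns True only for plausible public HTTPS URLs."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) are rejected outright.
        return False

    # Assumed policy: only HTTPS sources are allowed.
    if parts.scheme != "https":
        return False

    host = parts.hostname  # lowercased by urlsplit, brackets stripped for IPv6
    if not host:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return False

    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. DNS resolution is NOT checked in this sketch.
        return True

    # Reject loopback, private, link-local, and other non-global addresses.
    return addr.is_global
```

A check like this complements, rather than replaces, asking the user to confirm each remote URL before it is submitted.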

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The skill advertises benchmarking against other AI tools, but the documented workflow only describes a single backend generation/edit pipeline and no actual comparative evaluation logic. This is primarily a deceptive capability claim that can cause users to share assets or prompts under false assumptions about the service being performed.

Vague Triggers

Medium

Confidence: 88%
Finding: Routing 'everything else' to the SSE generation path creates an overly broad catch-all trigger that can capture unrelated or ambiguous user requests. In a skill that forwards prompts to a cloud backend, this increases the chance of unintended remote actions, overcollection of user input, and bypass of more specific safety or confirmation gates.
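
A narrower alternative to a catch-all route is to forward only requests that match an explicit allowlist and to ask the user about everything else. The following is a minimal sketch under stated assumptions: route_request, the route names, and the keyword list are all hypothetical, and the skill's real routing logic is not shown in this report.

```python
from typing import Literal

Route = Literal["generate", "ask_user"]

# Hypothetical allowlist of phrases that clearly request image-to-video generation.
GENERATION_KEYWORDS = (
    "animate",
    "image to video",
    "turn this image",
    "make a video",
)


def route_request(message: str) -> Route:
    """Send only clearly matching requests to generation; otherwise ask the user."""
    text = message.lower()
    if any(keyword in text for keyword in GENERATION_KEYWORDS):
        return "generate"
    # No catch-all: ambiguous or unrelated input is never forwarded automatically.
    return "ask_user"
```

With this shape, an unrelated message such as a general question falls through to ask_user instead of being sent to the cloud backend.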

Missing User Warnings

Medium

Confidence: 97%
Finding: The setup and workflow instructions describe connecting to a cloud backend, creating sessions, and sending messages/files, but they do not provide a clear upfront warning that user prompts and uploaded media are transmitted to a third-party service. For a media-processing skill handling potentially sensitive images and text, missing disclosure undermines informed consent and can lead to privacy and data-handling violations.
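
An upfront warning can be enforced in code by refusing to transmit anything until the user has approved the disclosure. The sketch below assumes a hypothetical Session object and an injected transport callable standing in for whatever actually talks to the backend; none of these names come from the skill itself.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical disclosure text shown before any session is created.
DISCLOSURE = (
    "Your prompts and uploaded media will be sent to a third-party "
    "cloud backend for processing. Continue? [yes/no]"
)


@dataclass
class Session:
    consent_given: bool = False


def record_consent(session: Session, reply: str) -> bool:
    """Store the user's answer; only an explicit yes counts as consent."""
    session.consent_given = reply.strip().lower() in {"yes", "y"}
    return session.consent_given


def send_to_backend(session: Session, payload: Any, transport: Callable[[Any], Any]) -> Any:
    """Forward the payload only after consent; transport is an assumed stand-in."""
    if not session.consent_given:
        raise PermissionError("user has not approved cloud upload")
    return transport(payload)
```

Keeping the gate inside the send function means no code path can reach the backend without the disclosure having been accepted first.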

VirusTotal

63 of 63 vendors reported this skill as clean.
