Best Video Frames

Security checks across malware telemetry and agentic risk

Overview

This cloud video-processing skill warrants review: it is presented as a frame-extraction tool but gives the agent broad upload, editing, rendering, and export authority on a third-party backend.

Install only if you are comfortable sending videos, prompts, and project/session data to the NemoVideo cloud backend. Avoid confidential footage unless you have reviewed the service’s privacy, retention, billing, and data-use terms, and prefer explicit confirmation before any upload, edit, or export.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (6)

Description-Behavior Mismatch

High
Confidence
96% confidence
Finding
The skill is presented as a narrow frame-extraction utility, but the body documents a much broader remote video-editing and export pipeline with timeline manipulation, audio/text handling, SSE-driven edits, and full MP4 rendering. This scope mismatch can mislead users and host agents into granting files, network access, or consent for actions far beyond the advertised purpose, increasing the chance of unauthorized cloud processing and data disclosure.

Context-Inappropriate Capability

Medium
Confidence
88% confidence
Finding
The skill includes anonymous token issuance, session creation, credits/subscription logic, and generalized cloud-service access that exceed what a simple local frame-extraction workflow would require. While these may be operationally legitimate for the backend, bundling them without clear justification expands attack surface, enables unexpected third-party account/session creation, and obscures the true data flow to users.

Intent-Code Divergence

High
Confidence
95% confidence
Finding
The document repeatedly markets 'top frames' extraction while also describing compositing and exporting rendered MP4 videos, which is a materially different operation. This inconsistency creates deceptive behavior risk: users may believe they are only extracting still frames when in fact full media uploads, edits, rendering jobs, and downloadable outputs are being produced on a remote service.

Vague Triggers

Medium
Confidence
84% confidence
Finding
The trigger examples are broad and generic enough that the skill could activate on common video-related phrases outside the user's intended context. In combination with the skill's remote upload and session behavior, accidental invocation could send media or initiate cloud operations without sufficiently specific user intent.
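
As a rough illustration of why broad triggers matter, the Python sketch below compares a generic keyword trigger with a narrower one that requires frame-specific wording. Both patterns and the `activates` helper are hypothetical; the skill's actual trigger phrases are not reproduced in this report.

```python
import re

# Hypothetical trigger patterns for illustration; not taken from the skill.
BROAD_TRIGGER = re.compile(r"\bvideo\b", re.IGNORECASE)
NARROW_TRIGGER = re.compile(
    r"\b(extract|grab|pick)\b.*\b(frames?|stills?)\b", re.IGNORECASE
)

def activates(pattern: re.Pattern, message: str) -> bool:
    """Return True if the message would activate a skill using this trigger."""
    return pattern.search(message) is not None

# An unrelated question about video matches the broad trigger only.
unrelated = "Can you recommend a video player?"
# A request that names the advertised task matches the narrow trigger.
intended = "Extract the best frames from this clip"
```

With the broad pattern, the unrelated question would activate the skill and could start a remote session; the narrow pattern activates only when the message names the advertised task.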

Vague Triggers

Medium
Confidence
91% confidence
Finding
The catch-all routing rule sends 'everything else' to the SSE editing path, which is overly permissive and ambiguous for a skill advertised as simple frame extraction. Such broad routing increases the chance that unrelated or underspecified user text is interpreted as an instruction to perform remote editing actions, potentially causing unintended processing or disclosure of uploaded content.
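
The difference between a catch-all route and an explicit allow-list can be sketched in Python as follows. The intent phrases, handler labels, and function names are assumptions made for illustration and do not come from the skill itself.

```python
from typing import Optional

# Hypothetical intent table; the skill's real routing rules are not shown here.
KNOWN_INTENTS = {
    "extract frames": "frame_extraction",
    "export": "export",
    "upload": "upload",
}

def _match(message: str) -> Optional[str]:
    """Return the handler label for the first known phrase found, else None."""
    text = message.lower()
    for phrase, handler in KNOWN_INTENTS.items():
        if phrase in text:
            return handler
    return None

def route_catch_all(message: str) -> str:
    """Mimics the flagged pattern: unmatched text falls through to remote editing."""
    return _match(message) or "sse_edit"

def route_strict(message: str) -> Optional[str]:
    """Alternative: unmatched text is not routed, so the caller can ask the user."""
    return _match(message)
```

Under the catch-all version, a message such as "what's the weather" is treated as an edit instruction, whereas the strict version returns `None` and leaves the decision to the user.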

Missing User Warnings

Medium
Confidence
97% confidence
Finding
The skill description and setup do not prominently warn users that their video files and prompts are transmitted to a third-party cloud backend for processing. Because video content often contains sensitive personal, biometric, or proprietary information, inadequate disclosure undermines informed consent and creates meaningful privacy and compliance risk.
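
One possible mitigation is a disclosure-and-confirmation gate in front of every upload. The Python sketch below is a minimal version under stated assumptions: `gated_upload`, the `ask` and `upload` callables, and the disclosure wording are all hypothetical and are not part of the skill or its backend API.

```python
from typing import Callable

# Hypothetical disclosure text; a real one should reflect the service's actual terms.
DISCLOSURE = (
    "This will send the video and your prompt to a third-party cloud backend "
    "for processing. Continue? [y/N] "
)

def gated_upload(
    path: str,
    ask: Callable[[str], str],
    upload: Callable[[str], None],
) -> bool:
    """Call `upload` only after the user explicitly answers yes to the disclosure."""
    answer = ask(DISCLOSURE).strip().lower()
    if answer not in {"y", "yes"}:
        return False  # default is to refuse: nothing leaves the machine
    upload(path)
    return True
```

In an interactive session, `ask` could be the built-in `input`; passing it in as a parameter keeps the gate testable and makes refusal the default outcome.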

VirusTotal

66/66 vendors flagged this skill as clean.
