Caption Generator Davinci Resolve

Security checks across malware telemetry and agentic risk

Overview

This is an instruction-only skill for cloud video captioning and rendering, with some broad editing behavior users should understand before uploading media.

Install only if you are comfortable sending uploaded media, prompts, and related render/session metadata to nemovideo.ai. Avoid sensitive or regulated footage unless you have verified that provider's privacy and retention terms. For tighter control, provide your own NEMO_TOKEN and use explicit caption/export prompts rather than broad editing requests.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (6)

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The skill is presented as a caption generator, but the routing logic and documented actions expose a much broader remote media-editing surface, including overlays, audio edits, timeline manipulation, and export workflows. That mismatch can cause users to share data or invoke actions they did not reasonably expect, increasing the chance of unintended remote processing and overbroad tool use.

Description-Behavior Mismatch

Low

Confidence: 78%
Finding: Accepting images and audio files exceeds the stated workflow of uploading video clips for captioned-video output and expands the data types sent to the remote backend. While not inherently malicious, this scope creep increases privacy and consent risk because users may not understand that non-video media is also in scope for upload and processing.
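One way to narrow this scope is to reject non-video files before anything is uploaded. The following is a minimal sketch, not the skill's actual behavior: the function name `is_video` is hypothetical, and the check relies only on the file extension, so it does not inspect file contents.

```python
import mimetypes


def is_video(path: str) -> bool:
    """Return True only when the file extension maps to a video MIME type.

    Hypothetical helper for illustration; it guesses from the extension
    and does not open or validate the file itself.
    """
    mime, _ = mimetypes.guess_type(path)
    return mime is not None and mime.startswith("video/")
```

A caller could run this check on each selected file and skip or refuse images and audio, keeping uploads in line with the stated captioned-video workflow.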

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The file documents generalized remote editing and media-composition endpoints that go beyond caption generation, including state inspection, rendering, and composition behavior. In context, this broad capability set increases the blast radius of accidental or manipulated requests and makes the skill materially more powerful than its user-facing description suggests.

Vague Triggers

Medium

Confidence: 86%
Finding: Trigger phrases such as 'generate my video clips' and 'export 1080p MP4' are broad and overlap with ordinary user language, which raises the risk of accidental invocation. In a skill that uploads media and performs remote processing, ambiguous activation can lead to unintentional data transfer or actions the user did not mean to authorize.

Vague Triggers

Medium

Confidence: 93%
Finding: The routing table contains a catch-all rule that sends 'everything else' to the SSE editing pathway, effectively granting broad backend action on ambiguous prompts. Because this skill can create sessions, upload media, manipulate state, and initiate remote edits, the catch-all significantly increases the chance of unintended or overbroad remote operations.
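A safer alternative to a catch-all rule is an explicit allow-list that returns nothing for unmatched prompts. The sketch below illustrates the idea under stated assumptions: the route names and keywords are hypothetical, since the skill's real routing table is not reproduced in this report.

```python
from typing import Optional

# Hypothetical route names and keywords, chosen for illustration only.
ALLOWED_ROUTES = {
    "caption": ("caption", "subtitle"),
    "export": ("export", "render"),
}


def route(prompt: str) -> Optional[str]:
    """Return a route name only when the prompt matches an allow-listed keyword."""
    text = prompt.lower()
    for name, keywords in ALLOWED_ROUTES.items():
        if any(keyword in text for keyword in keywords):
            return name
    # No catch-all: unmatched prompts are not forwarded to remote editing.
    return None
```

With this shape, an ambiguous request yields `None`, and the caller can ask the user to clarify instead of starting a remote session.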

Missing User Warnings

Medium

Confidence: 88%
Finding: The skill does not prominently warn users that their uploaded media and prompts are transmitted to a third-party remote backend for processing. Given that videos may contain sensitive visual, audio, or embedded metadata, the lack of explicit disclosure undermines informed consent and elevates privacy risk.
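An explicit disclosure step before upload would address this gap. The sketch below is an assumption about how such a gate might look, not part of the skill: the function names `disclosure` and `has_consent` are hypothetical, and the host name is taken from the overview of this report.

```python
def disclosure(file_name: str, host: str = "nemovideo.ai") -> str:
    """Build the warning shown to the user before any upload starts."""
    return (
        f"'{file_name}' and your prompt will be sent to {host} "
        "for remote processing. Type 'yes' to continue."
    )


def has_consent(answer: str) -> bool:
    """Accept only an explicit 'yes'; anything else counts as refusal."""
    return answer.strip().lower() == "yes"
```

Requiring the full word keeps an accidental keypress or an empty reply from being read as agreement.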

VirusTotal

64/64 vendors flagged this skill as clean.
