Caption Burner

Security checks across malware telemetry and agentic risk

Overview

This skill appears to be a real cloud video-captioning workflow, but it routes broad editing requests and user media to a remote service with limited user-facing control or privacy detail.

Review before installing. Use it only for media you are comfortable sending to NemoVideo's remote service, and avoid confidential videos, faces, voices, or proprietary content unless the publisher documents token use, data retention, deletion, and account/credit behavior clearly.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (5)

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The skill is presented as a narrow caption-burning utility, but the documented routing and action model supports broader video-editing behaviors such as overlays, audio changes, aspect-ratio handling, state inspection, and iterative project manipulation. This scope mismatch is dangerous because users and reviewers may grant trust, permissions, or data access based on the narrower description while the skill actually enables a more capable remote editing workflow.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The documented API allows arbitrary URL-based uploads and supports many asset types beyond the advertised video-caption use case. This increases risk because a seemingly simple caption tool can cause remote retrieval of third-party resources and process unexpected content classes, expanding data exposure and operational scope beyond user expectations.
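One way a reviewer or integrator might narrow the URL-upload surface described in this finding is to check each URL before it is handed to the remote service. The sketch below is illustrative only: the function name, the allowed schemes, and the extension list are assumptions made for this example and are not part of the skill or of NemoVideo's API.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical policy values chosen for illustration; not taken from the skill.
ALLOWED_SCHEMES = {"https"}
ALLOWED_EXTENSIONS = {".mp4", ".mov", ".webm", ".srt", ".vtt"}


def is_acceptable_upload_url(url: str, allowed_hosts: set[str]) -> bool:
    """Return True only for https URLs on an approved host with a video or caption extension."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        return False
    host = (parsed.hostname or "").lower()
    if host not in allowed_hosts:
        return False
    # Compare the file extension of the URL path against the media allowlist.
    extension = posixpath.splitext(parsed.path)[1].lower()
    return extension in ALLOWED_EXTENSIONS
```

A check like this only restricts which URLs are forwarded. It does not address what the remote service does with the media once it has been retrieved.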

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: Persistent sessions, timeline state, iterative edits, and render-job export indicate a stateful cloud editing environment that is materially broader than a one-shot caption-burn workflow. This is risky because retained project state and cloud job lifecycle can expose more user data and create longer-lived remote processing than the skill's simple framing suggests.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill instructs the agent to connect to a remote backend and obtain authentication tokens before handling requests, but it does not clearly warn users that their prompts, identifiers, and potentially uploaded media will be transmitted to a third-party service. This lack of informed consent is dangerous for privacy because users may share sensitive media believing the tool operates locally or with minimal external disclosure.

Missing User Warnings

Medium

Confidence: 96%
Finding: The upload, state, and export workflow clearly involves remote storage and cloud rendering of user media, yet the skill lacks a prominent privacy and data-handling disclosure. In a media-processing skill, this context makes the issue more serious because uploaded videos may contain faces, voices, location data, or other sensitive personal information.

VirusTotal

63/63 vendors flagged this skill as clean.
