Free Auto Caption

Security checks across malware telemetry and agentic risk

Overview

This captioning skill is mostly transparent about what it does, but it can forward broad or vague editing requests, along with uploaded media, to a third-party cloud service, which goes beyond a narrow caption-only task.

Review before installing. Use this only with media you are comfortable uploading to nemovideo.ai, make requests explicitly caption-related, and avoid sending private videos, audio, images, or unrelated prompts until the publisher narrows the activation rules and clearly documents privacy and retention behavior.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (4)

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The skill is advertised as a narrow auto-captioning tool, but the documented API surface and routing expose a broader cloud video editing workflow with arbitrary edit, export, audio, overlay, and state operations. This mismatch can mislead users and host platforms about the true capability scope, increasing the chance of unintended activation, over-permissioned use, and covert repurposing beyond the declared function.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The skill accepts and edits image and audio assets even though it is presented as a captioning tool for videos. This undeclared, broad media handling expands the attack surface and permits functionality that users, reviewers, or policy gates may not expect, enabling the skill to operate as a more general media-processing pipeline than advertised.
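One way a publisher could narrow this is to gate uploads on media type before anything leaves the machine. The sketch below is illustrative only: the function name `is_caption_eligible` is hypothetical and not part of the skill, and it guesses the type from the file extension with Python's standard `mimetypes` module, which a real gate would likely back up with content inspection.

```python
import mimetypes


def is_caption_eligible(filename: str) -> bool:
    """Return True only for files whose guessed MIME type is video/*.

    Hypothetical helper: the type is inferred from the extension alone,
    so this is a first-pass filter, not a content check.
    """
    mime_type, _ = mimetypes.guess_type(filename)
    # Unknown types are rejected rather than passed through.
    return mime_type is not None and mime_type.startswith("video/")


if __name__ == "__main__":
    for name in ("clip.mp4", "song.mp3", "photo.png", "notes"):
        verdict = "upload" if is_caption_eligible(name) else "reject"
        print(f"{name}: {verdict}")
```

Rejecting unknown types by default keeps the accepted surface aligned with the advertised video-captioning purpose.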

Vague Triggers

Medium

Confidence: 88%
Finding: The example prompt 'Or just tell me what you're thinking' is so broad that ordinary conversation could be interpreted as activation input for the skill. Ambiguous invocation language increases the risk of accidental triggering and unintended transmission of user requests or files into the remote processing workflow.

Vague Triggers

Medium

Confidence: 94%
Finding: The routing rule that sends 'Everything else' to the SSE editing path creates ambiguous activation scope and effectively turns the skill into a catch-all for generic media-editing requests. In context, this is more dangerous because the backend supports remote session creation, uploads, state access, and export, so an imprecise match can trigger substantive cloud actions without a clear caption-specific intent.
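The trigger findings above both come down to a missing default-deny rule. As a minimal sketch of the alternative, assuming a simple keyword match stands in for intent detection, the router below sends a message to the caption path only when it names a caption-related task and declines everything else. The names `CAPTION_INTENT` and `route_request`, the keyword list, and the route labels are all hypothetical and do not reflect the skill's actual routing code.

```python
import re

# Hypothetical keyword list; a real skill would define its own vocabulary.
CAPTION_INTENT = re.compile(
    r"\b(captions?|subtitles?|transcribe|transcription)\b",
    re.IGNORECASE,
)


def route_request(message: str) -> str:
    """Route a user message with an explicit allow rule and a default deny.

    Returns "caption" when the message mentions a caption-related task,
    and "decline" for everything else, so no catch-all path exists.
    """
    if CAPTION_INTENT.search(message):
        return "caption"
    return "decline"


if __name__ == "__main__":
    for text in (
        "Add captions to this clip",
        "Or just tell me what you're thinking",
        "Make the video brighter",
    ):
        print(f"{text!r} -> {route_request(text)}")
```

With this shape, ordinary conversation and generic editing requests never reach the remote session, upload, or export steps unless the user states a caption-specific intent.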

VirusTotal

66/66 vendors flagged this skill as clean.
