Free Auto Caption

Security checks across malware telemetry and agentic risk

Overview

This captioning skill is mostly transparent about what it does, but it can send broad or vague editing requests, along with uploaded media, to a third-party cloud service. That reach goes beyond a narrow caption-only task.

Review before installing. Use this only with media you are comfortable uploading to nemovideo.ai, make requests explicitly caption-related, and avoid sending private videos, audio, images, or unrelated prompts until the publisher narrows the activation rules and clearly documents privacy and retention behavior.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Findings (4)

Description-Behavior Mismatch

Severity: Medium
Confidence: 95%
Finding
The skill is advertised as a narrow auto-captioning tool, but the documented API surface and routing expose a broader cloud video editing workflow with arbitrary edit, export, audio, overlay, and state operations. This mismatch can mislead users and host platforms about the true capability scope, increasing the chance of unintended activation, over-permissioned use, and covert repurposing beyond the declared function.

Context-Inappropriate Capability

Severity: Medium
Confidence: 92%
Finding
The skill accepts and edits image and audio assets even though it is presented as a captioning tool for videos. This undeclared media handling expands the attack surface and permits functionality that users, reviewers, or policy gates may not expect, so the skill can operate as a more general media-processing pipeline than advertised.
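
One way a publisher could narrow this finding is to reject non-video uploads before anything leaves the host. The following is a minimal sketch under that assumption, not the skill's actual code: the function name `is_video_upload` is hypothetical, and the check relies only on the filename's guessed MIME type from Python's standard `mimetypes` module.

```python
import mimetypes


def is_video_upload(filename: str) -> bool:
    """Return True only when the filename's guessed MIME type is video/*.

    Hypothetical helper: the real skill's upload handling is not documented here.
    """
    mime, _ = mimetypes.guess_type(filename)
    return mime is not None and mime.startswith("video/")


if __name__ == "__main__":
    for name in ("clip.mp4", "song.mp3", "photo.png", "notes"):
        print(name, is_video_upload(name))
```

A filename-based check is easy to spoof, so it would serve as a first gate only. A stricter implementation would also inspect the file contents.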

Vague Triggers

Severity: Medium
Confidence: 88%
Finding
The example prompt 'Or just tell me what you're thinking' is so broad that ordinary conversation could be interpreted as activation input for the skill. Ambiguous invocation language increases the risk of accidental triggering and unintended transmission of user requests or files into the remote processing workflow.

Vague Triggers

Severity: Medium
Confidence: 94%
Finding
The routing rule that sends 'Everything else' to the SSE editing path creates ambiguous activation scope and effectively turns the skill into a catch-all for generic media-editing requests. In context, this is more dangerous because the backend supports remote session creation, uploads, state access, and export, so an imprecise match can trigger substantive cloud actions without a clear caption-specific intent.
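
The catch-all behavior described in this finding can be contrasted with a default-deny router, where a request that does not clearly mention captioning activates nothing. The sketch below is illustrative only: `route_request`, `CAPTION_PATTERN`, and the keyword list are assumptions, not the skill's documented routing rules.

```python
import re
from typing import Optional

# Hypothetical keyword list; a real skill would define its own caption-specific triggers.
CAPTION_PATTERN = re.compile(
    r"\b(caption|captions|subtitle|subtitles|transcribe|transcript)\b",
    re.IGNORECASE,
)


def route_request(text: str) -> Optional[str]:
    """Return "caption" for caption-specific requests, otherwise None.

    Returning None means the skill does not activate, so there is no
    "Everything else" fallback into a remote editing path.
    """
    if CAPTION_PATTERN.search(text):
        return "caption"
    return None


if __name__ == "__main__":
    print(route_request("Add captions to my video"))
    print(route_request("Or just tell me what you're thinking"))
```

With this shape, an open-ended prompt such as "Or just tell me what you're thinking" matches no route, so no session, upload, or export is started on its behalf.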

VirusTotal

66/66 vendors flagged this skill as clean.
