Best Auto Caption

Security checks across malware telemetry and agentic risk

Overview

This skill is a real cloud captioning/editing workflow, but it is broader and more automatic than its caption-only presentation suggests.

Review this before installing if you only want a narrow captioning tool. It sends media and free-form edit instructions to Nemo Video's cloud service, can use or acquire a NEMO_TOKEN, and may create remote sessions automatically. Avoid private or sensitive recordings unless you trust that service and are comfortable with broader video-editing behavior beyond captions.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (5)

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The skill is presented as a captioning tool, but its routing and workflow description expose broader video-editing behaviors such as editing, adding BGM, and manipulating timeline state. This scope mismatch can mislead users and platforms about what the skill is authorized to do, increasing the risk of unintended media processing and overbroad backend access.

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The API documentation permits uploads of many non-video asset types and direct manipulation of full timeline drafts, which goes beyond the declared purpose of captioning uploaded video clips. Excess capability increases attack surface and enables the skill to act as a general media editor rather than a narrowly scoped captioning tool.
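One way to narrow the upload surface described in this finding is an explicit allowlist of video file types. The following is a minimal sketch only: the function name `is_allowed_upload` and the suffix set are hypothetical and are not taken from the skill or from Nemo Video's API.

```python
from pathlib import PurePath

# Assumption: a caption-only tool needs video containers and nothing else.
# This suffix set is illustrative, not the skill's actual list.
ALLOWED_VIDEO_SUFFIXES = {".mp4", ".mov", ".webm"}


def is_allowed_upload(filename: str) -> bool:
    """Return True only when the file extension is on the video allowlist."""
    return PurePath(filename).suffix.lower() in ALLOWED_VIDEO_SUFFIXES
```

An extension check is only a first filter; it does not verify the file's actual contents.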

Context-Inappropriate Capability

Low

Confidence: 84%
Finding: The skill instructs the agent to self-provision anonymous tokens and access credit/account-related endpoints even though its stated purpose is caption generation. That unnecessary authentication and account-management logic broadens privilege use and could enable abuse of trial credits or opaque backend access without clear user consent.

Vague Triggers

Medium

Confidence: 80%
Finding: The example invocation phrase is generic enough to overlap with ordinary conversation, which can cause accidental triggering of the skill. Unintended activation is especially risky here because the skill can connect to a backend, create sessions, and process uploaded media automatically.

Vague Triggers

Medium

Confidence: 91%
Finding: The catch-all routing rule sends nearly any non-matching request to the SSE editing path, making activation overly broad and unpredictable. In a skill with remote API access and media-processing actions, this can lead to unintended requests being transmitted to the backend or capabilities being used outside the user's actual intent.
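The difference between a catch-all route and an explicit one can be sketched as follows. This is an illustration under stated assumptions: the `ROUTES` table, its route names, and its keywords are hypothetical and do not reproduce the skill's actual routing rules.

```python
from typing import Optional

# Hypothetical route table: names and keywords are illustrative assumptions.
ROUTES = {
    "caption": ("caption", "subtitle"),
    "export": ("export", "download"),
}


def route(request: str) -> Optional[str]:
    """Return a route name only on an explicit keyword match, else None."""
    text = request.lower()
    for name, keywords in ROUTES.items():
        if any(word in text for word in keywords):
            return name
    # No catch-all branch: an unmatched request is not sent to any backend.
    return None
```

Returning `None` for unmatched input lets the caller ask the user what they meant instead of transmitting the request to a remote editing path.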

VirusTotal

66/66 vendors reported this skill as clean.
