Auto Caption Online

Security checks across malware telemetry and agentic risk

Overview

This skill appears to be a legitimate cloud video captioning/editor integration, but its broad catch-all routing could send unrelated user prompts or media to a third-party backend.

Review before installing. Use it only when you intentionally want media and instructions sent to nemovideo.ai for cloud processing. Avoid sensitive, private, proprietary, or regulated footage unless you trust that service's retention and privacy practices. Be aware the skill's documented routing is broad, so ordinary prompts may be handled by the remote backend if the skill is active.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (5)

Description-Behavior Mismatch

Medium
Confidence: 91%
Finding
The skill is presented as a narrow auto-captioning tool, but the documented behavior exposes a broader remote video-editing pipeline with generalized editing, rendering, and state-management actions. This scope expansion matters because users and host systems may grant trust, data access, or invocation rights based on the narrower manifest, while the actual capability supports more extensive processing than advertised.

Description-Behavior Mismatch

Medium
Confidence: 94%
Finding
The upload interface accepts additional media types and URL-based ingestion beyond the stated purpose of uploading video files for captioning. That mismatch increases the chance of unexpected data exfiltration to the third-party backend and undermines least-privilege expectations for a skill users believe only handles local video captioning.
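
A least-privilege upload check for a captioning-only skill could look like the sketch below. The extension allowlist and the function name `is_allowed_upload` are assumptions made for illustration; this report does not list the media types the skill actually accepts.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed allowlist for a captioning-only skill (hypothetical, not from the skill).
ALLOWED_VIDEO_EXTENSIONS = {".mp4", ".mov", ".webm", ".mkv"}


def is_allowed_upload(source: str) -> bool:
    """Accept only local video files; reject URL ingestion and other media types."""
    scheme = urlparse(source).scheme
    if scheme in ("http", "https", "ftp"):
        return False  # URL-based ingestion is outside the stated purpose
    return PurePosixPath(source).suffix.lower() in ALLOWED_VIDEO_EXTENSIONS
```

With this check, `clip.mp4` would pass, while a remote URL or a non-video file such as `notes.pdf` would be rejected before anything leaves the machine.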

Vague Triggers

Medium
Confidence: 88%
Finding
The suggested trigger phrase "Or just tell me what you're thinking" is so broad that it could cause the skill to engage on ordinary conversation unrelated to video captioning. Overbroad activation increases the risk of accidental prompt capture, unintended routing of user content to a remote service, and confusion about when the skill is in control.

Vague Triggers

Medium
Confidence: 96%
Finding
The routing rule sends "Everything else" to the SSE action, creating an effectively unbounded catch-all path. In a skill that forwards prompts to a remote backend, this can capture arbitrary user input and trigger backend processing outside the user's reasonable expectation, making unintended data disclosure and unauthorized action routing more likely.
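
The difference between a catch-all route and a bounded one can be sketched as follows. The `export` action, the keyword list, and both function names are hypothetical; only the "Everything else goes to SSE" behavior comes from the finding. This is a minimal illustration, not the skill's actual routing code.

```python
from typing import Optional

# Hypothetical intent keywords; the skill's real routing table is not shown in this report.
CAPTION_KEYWORDS = ("caption", "subtitle", "transcribe")


def route_catch_all(prompt: str) -> str:
    """Pattern flagged in the finding: anything unmatched goes to the remote backend."""
    if prompt.strip().lower() == "export":
        return "export"
    return "sse"  # "Everything else" is forwarded remotely


def route_allowlisted(prompt: str) -> Optional[str]:
    """Bounded variant: forward only prompts that match a known captioning intent."""
    text = prompt.lower()
    if text.strip() == "export":
        return "export"
    if any(keyword in text for keyword in CAPTION_KEYWORDS):
        return "sse"
    return None  # decline, so the host handles the prompt without the skill
```

Under the catch-all version, an unrelated prompt such as a weather question is still routed to `sse`; the allowlisted version returns `None` for it, so nothing is sent to the backend.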

Missing User Warnings

Medium
Confidence: 97%
Finding
The skill describes cloud GPU processing but does not provide a clear, up-front warning that uploaded files and prompts are sent to an external service. Because this skill handles user media and free-form instructions, missing disclosure meaningfully increases privacy and consent risk, especially for sensitive or proprietary videos.
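
An up-front disclosure could be as simple as a consent gate that runs before any upload. The sketch below is an assumption about how such a gate might be wired; `upload_notice`, `require_consent`, and the `ask` callback are hypothetical names, and only the nemovideo.ai host comes from this report.

```python
from typing import Callable


def upload_notice(filename: str, host: str = "nemovideo.ai") -> str:
    """Build the warning shown to the user before any data leaves the machine."""
    return (
        f"'{filename}' and your instructions will be sent to {host} "
        "for cloud processing. Continue? [y/N]"
    )


def require_consent(filename: str, ask: Callable[[str], str]) -> bool:
    """Return True only on an explicit yes; any other reply blocks the upload."""
    reply = ask(upload_notice(filename))
    return reply.strip().lower() in {"y", "yes"}
```

In an interactive session, `ask` could be the built-in `input`; defaulting to "no" on any unrecognized reply keeps the upload opt-in.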

VirusTotal

56/56 vendors flagged this skill as clean.
