Auto Caption Online

Security checks across malware telemetry and agentic risk

Overview

This skill appears to be a legitimate cloud video captioning/editor integration, but its broad catch-all routing could send unrelated user prompts or media to a third-party backend.

Review before installing. Use it only when you intentionally want media and instructions sent to nemovideo.ai for cloud processing. Avoid sensitive, private, proprietary, or regulated footage unless you trust that service's retention and privacy practices. Be aware the skill's documented routing is broad, so ordinary prompts may be handled by the remote backend if the skill is active.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (5)

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The skill is presented as a narrow auto-captioning tool, but the documented behavior exposes a broader remote video-editing pipeline with generalized editing, rendering, and state-management actions. This scope expansion matters because users and host systems may grant trust, data access, or invocation rights based on the narrower manifest, while the actual capability supports more extensive processing than advertised.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The upload interface accepts additional media types and URL-based ingestion beyond the stated purpose of uploading video files for captioning. That mismatch increases the chance of unexpected data exfiltration to the third-party backend and undermines least-privilege expectations for a skill users believe only handles local video captioning.
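
One way to narrow the upload surface described in this finding is an explicit allow-list check before anything is sent to the backend. The following is only a minimal sketch: the function name `is_allowed_upload` and the extension set are hypothetical and are not taken from the skill's code, which this report does not show.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical allow-list; the skill's real accepted formats are not listed in this report.
ALLOWED_VIDEO_EXTENSIONS = {".mp4", ".mov", ".webm"}


def is_allowed_upload(source: str) -> bool:
    """Accept only video file names; reject remote URLs and other media types."""
    # Refuse URL-based ingestion outright.
    if urlparse(source).scheme in ("http", "https", "ftp"):
        return False
    # Compare the file extension case-insensitively against the allow-list.
    return PurePosixPath(source).suffix.lower() in ALLOWED_VIDEO_EXTENSIONS
```

With a check like this, `is_allowed_upload("talk.mp4")` passes, while a URL or a non-video file such as `notes.pdf` is rejected before any data leaves the host.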

Vague Triggers

Medium

Confidence: 88%
Finding: The suggested trigger phrase "Or just tell me what you're thinking" is so broad that it could cause the skill to engage on ordinary conversation unrelated to video captioning. Overbroad activation increases the risk of accidental prompt capture, unintended routing of user content to a remote service, and confusion about when the skill is in control.

Vague Triggers

Medium

Confidence: 96%
Finding: The routing rule sends "Everything else" to the SSE action, creating an effectively unbounded catch-all path. In a skill that forwards prompts to a remote backend, this can capture arbitrary user input and trigger backend processing outside the user's reasonable expectation, making unintended data disclosure and unauthorized action routing more likely.
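
A bounded alternative to an "Everything else" rule is a default-deny router that forwards a prompt only when it matches an explicit in-scope pattern. The sketch below is illustrative: the keyword list and the function name `route` are assumptions, and the action name `"sse"` is borrowed from the finding above as a label, not from the skill's actual routing table.

```python
from typing import Optional

# Hypothetical in-scope keywords; the skill's real routing rules are not reproduced here.
CAPTION_KEYWORDS = ("caption", "subtitle", "transcribe")


def route(prompt: str) -> Optional[str]:
    """Return a backend action name, or None when the prompt is out of scope."""
    text = prompt.lower()
    if any(word in text for word in CAPTION_KEYWORDS):
        # Only in-scope requests are forwarded to the remote backend.
        return "sse"
    # Default-deny: unrelated prompts are not routed anywhere.
    return None
```

Substring matching on keywords is deliberately crude; the point is the fall-through branch, which returns `None` instead of forwarding unmatched input.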

Missing User Warnings

Medium

Confidence: 97%
Finding: The skill describes cloud GPU processing but does not provide a clear, up-front warning that uploaded files and prompts are sent to an external service. Because this skill handles user media and free-form instructions, missing disclosure meaningfully increases privacy and consent risk, especially for sensitive or proprietary videos.
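
An up-front disclosure of the kind this finding calls for could be as small as a confirmation step before the first upload. The following is a sketch under stated assumptions: the helper names `disclosure_text` and `has_consent` are hypothetical, and the host name is the one mentioned in the overview of this report.

```python
def disclosure_text(filename: str, host: str = "nemovideo.ai") -> str:
    """Build the warning shown before any file or prompt leaves the machine."""
    return (
        f"'{filename}' and your instructions will be uploaded to {host} "
        "for cloud processing. Continue? [y/N]"
    )


def has_consent(answer: str) -> bool:
    """Treat only an explicit 'y' or 'yes' as consent; anything else is a refusal."""
    return answer.strip().lower() in ("y", "yes")
```

Defaulting to refusal on empty or ambiguous input keeps an accidental keypress from being read as agreement to send media off-device.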

VirusTotal

56/56 vendors rated this skill as clean.
