video-remix

Security checks across malware telemetry and agentic risk

Overview

This video-remix skill broadly does what it claims, but it needs review because it can automatically use logged-in browser sessions, send video-derived content to external AI/TTS services, install system tools, and expose generated files on the local network.

Install only if you are comfortable with the agent using a browser profile that may already be signed in, sending URLs and generated text to external AI/TTS services, installing media tools, downloading third-party video content, and serving the output directory to devices on your local network. Use an isolated environment, avoid sensitive or private videos, confirm you have rights to reuse the source media, and stop or disable the HTTP server when finished.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Behavioral AST: exec() Call, eval() Call, Dynamic Import

Findings (22)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: output_dir.mkdir(parents=True, exist_ok=True) def run_server(): subprocess.run( [sys.executable, "-m", "http.server", str(port)], cwd=str(output_dir), capture_output=False
Confidence: 92%
Finding: subprocess.run( [sys.executable, "-m", "http.server", str(port)], cwd=str(output_dir), capture_output=False )

Tainted flow: 'cmd' from os.getenv (line 126, credential/environment) → subprocess.run (code execution)

Medium

Category: Data Flow
Content: str(merged_video) ] subprocess.run(cmd, capture_output=True, check=True) print(f"✅ Merge complete: {merged_video}") return merged_video
Confidence: 97%
Finding: subprocess.run(cmd, capture_output=True, check=True)

Tainted flow: 'cmd' from os.getenv (line 200, credential/environment) → subprocess.run (code execution)

Medium

Category: Data Flow
Content: str(full_audio) ] subprocess.run(cmd, capture_output=True, check=True) print(f"✅ Voice-over merge complete: {full_audio}") return full_audio
Confidence: 97%
Finding: subprocess.run(cmd, capture_output=True, check=True)

Tainted flow: 'cmd' from os.getenv (line 200, credential/environment) → subprocess.run (code execution)

Medium

Category: Data Flow
Content: str(output_path) ] subprocess.run(cmd, capture_output=True, check=True) # Get output file info result = subprocess.run(
Confidence: 97%
Finding: subprocess.run(cmd, capture_output=True, check=True)

Tainted flow: 'cmd' from os.getenv (line 200, credential/environment) → subprocess.run (code execution)

Medium

Category: Data Flow
Content: str(output_file) ] subprocess.run(cmd, capture_output=True, check=True) clip_files.append(output_file) # Merge clips
Confidence: 97%
Finding: subprocess.run(cmd, capture_output=True, check=True)
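The tainted-flow findings above all involve a command list that is partly derived from os.getenv before it reaches subprocess.run. A minimal sketch of one possible mitigation follows, assuming the environment value names an executable. The variable name VIDEO_REMIX_FFMPEG, the allowlist, and the function resolve_tool are hypothetical and do not appear in the scanned skill.

```python
import os
import shutil
from pathlib import Path

# Hypothetical allowlist of executables the pipeline is expected to run.
ALLOWED_TOOLS = frozenset({"ffmpeg", "ffprobe"})


def resolve_tool(env_var: str, default: str, allowed=ALLOWED_TOOLS) -> str:
    """Return a full path for an env-configured tool, or raise if it is not allowed."""
    candidate = os.getenv(env_var, default)
    # Compare the base name without extension so "ffmpeg.exe" also matches.
    if Path(candidate).stem not in allowed:
        raise ValueError(f"{env_var}={candidate!r} is not an allowed tool")
    resolved = shutil.which(candidate)
    if resolved is None:
        raise FileNotFoundError(f"{candidate!r} was not found on PATH")
    return resolved
```

The resolved path would then be used as the first element of the argument list passed to subprocess.run, keeping the list form (no shell=True) that the quoted code already uses.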

Lp3

Medium

Category: MCP Least Privilege
Confidence: 95%
Finding: The skill declares no permissions while instructing execution of shell commands, browser automation, file reads/writes, environment-variable use, and network access. This creates a trust gap: users or the platform cannot accurately evaluate what the skill will do before it modifies the host, contacts external services, and exposes outputs.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 90%
Finding: The documented behavior does not fully match the skill's actual capabilities and referenced processing paths, including use of additional AI providers and external services beyond the stated Gemini-first workflow. Behavior mismatches are dangerous because they undermine informed consent and can cause unexpected data disclosure to third parties or execution of workflows the user did not approve.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The skill instructs privileged host modification via package-manager commands, including sudo apt install. For a media-processing skill, changing the system package state is a significant escalation that can alter the host environment, fail unpredictably, or be abused if the installation instructions are extended to more dangerous commands.
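One lower-privilege alternative to running sudo apt install from the skill is to detect missing tools and ask the user to install them. The sketch below assumes the pipeline only needs the ffmpeg and yt-dlp executables named in the install script; the function name missing_tools and its injectable which parameter are illustrative.

```python
import shutil

# Executables named in the skill's install step; treated here as the required set.
REQUIRED_TOOLS = ("ffmpeg", "yt-dlp")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH, without installing anything."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        # Report and stop instead of modifying the host's package state.
        print("Please install before continuing:", ", ".join(missing))
```

This keeps package installation as an explicit user action, so the skill itself never needs root.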

Context-Inappropriate Capability

Medium

Confidence: 84%
Finding: The skill automates a browser session against a third-party Gemini web interface, submits user-provided URLs, and extracts content from the page. This introduces privacy and integrity risks because data is sent to an external service through a GUI session, and the automation may interact with whatever account is logged in or with changing page elements in unsafe ways.

Context-Inappropriate Capability

Low

Confidence: 84%
Finding: The downloader implicitly inherits HTTP_PROXY/HTTPS_PROXY from the environment, which can route requests through infrastructure the user did not explicitly choose. In an agent/skill context, this can leak requested URLs, metadata, and potentially proxy credentials or enterprise network context to external systems without clear disclosure.
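The implicit proxy inheritance described above can be avoided by making the proxy an explicit argument. The sketch below uses only the standard-library urllib and is an assumption about how a downloader could be structured, not the skill's actual download code; explicit_proxies and build_opener_with_explicit_proxy are hypothetical names.

```python
import urllib.request
from typing import Optional


def explicit_proxies(proxy_url: Optional[str]) -> dict:
    """Build a proxy map from a user-supplied URL; an empty map means direct connection."""
    if not proxy_url:
        return {}
    return {"http": proxy_url, "https": proxy_url}


def build_opener_with_explicit_proxy(proxy_url: Optional[str] = None):
    """Create an opener that ignores HTTP_PROXY/HTTPS_PROXY unless a proxy is passed in."""
    # Passing a dict (even an empty one) stops ProxyHandler from reading the environment.
    handler = urllib.request.ProxyHandler(explicit_proxies(proxy_url))
    return urllib.request.build_opener(handler)
```

With this shape, a proxy is used only when the caller names one, which makes the routing visible in the invocation.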

Vague Triggers

Medium

Confidence: 82%
Finding: The trigger phrases are broad and overlap with common video-editing or transcription requests, increasing the chance of unintended activation. In this skill, accidental invocation is more dangerous because activation can lead to browser automation, third-party data submission, downloads, shell execution, and starting a LAN HTTP server.

Missing User Warnings

Medium

Confidence: 94%
Finding: The skill automatically submits a YouTube URL to Gemini via browser automation without a clear privacy warning or consent checkpoint. Even if the URL seems innocuous, it may reveal user interests, private/unlisted links, workflow context, or account-associated metadata to a third-party service.

Missing User Warnings

Medium

Confidence: 97%
Finding: The skill advertises LAN HTTP sharing but does not prominently warn that generated files will be exposed over a network-accessible server. This is risky because processed media and related outputs may become reachable by other devices on the local network, enabling unintended disclosure of potentially sensitive content.

Missing User Warnings

Medium

Confidence: 88%
Finding: Accessing proxy environment variables without user disclosure is a transparency and data-flow issue, especially for a media-downloading skill that may process arbitrary links. Users may not realize that traffic is being routed through a proxy or that embedded credentials/network identifiers could influence outbound requests.

Missing User Warnings

Medium

Confidence: 94%
Finding: The code sends user-derived video/script content to external LLM providers via network APIs, but there is no explicit consent flow, disclosure, or transmission boundary check. In a video-processing skill, transcripts and segment summaries may contain private or copyrighted material, so silent export to third-party services creates a real confidentiality risk.
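A consent checkpoint of the kind this finding says is missing could look roughly like the following. This is a sketch under stated assumptions: confirm_external_send and its parameters are hypothetical, and the prompt wording is only an example.

```python
def confirm_external_send(destination: str, summary: str, ask=input) -> bool:
    """Ask the user before content leaves the machine; default to 'no'."""
    answer = ask(f"Send {summary} to {destination}? [y/N] ")
    # Only an explicit yes counts as consent; empty input or anything else declines.
    return answer.strip().lower() in {"y", "yes"}


def send_if_allowed(destination: str, summary: str, send, ask=input) -> bool:
    """Call send() only after consent; return whether the send happened."""
    if not confirm_external_send(destination, summary, ask):
        return False
    send()
    return True
```

The same gate could be reused for each outbound path named in this report (LLM provider, TTS service, browser submission), with the destination shown in the prompt.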

Missing User Warnings

Medium

Confidence: 79%
Finding: The module sends user-provided text to Edge-TTS, which implies network transmission to an external service or dependency behavior, but there is no explicit warning, consent flow, or privacy notice in this component. In a video-processing skill, users may submit sensitive script content, transcripts, or unpublished material, so silent transmission increases privacy and data-handling risk.

Missing User Warnings

Medium

Confidence: 96%
Finding: The script automatically advertises a LAN URL and keeps an HTTP file server running indefinitely, exposing the entire output directory. In this skill, outputs may contain copyrighted downloads, generated subtitles/scripts, and possibly other sensitive files written to that directory, so silent publication materially increases data exposure risk.
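To illustrate how the exposure described above could be narrowed, the sketch below serves a directory on the loopback interface only and shuts the server down after a timeout. It is not the skill's code: start_local_server, the 600-second default, and the choice of 127.0.0.1 are assumptions made for the example.

```python
import functools
import http.server
import threading
from pathlib import Path


def start_local_server(output_dir: Path, port: int = 0, timeout_s: float = 600.0):
    """Serve output_dir on 127.0.0.1 only and stop automatically after timeout_s."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(output_dir)
    )
    # Binding to 127.0.0.1 keeps the files unreachable from other LAN devices.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # The timer bounds how long the files stay published.
    timer = threading.Timer(timeout_s, server.shutdown)
    timer.daemon = True
    timer.start()
    return server, timer
```

If LAN sharing is genuinely wanted, the bind address could be made an explicit opt-in parameter so that the wider exposure is a visible choice rather than a default.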

Missing User Warnings

Medium

Confidence: 92%
Finding: The skill persists transcripts, analyses, and generated scripts to disk in a temp/output directory without any explicit user disclosure or consent flow. Those artifacts can contain sensitive spoken content, personal data, or copyrighted material, and they remain exposed to other local users/processes until cleanup occurs or indefinitely when errors happen or keep_intermediate is enabled.
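The cleanup gap noted above (artifacts left behind when errors occur) can be closed with a context manager. The following is a minimal sketch, assuming intermediate files live in a single working directory; private_workdir is a hypothetical helper, and keep_intermediate mirrors the option named in the finding.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def private_workdir(keep_intermediate: bool = False):
    """Yield a fresh working directory and remove it afterwards, even on errors."""
    # mkdtemp creates the directory readable and writable only by the current user.
    path = Path(tempfile.mkdtemp(prefix="video-remix-"))
    try:
        yield path
    finally:
        if not keep_intermediate:
            shutil.rmtree(path, ignore_errors=True)
```

Because the removal sits in a finally clause, transcripts and scripts written under the directory are deleted on both the success and the failure path unless the user opts to keep them.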

Missing User Warnings

Medium

Confidence: 94%
Finding: The workflow sends derived content to external providers for script generation and TTS via configurable OpenAI-compatible endpoints and Edge TTS, but does not warn the user that transcript/script data may leave the local machine. In a video-processing skill, this materially increases privacy and compliance risk because the source video may contain personal, confidential, or regulated information that users may assume is processed locally.

Ssd 3

Medium

Confidence: 96%
Finding: The implementation loads the full transcription text and passes it as LLM context without minimization, even though only segment-level script generation is needed. This unnecessarily broadens data exposure to the model provider or local inference endpoint and increases the chance of leaking sensitive speech, names, or other embedded content.
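The minimization this finding asks for can be sketched as selecting only the transcript segments that overlap the clip being scripted. The Segment type, its fields, and context_for_clip are hypothetical; the skill's real transcript format is not shown in this report.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    text: str


def context_for_clip(
    segments: Iterable[Segment],
    clip_start: float,
    clip_end: float,
    max_chars: int = 2000,
) -> str:
    """Return only the transcript text that overlaps the clip, capped at max_chars."""
    overlapping = [
        s.text for s in segments if s.end > clip_start and s.start < clip_end
    ]
    return " ".join(overlapping)[:max_chars]
```

Passing this narrower string to the model, rather than the full transcription, limits what the provider sees to the material needed for that segment.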

Sudo/Root Execution

Medium

Category: Privilege Escalation
Content: # 0.3 Platform dependencies (Ubuntu/Debian) if command -v apt >/dev/null 2>&1; then sudo apt update -y sudo apt install -y ffmpeg yt-dlp libass-dev fi
Confidence: 95%
Finding: sudo

Sudo/Root Execution

Medium

Category: Privilege Escalation
Content: # 0.3 Platform dependencies (Ubuntu/Debian) if command -v apt >/dev/null 2>&1; then sudo apt update -y sudo apt install -y ffmpeg yt-dlp libass-dev fi # 0.4 Python dependencies (idempotent install)
Confidence: 96%
Finding: sudo

VirusTotal

66/66 vendors flagged this skill as clean.
