video-remix

Security checks across malware telemetry and agentic risk

Overview

This video-remix skill broadly does what it claims, but it needs review because it can automatically use logged-in browser sessions, send video-derived content to external AI/TTS services, install system tools, and expose generated files on the local network.

Install only if you are comfortable with the agent using a browser profile that may already be signed in, sending URLs and generated text to external AI/TTS services, installing media tools, downloading third-party video content, and serving the output directory to devices on your local network. Use an isolated environment, avoid sensitive or private videos, confirm you have rights to reuse the source media, and stop or disable the HTTP server when finished.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
Findings (22)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
output_dir.mkdir(parents=True, exist_ok=True)
    
    def run_server():
        subprocess.run(
            [sys.executable, "-m", "http.server", str(port)],
            cwd=str(output_dir),
            capture_output=False
Confidence
92% confidence
Finding
subprocess.run([sys.executable, "-m", "http.server", str(port)], cwd=str(output_dir), capture_output=False)
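
One way to narrow the exposure flagged in this finding is to bind the server to the loopback interface and stop it after a fixed period. The sketch below is an assumption about how that could look, not the skill's code: `build_server_cmd` and `serve_for` are hypothetical names, and the only external behavior relied on is the `--bind` and `--directory` options of Python's `http.server` command line.

```python
import subprocess
import sys
from pathlib import Path


def build_server_cmd(port: int, output_dir: Path, bind: str = "127.0.0.1") -> list[str]:
    """Build an http.server command restricted to a single interface."""
    return [
        sys.executable, "-m", "http.server", str(port),
        "--bind", bind,                   # loopback by default, not every interface
        "--directory", str(output_dir),   # serve only this directory
    ]


def serve_for(port: int, output_dir: Path, seconds: float) -> None:
    """Serve output_dir, then shut the server down after `seconds`."""
    proc = subprocess.Popen(build_server_cmd(port, output_dir))
    try:
        proc.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        # Time limit reached: stop the server instead of leaving it running.
        proc.terminate()
        proc.wait()
```

With the default `bind`, other devices on the LAN cannot reach the server. Passing a LAN address restores sharing, but then as an explicit choice by the caller.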

Tainted flow: 'cmd' from os.getenv (line 126, credential/environment) → subprocess.run (code execution)

Medium
Category
Data Flow
Content
str(merged_video)
    ]
    
    subprocess.run(cmd, capture_output=True, check=True)
    print(f"✅ Merge complete: {merged_video}")
    return merged_video
Confidence
97% confidence
Finding
subprocess.run(cmd, capture_output=True, check=True)

Tainted flow: 'cmd' from os.getenv (line 200, credential/environment) → subprocess.run (code execution)

Medium
Category
Data Flow
Content
str(full_audio)
    ]
    
    subprocess.run(cmd, capture_output=True, check=True)
    print(f"✅ Voiceover merge complete: {full_audio}")
    return full_audio
Confidence
97% confidence
Finding
subprocess.run(cmd, capture_output=True, check=True)

Tainted flow: 'cmd' from os.getenv (line 200, credential/environment) → subprocess.run (code execution)

Medium
Category
Data Flow
Content
str(output_path)
    ]
    
    subprocess.run(cmd, capture_output=True, check=True)
    
    # Get output file info
    result = subprocess.run(
Confidence
97% confidence
Finding
subprocess.run(cmd, capture_output=True, check=True)

Tainted flow: 'cmd' from os.getenv (line 200, credential/environment) → subprocess.run (code execution)

Medium
Category
Data Flow
Content
str(output_file)
        ]
        
        subprocess.run(cmd, capture_output=True, check=True)
        clip_files.append(output_file)
    
    # Merge clips
Confidence
97% confidence
Finding
subprocess.run(cmd, capture_output=True, check=True)
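
The tainted-flow findings above all describe an environment value reaching the argument list of `subprocess.run`. A common mitigation is to accept an environment override only when it matches a fixed allowlist. The following is a minimal sketch under assumptions: the variable name `FFMPEG_BIN`, the helper `tool_from_env`, and the allowlist contents are hypothetical, since the report does not say which variable the skill reads.

```python
import os
from typing import Mapping, Optional

# Hypothetical allowlist; the real set depends on what the skill invokes.
ALLOWED_TOOLS = {"ffmpeg", "ffprobe"}


def tool_from_env(var_name: str, default: str,
                  env: Optional[Mapping[str, str]] = None) -> str:
    """Return an executable name from the environment only if allowlisted."""
    env = os.environ if env is None else env
    value = env.get(var_name, default)
    # Bare names only: paths and shell fragments are rejected outright.
    if value not in ALLOWED_TOOLS:
        raise ValueError(f"{var_name}={value!r} is not an allowed tool")
    return value
```

A caller would then build the command as `[tool_from_env("FFMPEG_BIN", "ffmpeg"), ...]`, so an unexpected environment value fails loudly rather than being executed.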

Lp3

Medium
Category
MCP Least Privilege
Confidence
95% confidence
Finding
The skill declares no permissions while instructing execution of shell commands, browser automation, file reads/writes, environment-variable use, and network access. This creates a trust gap: users or the platform cannot accurately evaluate what the skill will do before it modifies the host, contacts external services, and exposes outputs.

Tp4

High
Category
MCP Tool Poisoning
Confidence
90% confidence
Finding
The documented behavior does not fully match the skill's actual capabilities and referenced processing paths, including use of additional AI providers and external services beyond the stated Gemini-first workflow. Behavior mismatches are dangerous because they undermine informed consent and can cause unexpected data disclosure to third parties or execution of workflows the user did not approve.

Context-Inappropriate Capability

Medium
Confidence
93% confidence
Finding
The skill instructs privileged host modification via package-manager commands, including sudo apt install. For a media-processing skill, changing the system package state is a significant escalation that can alter the host environment, fail unpredictably, or be abused if the installation instructions are extended to more dangerous commands.

Context-Inappropriate Capability

Medium
Confidence
84% confidence
Finding
The skill automates a browser session against a third-party Gemini web interface, submits user-provided URLs, and extracts content from the page. This introduces privacy and integrity risks because data is sent to an external service through a GUI session, and the automation may interact with whatever account is logged in or with changing page elements in unsafe ways.

Context-Inappropriate Capability

Low
Confidence
84% confidence
Finding
The downloader implicitly inherits HTTP_PROXY/HTTPS_PROXY from the environment, which can route requests through infrastructure the user did not explicitly choose. In an agent/skill context, this can leak requested URLs, metadata, and potentially proxy credentials or enterprise network context to external systems without clear disclosure.
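
To make inherited proxy settings visible before any download starts, a skill could report which proxy variables are set, with embedded credentials stripped. This is an illustrative sketch, not the downloader's code: `describe_proxies` and `PROXY_VARS` are hypothetical names, and it assumes proxy values are full URLs with a scheme.

```python
from typing import Mapping
from urllib.parse import urlsplit

PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy")


def describe_proxies(env: Mapping[str, str]) -> dict[str, str]:
    """Map each set proxy variable to its URL with user:password removed."""
    found = {}
    for name in PROXY_VARS:
        value = env.get(name)
        if not value:
            continue
        parts = urlsplit(value)
        host = parts.hostname or "<unparsed>"
        port = f":{parts.port}" if parts.port else ""
        # Keep scheme, host, and port only; credentials are dropped.
        found[name] = f"{parts.scheme}://{host}{port}"
    return found
```

Calling `describe_proxies(os.environ)` and showing the result to the user addresses the disclosure gap without changing how requests are routed.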

Vague Triggers

Medium
Confidence
82% confidence
Finding
The trigger phrases are broad and overlap with common video-editing or transcription requests, increasing the chance of unintended activation. In this skill, accidental invocation is more dangerous because activation can lead to browser automation, third-party data submission, downloads, shell execution, and starting a LAN HTTP server.

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The skill automatically submits a YouTube URL to Gemini via browser automation without a clear privacy warning or consent checkpoint. Even if the URL seems innocuous, it may reveal user interests, private/unlisted links, workflow context, or account-associated metadata to a third-party service.

Missing User Warnings

Medium
Confidence
97% confidence
Finding
The skill advertises LAN HTTP sharing but does not prominently warn that generated files will be exposed over a network-accessible server. This is risky because processed media and related outputs may become reachable by other devices on the local network, enabling unintended disclosure of potentially sensitive content.

Missing User Warnings

Medium
Confidence
88% confidence
Finding
Accessing proxy environment variables without user disclosure is a transparency and data-flow issue, especially for a media-downloading skill that may process arbitrary links. Users may not realize that traffic is being routed through a proxy or that embedded credentials/network identifiers could influence outbound requests.

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The code sends user-derived video/script content to external LLM providers via network APIs, but there is no explicit consent flow, disclosure, or transmission boundary check. In a video-processing skill, transcripts and segment summaries may contain private or copyrighted material, so silent export to third-party services creates a real confidentiality risk.
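
A consent checkpoint of the kind this finding says is missing can be small. The sketch below assumes a hypothetical helper, `confirm_external_send`, called immediately before any request to an external provider. The destination label and the prompt wording are illustrative.

```python
from typing import Callable


def confirm_external_send(destination: str, payload: str,
                          ask: Callable[[str], str] = input,
                          preview_chars: int = 80) -> bool:
    """Ask the user before text leaves the machine; default answer is no."""
    preview = payload[:preview_chars]
    prompt = (
        f"About to send {len(payload)} characters to {destination}.\n"
        f"Preview: {preview!r}\n"
        "Proceed? [y/N] "
    )
    return ask(prompt).strip().lower() in {"y", "yes"}
```

The `ask` parameter exists so the gate can be exercised without a terminal. In interactive use it falls back to `input`.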

Missing User Warnings

Medium
Confidence
79% confidence
Finding
The module sends user-provided text to Edge-TTS, which implies network transmission to an external service, but this component has no explicit warning, consent flow, or privacy notice. In a video-processing skill, users may submit sensitive script content, transcripts, or unpublished material, so silent transmission increases privacy and data-handling risk.

Missing User Warnings

Medium
Confidence
96% confidence
Finding
The script automatically advertises a LAN URL and keeps an HTTP file server running indefinitely, exposing the entire output directory. In this skill, outputs may contain copyrighted downloads, generated subtitles/scripts, and possibly other sensitive files written to that directory, so silent publication materially increases data exposure risk.

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The skill persists transcripts, analyses, and generated scripts to disk in a temp/output directory without any explicit user disclosure or consent flow. Those artifacts can contain sensitive spoken content, personal data, or copyrighted material, and they remain exposed to other local users/processes until cleanup occurs or indefinitely when errors happen or keep_intermediate is enabled.

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The workflow sends derived content to external providers for script generation and TTS via configurable OpenAI-compatible endpoints and Edge TTS, but does not warn the user that transcript/script data may leave the local machine. In a video-processing skill, this materially increases privacy and compliance risk because the source video may contain personal, confidential, or regulated information that users may assume is processed locally.

Ssd 3

Medium
Confidence
96% confidence
Finding
The implementation loads the full transcription text and passes it as LLM context without minimization, even though only segment-level script generation is needed. This unnecessarily broadens data exposure to the model provider or local inference endpoint and increases the chance of leaking sensitive speech, names, or other embedded content.
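
Minimization here could mean passing the model only the transcript lines that overlap the segment being scripted. The sketch below assumes a hypothetical transcript shape (a list of entries with `start`, `end`, and `text`) and a hypothetical helper name. The skill's actual data structures are not shown in the report.

```python
from typing import TypedDict


class Line(TypedDict):
    start: float  # seconds
    end: float    # seconds
    text: str


def context_for_segment(transcript: list[Line],
                        seg_start: float, seg_end: float) -> str:
    """Return only the transcript text overlapping [seg_start, seg_end)."""
    overlapping = [
        line["text"]
        for line in transcript
        if line["start"] < seg_end and line["end"] > seg_start
    ]
    return " ".join(overlapping)
```

Each LLM call would then carry the text for one segment rather than the full transcription.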

Sudo/Root Execution

Medium
Category
Privilege Escalation
Content
# 0.3 Platform dependencies (Ubuntu/Debian)
if command -v apt >/dev/null 2>&1; then
  sudo apt update -y
  sudo apt install -y ffmpeg yt-dlp libass-dev
fi
Confidence
95% confidence
Finding
sudo

Sudo/Root Execution

Medium
Category
Privilege Escalation
Content
# 0.3 Platform dependencies (Ubuntu/Debian)
if command -v apt >/dev/null 2>&1; then
  sudo apt update -y
  sudo apt install -y ffmpeg yt-dlp libass-dev
fi

# 0.4 Python dependencies (idempotent install)
Confidence
96% confidence
Finding
sudo
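
An alternative to running `sudo apt install` from the skill is to check for the required executables and report what is missing, leaving installation to the user. This is a minimal sketch with a hypothetical helper name. It covers command-line tools such as `ffmpeg` and `yt-dlp` only, because a library package like `libass-dev` cannot be detected this way.

```python
import shutil
from typing import Callable, Iterable, Optional


def missing_tools(tools: Iterable[str],
                  which: Callable[[str], Optional[str]] = shutil.which) -> list[str]:
    """Return the tools that cannot be found on PATH, in input order."""
    return [tool for tool in tools if which(tool) is None]
```

A caller could print the result of `missing_tools(["ffmpeg", "yt-dlp"])` with install instructions and exit, so the skill never changes system package state itself.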

VirusTotal

66/66 vendors flagged this skill as clean.
