Auto Video Editing

Security checks across malware telemetry and agentic risk

Overview

This is a coherent local video-editing skill. Its use of FFmpeg and Whisper, the media files it generates, and its model and font downloads are all disclosed and consistent with its stated purpose.

Install only if you are comfortable running local FFmpeg/Python processing on media you choose. In restricted environments, preinstall and pin Whisper models, Python packages, and fonts from approved sources, disable or control mirrors, and avoid running it in directories where overwriting generated files like *_audio.wav or *_clips would be a problem.
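As one way to act on the "preinstall and pin" advice, the following minimal sketch checks a preinstalled model or font file against a SHA-256 digest recorded from an approved source. The function names and the idea of keeping a digest list are assumptions for illustration; the skill itself is not known to ship such a check.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large model files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned_asset(path: Path, expected_sha256: str) -> None:
    """Raise ValueError if the file does not match the pinned digest.

    `expected_sha256` is a hypothetical value you would record yourself
    when vetting the asset; it is not published by the skill.
    """
    actual = sha256_of(path)
    if actual != expected_sha256.strip().lower():
        raise ValueError(f"{path} does not match the pinned SHA-256 digest")
```

Running a check like this before the skill starts lets a restricted environment fail closed instead of falling back to a network download.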

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (10)

subprocess module call

Medium

Category: Dangerous Code Execution
Content:
    ]
    try:
        subprocess.run(cmd, check=True, capture_output=True, text=True)
        print(f"Done: {output_path}")
    except subprocess.CalledProcessError as e:
        print(f"FFmpeg error:\n{e.stderr}", file=sys.stderr)
Confidence: 85%
Finding: subprocess.run(cmd, check=True, capture_output=True, text=True)

subprocess module call

Medium

Category: Dangerous Code Execution
Content:
        "-vf", drawtext,
        cover_path,
    ]
    subprocess.run(cmd, check=True, capture_output=True, text=True)
    # Replace first frame in video
    print(f"Replacing first frame with cover...")
Confidence: 78%
Finding: subprocess.run(cmd, check=True, capture_output=True, text=True)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 90%
Finding: The skill clearly instructs use of shell commands, reads and writes files in user-provided directories, inspects environment/platform details, and may access the network for model/font downloads, yet the manifest does not declare corresponding permissions. This creates a transparency and consent problem: an agent may invoke a broadly capable skill without users or policy layers understanding its real reach.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 84%
Finding: The declared description focuses on local video editing, but the documented behavior also includes downloading fonts, using external mirrors, locale-based endpoint switching, subtitle burning, and chapter-bar generation. This mismatch can cause an agent or user to authorize the skill for a narrower purpose than what it actually does, especially where outbound network access is involved.

Context-Inappropriate Capability

Medium

Confidence: 80%
Finding: The documentation introduces external downloads for fonts and model mirrors, which are not strictly necessary for a local video-editing workflow and expand the attack surface to third-party content sources. Downloading executable-adjacent assets or model files from mirrors/CDNs increases supply-chain and privacy risk, particularly when triggered automatically based on locale.

Description-Behavior Mismatch

Medium

Confidence: 84%
Finding: The utility code performs outbound network access and automatically switches package/model endpoints, including downloading a font from third-party URLs and honoring HF_ENDPOINT from the environment. In a video-editing skill this expands trust boundaries beyond the declared core function and can expose users to supply-chain risk, tracking, or retrieval of untrusted assets without explicit consent.

Missing User Warnings

Low

Confidence: 82%
Finding: The README states that models/fonts may be downloaded automatically and that region-specific mirrors may be contacted, but it does not clearly warn users up front that running the skill can trigger outbound network access and third-party downloads. In an agent-driven workflow, this matters because users may assume local-only media processing while the skill silently reaches external services, creating privacy, policy, or supply-chain risk.

Missing User Warnings

Medium

Confidence: 78%
Finding: The skill states that it creates multiple derived files in the source video directory, but it does not prominently warn users up front that processing will generate and modify numerous artifacts there. In practice this can violate user expectations, clutter important directories, and increase risk when operating on sensitive or synchronized folders.

Missing User Warnings

Medium

Confidence: 84%
Finding: The script unconditionally passes '-y' to ffmpeg and deterministically writes to '<video_name>_audio.wav', so an existing file at that path will be silently overwritten. In an automated editing skill, this can cause data loss or accidental destruction of prior outputs, especially when rerunning jobs on the same input directory.
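
A minimal sketch of one possible mitigation, assuming the audio path is derived as '<video_name>_audio.wav' next to the input: refuse to overwrite unless the caller opts in, and pass FFmpeg's '-n' (never overwrite) flag instead of '-y' by default. The function name and the `overwrite` parameter are hypothetical, and the remaining FFmpeg arguments are illustrative rather than the skill's actual command.

```python
from pathlib import Path
from typing import List


def build_extract_audio_cmd(video_path: str, overwrite: bool = False) -> List[str]:
    """Build an FFmpeg command that only overwrites when explicitly allowed."""
    video = Path(video_path)
    # Same naming scheme the finding describes: <video_name>_audio.wav
    audio = video.with_name(f"{video.stem}_audio.wav")

    if audio.exists() and not overwrite:
        raise FileExistsError(
            f"{audio} already exists; pass overwrite=True to replace it"
        )

    # -y overwrites without asking; -n makes FFmpeg exit instead of overwriting.
    overwrite_flag = "-y" if overwrite else "-n"
    # -vn drops the video stream so only audio is written.
    return ["ffmpeg", overwrite_flag, "-i", str(video), "-vn", str(audio)]
```

The resulting list can be passed to subprocess.run as in the flagged code. Checking in Python and also passing '-n' covers the case where the file appears between the check and the FFmpeg call.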

Natural-Language Policy Violations

Medium

Confidence: 81%
Finding: The skill automatically changes network behavior based on heuristics about the user's locale instead of requiring explicit opt-in. In this file that means package/model traffic may be redirected to alternate mirrors or endpoints, which can materially change the software supply chain and user expectations without consent.
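
The explicit opt-in this finding asks for could look like the sketch below. It assumes the only input is the HF_ENDPOINT environment variable and an operator-approved mirror URL; `choose_endpoint` and `approved_mirror` are hypothetical names, and no locale heuristic is consulted at all.

```python
from typing import Mapping, Optional

# Default Hugging Face Hub endpoint, used when no mirror is approved.
DEFAULT_ENDPOINT = "https://huggingface.co"


def choose_endpoint(
    env: Mapping[str, str], approved_mirror: Optional[str] = None
) -> str:
    """Return the model endpoint without inferring anything from locale.

    A non-default HF_ENDPOINT is honored only when it matches a mirror
    the operator has explicitly approved.
    """
    requested = env.get("HF_ENDPOINT", "").rstrip("/")
    if not requested or requested == DEFAULT_ENDPOINT:
        return DEFAULT_ENDPOINT
    if approved_mirror is not None and requested == approved_mirror.rstrip("/"):
        return requested
    raise PermissionError(
        f"HF_ENDPOINT={requested!r} is not an approved mirror; refusing to switch"
    )
```

Calling choose_endpoint(os.environ) with no approved mirror keeps traffic on the default endpoint, or fails loudly if the environment tries to redirect it.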

VirusTotal

63/63 vendors flagged this skill as clean.
