Auto Video Editing

Security checks across malware telemetry and agentic risk

Overview

This is a coherent local video-editing skill. Its use of FFmpeg and Whisper, the media files it generates, and its model/font downloads are all disclosed and expected.

Install only if you are comfortable running local FFmpeg/Python processing on media you choose. In restricted environments, preinstall and pin Whisper models, Python packages, and fonts from approved sources, disable or control mirrors, and avoid running it in directories where overwriting generated files like *_audio.wav or *_clips would be a problem.
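One way to pin a preinstalled model or font file is to check it against a known digest before the skill uses it. The following is a minimal sketch using only the standard library. The helper names `sha256_of` and `verify_pinned` are hypothetical and are not part of the skill. The expected digest would have to come from your own approved source.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned(path, expected_sha256):
    """Return True only if the file exists and matches the pinned digest.

    Hypothetical helper: the pinned digest is supplied by the operator.
    """
    p = Path(path)
    return p.is_file() and sha256_of(p) == expected_sha256.lower()
```

A restricted environment could run such a check at startup and refuse to proceed when it fails, rather than falling back to a download.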

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Findings (10)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
]

    try:
        subprocess.run(cmd, check=True, capture_output=True, text=True)
        print(f"Done: {output_path}")
    except subprocess.CalledProcessError as e:
        print(f"FFmpeg error:\n{e.stderr}", file=sys.stderr)
Confidence
85% confidence
Finding
subprocess.run(cmd, check=True, capture_output=True, text=True)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
"-vf", drawtext,
            cover_path,
        ]
        subprocess.run(cmd, check=True, capture_output=True, text=True)

        # Replace first frame in video
        print(f"Replacing first frame with cover...")
Confidence
78% confidence
Finding
subprocess.run(cmd, check=True, capture_output=True, text=True)
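Both flagged calls pass `cmd` as an argument list, which means no shell parses the arguments. The sketch below illustrates that property in isolation. It does not use the skill's code: `run_argv` and `tricky_name` are hypothetical names, and the child process is the Python interpreter rather than FFmpeg.

```python
import subprocess
import sys


def run_argv(args):
    """Run a command given as an argument list (no shell) and return stdout."""
    result = subprocess.run(args, check=True, capture_output=True, text=True)
    return result.stdout


# A hostile-looking file name stays one literal argument because no shell
# ever interprets the ';' inside it.
tricky_name = "clip; echo injected.mp4"
echoed = run_argv(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", tricky_name]
)
```

The list form does not make a command safe by itself, since the program being run still receives whatever paths and filter strings it is given, but it removes shell injection as a concern.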

Lp3

Medium
Category
MCP Least Privilege
Confidence
90% confidence
Finding
The skill clearly instructs use of shell commands, reads and writes files in user-provided directories, inspects environment/platform details, and may access the network for model/font downloads, yet the manifest does not declare corresponding permissions. This creates a transparency and consent problem: an agent may invoke a broadly capable skill without users or policy layers understanding its real reach.

Tp4

High
Category
MCP Tool Poisoning
Confidence
84% confidence
Finding
The declared description focuses on local video editing, but the documented behavior also includes downloading fonts, using external mirrors, locale-based endpoint switching, subtitle burning, and chapter-bar generation. This mismatch can cause an agent or user to authorize the skill for a narrower purpose than what it actually does, especially where outbound network access is involved.

Context-Inappropriate Capability

Medium
Confidence
80% confidence
Finding
The documentation introduces external downloads for fonts and model mirrors, which are not strictly necessary for a local video-editing workflow and expand the attack surface to third-party content sources. Downloading executable-adjacent assets or model files from mirrors/CDNs increases supply-chain and privacy risk, particularly when triggered automatically based on locale.

Description-Behavior Mismatch

Medium
Confidence
84% confidence
Finding
The utility code performs outbound network access and automatically switches package/model endpoints, including downloading a font from third-party URLs and honoring HF_ENDPOINT from the environment. In a video-editing skill this expands trust boundaries beyond the declared core function and can expose users to supply-chain risk, tracking, or retrieval of untrusted assets without explicit consent.

Missing User Warnings

Low
Confidence
82% confidence
Finding
The README states that models/fonts may be downloaded automatically and that region-specific mirrors may be contacted, but it does not clearly warn users up front that running the skill can trigger outbound network access and third-party downloads. In an agent-driven workflow, this matters because users may assume local-only media processing while the skill silently reaches external services, creating privacy, policy, or supply-chain risk.

Missing User Warnings

Medium
Confidence
78% confidence
Finding
The skill states that it creates multiple derived files in the source video directory, but it does not prominently warn users up front that processing will generate and modify numerous artifacts there. In practice this can violate user expectations, clutter important directories, and increase risk when operating on sensitive or synchronized folders.

Missing User Warnings

Medium
Confidence
84% confidence
Finding
The script unconditionally passes '-y' to ffmpeg and deterministically writes to '<video_name>_audio.wav', so an existing file at that path will be silently overwritten. In an automated editing skill, this can cause data loss or accidental destruction of prior outputs, especially when rerunning jobs on the same input directory.
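A possible mitigation is to make overwriting an explicit choice. The sketch below assumes the '<video_name>_audio.wav' naming from the finding. The function name `build_extract_audio_cmd` and its `overwrite` parameter are hypothetical and do not come from the skill. FFmpeg's `-n` flag makes it exit rather than overwrite an existing output, while `-y` overwrites without asking.

```python
from pathlib import Path


def build_extract_audio_cmd(video_path, overwrite=False):
    """Return an ffmpeg argument list that extracts audio next to the video.

    Raises FileExistsError instead of clobbering an existing output unless
    the caller explicitly passes overwrite=True. (Hypothetical helper.)
    """
    video = Path(video_path)
    # Output naming follows the '<video_name>_audio.wav' pattern in the finding.
    output = video.with_name(f"{video.stem}_audio.wav")
    if output.exists() and not overwrite:
        raise FileExistsError(f"Refusing to overwrite existing file: {output}")
    # '-y' overwrites silently; '-n' tells ffmpeg never to overwrite.
    overwrite_flag = "-y" if overwrite else "-n"
    return ["ffmpeg", overwrite_flag, "-i", str(video), "-vn", str(output)]
```

The function only builds the command, so the check can be exercised without FFmpeg installed. Passing `-n` as well as checking in Python covers the case where the file appears between the check and the run.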

Natural-Language Policy Violations

Medium
Confidence
81% confidence
Finding
The skill automatically changes network behavior based on heuristics about the user's locale instead of requiring explicit opt-in. In this file that means package/model traffic may be redirected to alternate mirrors or endpoints, which can materially change the software supply chain and user expectations without consent.
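An explicit opt-in could look like the sketch below, which never inspects the locale. It assumes the default Hugging Face endpoint is `https://huggingface.co`. The opt-in variable `SKILL_ALLOW_MIRROR` and the function `resolve_endpoint` are hypothetical names for illustration, not part of the skill.

```python
import os

# Assumed default endpoint, used when no mirror has been approved.
DEFAULT_ENDPOINT = "https://huggingface.co"


def resolve_endpoint(env=None, opt_in_var="SKILL_ALLOW_MIRROR"):
    """Pick the model endpoint without guessing from the user's locale.

    A custom HF_ENDPOINT is honored only when the (hypothetical) opt-in
    variable is set to "1"; otherwise the default endpoint is returned.
    """
    env = os.environ if env is None else env
    requested = env.get("HF_ENDPOINT")
    if requested and env.get(opt_in_var) == "1":
        return requested
    return DEFAULT_ENDPOINT
```

With this shape, a mirror is used only when the operator has both named it and approved it, so the supply chain does not change silently.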

VirusTotal

63/63 vendors flagged this skill as clean.
