Security audit

Video Creator

Security checks across malware telemetry and agentic risk

Overview

This video-generation skill is mostly aligned with its stated purpose, but it is flagged for review because it can upload sensitive voice and photo media, modify the Python environment, persist voice identifiers, and recursively delete output directories without strong controls.

Install only if you are comfortable with this skill uploading scripts, generated prompts, portrait photos, audio, and voice samples to platform.delilegal.com/OSS for cloud processing. Run it in a controlled environment with dependencies installed ahead of time, avoid sensitive or non-consented voice/photo material, review voice_config.json for stored API keys or voice IDs, and set the output path to a dedicated disposable directory because existing output directories are deleted.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Output Handling: Unvalidated Output Injection, Cross-Context Output, Unbounded Output

Findings (25)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: import httpx # noqa: F811 return httpx except ImportError: subprocess.run( [sys.executable, "-m", "pip", "install", "httpx", "certifi", "-q"], check=True, )
Confidence: 95%
Finding: subprocess.run( [sys.executable, "-m", "pip", "install", "httpx", "certifi", "-q"], check=True, )

subprocess module call

Medium

Category: Dangerous Code Execution
Content: except ImportError: log("httpx 或 certifi 未安装，正在安装...", "WARN") import subprocess subprocess.run([sys.executable, "-m", "pip", "install", "httpx", "certifi", "-q"], check=True) import httpx import certifi
Confidence: 93%
Finding: subprocess.run([sys.executable, "-m", "pip", "install", "httpx", "certifi", "-q"], check=True)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: import certifi import hashlib except ImportError: subprocess.run([sys.executable, "-m", "pip", "install", "requests", "certifi", "-q"], check=True) import requests as _requests import certifi import hashlib
Confidence: 96%
Finding: subprocess.run([sys.executable, "-m", "pip", "install", "requests", "certifi", "-q"], check=True)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: import edge_tts except ImportError: log("edge-tts 未安装，正在安装...", "WARN") subprocess.run([sys.executable, "-m", "pip", "install", "edge-tts", "-q"], check=True) import edge_tts if gender == "male" and lang in EDGE_TTS_MALE_VOICES:
Confidence: 96%
Finding: subprocess.run([sys.executable, "-m", "pip", "install", "edge-tts", "-q"], check=True)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: from dashscope_api import generate_image as api_gen_image import requests as _reqs except ImportError: subprocess.run([sys.executable, "-m", "pip", "install", "requests", "-q"], check=True) import requests as _reqs sys.path.insert(0, str(Path(__file__).parent)) from dashscope_api import generate_image as api_gen_image
Confidence: 96%
Finding: subprocess.run([sys.executable, "-m", "pip", "install", "requests", "-q"], check=True)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: try: from PIL import Image, ImageDraw, ImageFont except ImportError: subprocess.run([sys.executable, "-m", "pip", "install", "Pillow", "-q"], check=True) from PIL import Image, ImageDraw, ImageFont images = []
Confidence: 95%
Finding: subprocess.run([sys.executable, "-m", "pip", "install", "Pillow", "-q"], check=True)
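
The six findings above share one pattern: an `ImportError` handler that runs `pip install` at runtime. A minimal sketch of one possible alternative follows, assuming the operator installs dependencies ahead of time. It checks for the modules and fails with a clear message instead of changing the environment. The helper names `missing_modules` and `require_modules` are hypothetical and do not appear in the audited skill.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


def require_modules(module_names):
    """Raise instead of installing anything when a dependency is absent."""
    missing = missing_modules(module_names)
    if missing:
        raise RuntimeError(
            "Missing dependencies: " + ", ".join(missing)
            + ". Install them ahead of time from a pinned requirements file."
        )
```

A caller could run `require_modules(["httpx", "certifi"])` once at startup, so a missing package surfaces as an explicit error rather than a silent install.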

Tainted flow: 'upload_url' from requests.post (line 249, network input) → requests.put (network output)

Medium

Category: Data Flow
Content: put_headers["Content-Type"] = "application/octet-stream" with open(local_path, "rb") as f: file_data = f.read() resp2 = _requests.put(upload_url, headers=put_headers, data=file_data, timeout=180) if resp2.status_code not in (200, 204): raise RuntimeError(f"[步骤二] 上传到 OSS 失败（{resp2.status_code}）：{resp2.text[:200]}")
Confidence: 93%
Finding: resp2 = _requests.put(upload_url, headers=put_headers, data=file_data, timeout=180)
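
The tainted flow above means the destination of the `PUT` is chosen by a server response. One hedged way to narrow that, sketched below, is to validate the returned URL before any file bytes are sent. The host suffix `.oss.example.com` is a placeholder assumption, not the real storage host, and `is_allowed_upload_url` is a hypothetical helper.

```python
from urllib.parse import urlparse

# Assumption: placeholder suffix; replace with the storage hosts you actually trust.
ALLOWED_UPLOAD_HOST_SUFFIXES = (".oss.example.com",)


def is_allowed_upload_url(url, allowed_suffixes=ALLOWED_UPLOAD_HOST_SUFFIXES):
    """Accept only HTTPS URLs whose host ends with an allow-listed suffix."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if parsed.scheme != "https" or not host:
        return False
    # Suffixes start with a dot so "eviloss.example.com" style hosts do not match.
    return any(host.endswith(suffix) for suffix in allowed_suffixes)
```

Checking `is_allowed_upload_url(upload_url)` before the upload call would let the script refuse to send media to an unexpected host.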

Lp3

Medium

Category: MCP Least Privilege
Confidence: 91%
Finding: The skill documents capabilities to read/write files, access environment variables, invoke shell commands, and make network requests, but it declares no permissions. This weakens sandboxing and user transparency because a host system may permit broader actions than users expect, especially for a skill that handles local media and credentials.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 94%
Finding: The skill claims a relatively bounded purpose, but the documentation expands behavior to remote image generation, uploading local files to external services/OSS, and runtime dependency installation. This mismatch is dangerous because users and orchestrators may invoke the skill under false assumptions, allowing unanticipated data egress and code execution paths.

Description-Behavior Mismatch

Medium

Confidence: 87%
Finding: The manifest suggests the skill operates on user-provided images, but the body adds automatic image generation from script text when images are missing. This matters because it silently broadens external processing and may send user content to image-generation services without a clear expectation or dedicated consent.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: Automatic package installation via pip is unrelated to normal video-generation logic and introduces a supply-chain and arbitrary code execution risk during dependency installation. Because it happens implicitly at runtime, operators and users may not realize the skill is fetching and executing external package installer logic, which is dangerous in shared or production agent environments.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: Installing Python packages during execution is an unsafe supply-chain pattern and is not necessary for normal operation of a voice-enrollment utility. It expands the attack surface by allowing execution of package install hooks and by pulling code from external repositories at runtime.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The skill description says it makes static slideshow videos from user-provided images, but the code silently generates missing images online when images are absent or counts do not match. That means user script content may be transmitted to external image-generation services beyond the declared behavior, creating a privacy and consent problem.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: In portrait mode, the script similarly auto-generates follow-on images online if user-supplied images are missing or incomplete. Because this exceeds the user-facing promise of operating on provided materials, it can leak script content and create unexpected third-party processing of user data.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The comments claim cloned voice IDs are only used in memory and not persisted, but synthesize_segment saves the voice_id into voice_config.json. Persisting identifiers for cloned voices increases privacy risk, creates undeclared retention of sensitive biometric-related metadata, and may allow reuse across future runs without fresh consent.
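
Because the finding above concerns identifiers written to voice_config.json, a reviewer may want a quick way to see what the file retains. The sketch below walks a parsed JSON structure and lists keys whose names look sensitive. The hint list and the function names are assumptions for illustration, since the report does not give the real layout of the file.

```python
import json
from pathlib import Path

# Assumption: substrings that suggest a retained identifier or credential.
SENSITIVE_KEY_HINTS = ("voice_id", "api_key", "token", "secret")


def find_sensitive_keys(data, hints=SENSITIVE_KEY_HINTS, prefix=""):
    """Recursively collect dotted paths of keys whose names contain a hint."""
    found = []
    if isinstance(data, dict):
        for key, value in data.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            if any(hint in str(key).lower() for hint in hints):
                found.append(path)
            found.extend(find_sensitive_keys(value, hints, path))
    elif isinstance(data, list):
        for index, item in enumerate(data):
            found.extend(find_sensitive_keys(item, hints, f"{prefix}[{index}]"))
    return found


def audit_config(path):
    """Load a JSON config file and report the sensitive-looking key paths."""
    return find_sensitive_keys(json.loads(Path(path).read_text(encoding="utf-8")))
```

Running `audit_config("voice_config.json")` after a session would show whether a voice ID or key was persisted, without printing the values themselves.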

Vague Triggers

Medium

Confidence: 83%
Finding: The trigger text is overly broad and says the skill must be used for common video-related requests. Overbroad mandatory routing increases the chance that user content, images, audio, and credentials are sent through this skill even when a lower-privilege or local-only option would suffice.

Missing User Warnings

High

Confidence: 97%
Finding: The skill describes uploading portrait photos and driving audio to external services and OSS to generate talking-head video, but it does not provide a prominent user-facing privacy warning or explicit consent step. Because these are highly sensitive biometric and voice-linked assets, undisclosed upload and processing materially increase privacy, compliance, and misuse risk.

Missing User Warnings

High

Confidence: 98%
Finding: The voice-cloning workflow instructs users to upload audio to publicly accessible or external services for enrollment, but it lacks a clear warning that biometric voice data is being transferred for remote processing. Voiceprints are sensitive personal data, and insufficient disclosure can lead to unauthorized cloning, privacy harm, and regulatory exposure.

Missing User Warnings

Medium

Confidence: 88%
Finding: The documentation instructs users to upload voice samples to a public URL and use a voice-cloning service, but it does not warn about privacy, consent, retention, or misuse risks for biometric voice data. Because voice samples are sensitive personal data and publicly accessible hosting expands exposure, this can lead to unauthorized access, replay, impersonation, or non-compliant data handling.

Missing User Warnings

Medium

Confidence: 97%
Finding: The skill installs packages without any user-facing notice or confirmation, causing hidden changes to the execution environment. This weakens operator control, impairs auditability, and can be abused indirectly if package sources or dependency resolution are compromised, especially in agent platforms where skills are expected to be declarative and minimally self-modifying.

Missing User Warnings

Medium

Confidence: 98%
Finding: The script unconditionally deletes the entire output directory with shutil.rmtree if it exists. In an agent context, a user-controlled or mistakenly broad output path could cause destructive loss of unrelated local files without confirmation, and the only safety check covers equality with the images directory, not other dangerous paths.
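
The deletion risk described above could be reduced by checking the output path before calling `shutil.rmtree`. The following is a sketch under stated assumptions: the skill is given a workspace root that outputs must live under, and a list of directories that must survive. The names `is_safe_to_delete` and `reset_output_dir` and both parameters are hypothetical.

```python
import shutil
from pathlib import Path


def is_safe_to_delete(output_dir, workspace_root, protected=()):
    """True only if output_dir is strictly inside workspace_root and spares protected dirs."""
    out = Path(output_dir).resolve()
    root = Path(workspace_root).resolve()
    # Refuse the root itself and anything outside it (including ".." escapes).
    if out == root or root not in out.parents:
        return False
    for item in protected:
        guarded = Path(item).resolve()
        # Refuse if the target is a protected directory or one of its ancestors.
        if out == guarded or out in guarded.parents:
            return False
    return True


def reset_output_dir(output_dir, workspace_root, protected=()):
    """Delete and recreate output_dir, but only after the safety check passes."""
    if not is_safe_to_delete(output_dir, workspace_root, protected):
        raise ValueError(f"Refusing to delete unsafe output path: {output_dir}")
    out = Path(output_dir)
    if out.exists():
        shutil.rmtree(out)
    out.mkdir(parents=True, exist_ok=True)
```

This covers more than equality with the images directory, but it is still no substitute for pointing the skill at a dedicated disposable directory.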

Missing User Warnings

Medium

Confidence: 97%
Finding: The helper uploads local audio files to remote OSS and is later used for voice cloning and portrait generation without a strong upfront consent gate in the code path. Because the skill processes highly sensitive inputs like voice samples and portrait photos, undisclosed remote upload materially increases privacy and exfiltration risk.
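
The missing consent gate noted above could take many forms. The sketch below shows one simple option: an explicit opt-in flag that is checked before any upload helper runs. The environment variable name `VIDEO_SKILL_ALLOW_UPLOAD` and the function names are hypothetical, and `upload_func` stands in for whatever upload helper the skill actually uses.

```python
import os

# Assumption: hypothetical opt-in flag; the audited skill defines no such variable.
CONSENT_ENV_VAR = "VIDEO_SKILL_ALLOW_UPLOAD"


def upload_allowed(environ=None):
    """Return True only when the operator has explicitly opted in to uploads."""
    env = os.environ if environ is None else environ
    return env.get(CONSENT_ENV_VAR, "").strip().lower() in {"1", "true", "yes"}


def guarded_upload(local_path, upload_func, environ=None):
    """Call upload_func(local_path) only after the consent check passes."""
    if not upload_allowed(environ):
        raise PermissionError(
            f"Upload of {local_path} blocked: set {CONSENT_ENV_VAR}=1 to allow "
            "sending media to the remote service."
        )
    return upload_func(local_path)
```

An interactive confirmation prompt would serve the same purpose where a human is present. The flag form is shown because agent runs are often non-interactive.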

External Transmission

Medium

Category: Data Exfiltration
Content: save_url = "https://platform.delilegal.com/api/v1/file/saveFile" # 步骤一：获取上传临时链接 resp1 = _requests.post( prepare_url, headers=platform_headers, json={"fileHash": file_hash, "fileName": file_name},
Confidence: 91%
Finding: requests.post( prepare_url, headers=platform_headers, json=

External Transmission

Medium

Category: Data Exfiltration
Content: raise RuntimeError(f"[步骤二] 上传到 OSS 失败（{resp2.status_code}）：{resp2.text[:200]}") # 步骤三：保存文件记录 resp3 = _requests.post( save_url, headers=platform_headers, json={"fileHash": file_hash, "originalName": file_name},
Confidence: 90%
Finding: requests.post( save_url, headers=platform_headers, json=

Unvalidated Output Injection

High

Category: Output Handling
Content: output_path, ] ) result = subprocess.run(cmd, capture_output=True) if result.returncode != 0: stderr = result.stderr.decode(errors='replace') print(f"❌ 编码失败：{stderr[-500:]}")
Confidence: 82%
Finding: subprocess.run(cmd, capture_output
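
One reading of the finding above is that the tail of the encoder's stderr is printed without validation, so it can carry terminal escape sequences or control characters into the agent's context. The sketch below cleans and bounds that text before it is shown. The helper name `safe_tail` and the 500-character default, which mirrors the slice in the quoted code, are assumptions.

```python
import re

# ANSI CSI sequences such as colour codes, e.g. ESC[31m.
_ANSI_CSI = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")
# Remaining control characters, keeping tab (\x09) and newline (\x0a).
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f-\x9f]")


def safe_tail(raw_stderr, limit=500):
    """Decode process output, strip escape and control characters, keep the last `limit` chars."""
    text = raw_stderr.decode("utf-8", errors="replace")
    text = _ANSI_CSI.sub("", text)
    text = _CONTROL_CHARS.sub("", text)
    return text[-limit:]
```

The error branch could then print `safe_tail(result.stderr)` instead of slicing the raw decoded text.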

VirusTotal

64/64 vendors flagged this skill as clean.
