
Security audit

Video Creator

Security checks across malware telemetry and agentic risk

Overview

This video-generation skill is mostly purpose-aligned, but it needs review because it can upload sensitive voice and photo media, modify the Python environment, persist voice identifiers, and recursively delete output directories without strong controls.

Install only if you are comfortable with this skill uploading scripts, generated prompts, portrait photos, audio, and voice samples to platform.delilegal.com/OSS for cloud processing. Run it in a controlled environment with dependencies installed ahead of time, and avoid sensitive or non-consented voice and photo material. Review voice_config.json for stored API keys or voice IDs, and set the output path to a dedicated disposable directory, because any existing output directory is deleted.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Output Handling: Unvalidated Output Injection, Cross-Context Output, Unbounded Output
Findings (25)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
import httpx  # noqa: F811
        return httpx
    except ImportError:
        subprocess.run(
            [sys.executable, "-m", "pip", "install", "httpx", "certifi", "-q"],
            check=True,
        )
Confidence
95% confidence
Finding
subprocess.run( [sys.executable, "-m", "pip", "install", "httpx", "certifi", "-q"], check=True, )

subprocess module call

Medium
Category
Dangerous Code Execution
Content
except ImportError:
        log("httpx or certifi is not installed; installing...", "WARN")
        import subprocess
        subprocess.run([sys.executable, "-m", "pip", "install", "httpx", "certifi", "-q"], check=True)
        import httpx
        import certifi
Confidence
93% confidence
Finding
subprocess.run([sys.executable, "-m", "pip", "install", "httpx", "certifi", "-q"], check=True)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
import certifi
        import hashlib
    except ImportError:
        subprocess.run([sys.executable, "-m", "pip", "install", "requests", "certifi", "-q"], check=True)
        import requests as _requests
        import certifi
        import hashlib
Confidence
96% confidence
Finding
subprocess.run([sys.executable, "-m", "pip", "install", "requests", "certifi", "-q"], check=True)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
import edge_tts
    except ImportError:
        log("edge-tts is not installed; installing...", "WARN")
        subprocess.run([sys.executable, "-m", "pip", "install", "edge-tts", "-q"], check=True)
        import edge_tts

    if gender == "male" and lang in EDGE_TTS_MALE_VOICES:
Confidence
96% confidence
Finding
subprocess.run([sys.executable, "-m", "pip", "install", "edge-tts", "-q"], check=True)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
from dashscope_api import generate_image as api_gen_image
        import requests as _reqs
    except ImportError:
        subprocess.run([sys.executable, "-m", "pip", "install", "requests", "-q"], check=True)
        import requests as _reqs
        sys.path.insert(0, str(Path(__file__).parent))
        from dashscope_api import generate_image as api_gen_image
Confidence
96% confidence
Finding
subprocess.run([sys.executable, "-m", "pip", "install", "requests", "-q"], check=True)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
try:
        from PIL import Image, ImageDraw, ImageFont
    except ImportError:
        subprocess.run([sys.executable, "-m", "pip", "install", "Pillow", "-q"], check=True)
        from PIL import Image, ImageDraw, ImageFont

    images = []
Confidence
95% confidence
Finding
subprocess.run([sys.executable, "-m", "pip", "install", "Pillow", "-q"], check=True)

Tainted flow: 'upload_url' from requests.post (line 249, network input) → requests.put (network output)

Medium
Category
Data Flow
Content
put_headers["Content-Type"] = "application/octet-stream"
        with open(local_path, "rb") as f:
            file_data = f.read()
        resp2 = _requests.put(upload_url, headers=put_headers, data=file_data, timeout=180)
        if resp2.status_code not in (200, 204):
            raise RuntimeError(f"[Step 2] Upload to OSS failed ({resp2.status_code}): {resp2.text[:200]}")
Confidence
93% confidence
Finding
resp2 = _requests.put(upload_url, headers=put_headers, data=file_data, timeout=180)
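
One way to narrow this flow is to check the server-supplied upload URL before any file bytes are sent. The following is a minimal sketch, not the skill's code: the function name and the allowlisted hostname are hypothetical, since the report does not state which OSS hosts are legitimate.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: the real OSS hostnames are not given in this report.
ALLOWED_UPLOAD_HOSTS = frozenset({"example-bucket.oss-cn-hangzhou.aliyuncs.com"})


def is_trusted_upload_url(url: str, allowed_hosts=ALLOWED_UPLOAD_HOSTS) -> bool:
    """Return True only for HTTPS URLs whose host is explicitly allowlisted."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    return parsed.scheme == "https" and host in allowed_hosts
```

A caller would run this check on upload_url and raise before calling requests.put if it returns False, so a tampered or unexpected response cannot redirect the file to an arbitrary host.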

Lp3

Medium
Category
MCP Least Privilege
Confidence
91% confidence
Finding
The skill documents capabilities to read/write files, access environment variables, invoke shell commands, and make network requests, but it declares no permissions. This weakens sandboxing and user transparency because a host system may permit broader actions than users expect, especially for a skill that handles local media and credentials.

Tp4

High
Category
MCP Tool Poisoning
Confidence
94% confidence
Finding
The skill claims a relatively bounded purpose, but the documentation expands behavior to remote image generation, uploading local files to external services/OSS, and runtime dependency installation. This mismatch is dangerous because users and orchestrators may invoke the skill under false assumptions, allowing unanticipated data egress and code execution paths.

Description-Behavior Mismatch

Medium
Confidence
87% confidence
Finding
The manifest suggests the skill operates on user-provided images, but the body adds automatic image generation from script text when images are missing. This matters because it silently broadens external processing and may send user content to image-generation services without a clear expectation or dedicated consent.

Context-Inappropriate Capability

Medium
Confidence
96% confidence
Finding
Automatic package installation via pip is unrelated to normal video-generation logic and introduces a supply-chain and arbitrary code execution risk during dependency installation. Because it happens implicitly at runtime, operators and users may not realize the skill is fetching and executing external package installer logic, which is dangerous in shared or production agent environments.

Context-Inappropriate Capability

Medium
Confidence
95% confidence
Finding
Installing Python packages during execution is an unsafe supply-chain pattern and is not necessary for normal operation of a voice-enrollment utility. It expands the attack surface by allowing execution of package install hooks and by pulling code from external repositories at runtime.
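
A common alternative to installing at runtime is to fail fast with an actionable message and leave installation to the operator. The sketch below assumes a hypothetical helper named require_module; it is not taken from the skill.

```python
import importlib
from typing import Optional


def require_module(name: str, pip_name: Optional[str] = None):
    """Import a module, or raise a clear error instead of running pip."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        package = pip_name or name
        raise RuntimeError(
            f"Missing dependency '{package}'. Install it ahead of time, "
            "for example from a pinned requirements file."
        ) from exc
```

With this pattern, a call such as require_module("PIL", "Pillow") either returns the imported module or stops the run without modifying the environment.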

Description-Behavior Mismatch

Medium
Confidence
94% confidence
Finding
The skill description says it makes static slideshow videos from user-provided images, but the code silently generates missing images online when images are absent or counts do not match. That means user script content may be transmitted to external image-generation services beyond the declared behavior, creating a privacy and consent problem.

Description-Behavior Mismatch

Medium
Confidence
94% confidence
Finding
In portrait mode, the script similarly auto-generates follow-on images online if user-supplied images are missing or incomplete. Because this exceeds the user-facing promise of operating on provided materials, it can leak script content and create unexpected third-party processing of user data.

Intent-Code Divergence

Medium
Confidence
97% confidence
Finding
The comments claim cloned voice IDs are only used in memory and not persisted, but synthesize_segment saves the voice_id into voice_config.json. Persisting identifiers for cloned voices increases privacy risk, creates undeclared retention of sensitive biometric-related metadata, and may allow reuse across future runs without fresh consent.
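
If the intent stated in the comments is to be honored, sensitive keys can be filtered out before the configuration is written to disk. This is a minimal sketch: the key name voice_id comes from the finding above, while api_key and the helper name are assumptions for illustration, and only top-level keys are handled.

```python
# Keys that should stay in memory only. "voice_id" is named in the finding;
# "api_key" is an assumed key name.
SENSITIVE_KEYS = frozenset({"voice_id", "api_key"})


def strip_sensitive(config: dict) -> dict:
    """Return a shallow copy of config without sensitive top-level keys."""
    return {key: value for key, value in config.items() if key not in SENSITIVE_KEYS}
```

The result of strip_sensitive(config) is what would be serialized to voice_config.json, while the full dictionary stays in memory for the current run.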

Vague Triggers

Medium
Confidence
83% confidence
Finding
The trigger text is overly broad and says the skill must be used for common video-related requests. Overbroad mandatory routing increases the chance that user content, images, audio, and credentials are sent through this skill even when a lower-privilege or local-only option would suffice.

Missing User Warnings

High
Confidence
97% confidence
Finding
The skill describes uploading portrait photos and driving audio to external services and OSS to generate talking-head video, but it does not provide a prominent user-facing privacy warning or explicit consent step. Because these are highly sensitive biometric and voice-linked assets, undisclosed upload and processing materially increase privacy, compliance, and misuse risk.

Missing User Warnings

High
Confidence
98% confidence
Finding
The voice-cloning workflow instructs users to upload audio to publicly accessible or external services for enrollment, but it lacks a clear warning that biometric voice data is being transferred for remote processing. Voiceprints are sensitive personal data, and insufficient disclosure can lead to unauthorized cloning, privacy harm, and regulatory exposure.

Missing User Warnings

Medium
Confidence
88% confidence
Finding
The documentation instructs users to upload voice samples to a public URL and use a voice-cloning service, but it does not warn about privacy, consent, retention, or misuse risks for biometric voice data. Because voice samples are sensitive personal data and publicly accessible hosting expands exposure, this can lead to unauthorized access, replay, impersonation, or non-compliant data handling.

Missing User Warnings

Medium
Confidence
97% confidence
Finding
The skill installs packages without any user-facing notice or confirmation, causing hidden changes to the execution environment. This weakens operator control, impairs auditability, and can be abused indirectly if package sources or dependency resolution are compromised, especially in agent platforms where skills are expected to be declarative and minimally self-modifying.

Missing User Warnings

Medium
Confidence
98% confidence
Finding
The script unconditionally deletes the entire output directory with shutil.rmtree if it exists. In an agent context, a user-controlled or mistakenly broad output path could cause destructive loss of unrelated local files without confirmation, and the only safety check covers equality with the images directory, not other dangerous paths.
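
A stricter guard could refuse to delete any directory the skill did not create itself. The sketch below uses a marker file for that purpose; the marker name and the function name are hypothetical, and the skill's actual check is not shown in this report.

```python
import shutil
from pathlib import Path

# Hypothetical marker file written into directories this tool created.
MARKER_NAME = ".video_creator_output"


def prepare_output_dir(output_dir: Path) -> Path:
    """Recreate output_dir, deleting it only if it is empty or carries the marker."""
    output_dir = output_dir.resolve()
    if output_dir.exists():
        if not output_dir.is_dir():
            raise ValueError(f"Output path is not a directory: {output_dir}")
        has_marker = (output_dir / MARKER_NAME).is_file()
        is_empty = not any(output_dir.iterdir())
        if not (has_marker or is_empty):
            raise ValueError(f"Refusing to delete unmarked directory: {output_dir}")
        shutil.rmtree(output_dir)
    output_dir.mkdir(parents=True)
    (output_dir / MARKER_NAME).touch()
    return output_dir
```

Under this design, pointing the output path at a home directory or project root raises an error instead of removing its contents.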

Missing User Warnings

Medium
Confidence
97% confidence
Finding
The helper uploads local audio files to remote OSS and is later used for voice cloning and portrait generation without a strong upfront consent gate in the code path. Because the skill processes highly sensitive inputs like voice samples and portrait photos, undisclosed remote upload materially increases privacy and exfiltration risk.
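
A consent gate in the code path could look like the following minimal sketch. The function name, the prompt wording, and the upload callable are assumptions for illustration; the skill's real upload helper is not named in this report.

```python
from pathlib import Path
from typing import Callable


def upload_with_consent(
    local_path: Path,
    destination: str,
    upload: Callable[[Path], str],
    ask: Callable[[str], str] = input,
) -> str:
    """Run upload(local_path) only after the user explicitly types 'yes'."""
    answer = ask(f"Upload {local_path} to {destination}? Type 'yes' to continue: ")
    if answer.strip().lower() != "yes":
        raise PermissionError(f"Upload of {local_path} was not approved")
    return upload(local_path)
```

Passing the prompt function as a parameter keeps the gate testable and lets an agent host substitute its own confirmation dialog.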

External Transmission

Medium
Category
Data Exfiltration
Content
save_url = "https://platform.delilegal.com/api/v1/file/saveFile"

    # Step 1: obtain a temporary upload link
    resp1 = _requests.post(
        prepare_url,
        headers=platform_headers,
        json={"fileHash": file_hash, "fileName": file_name},
Confidence
91% confidence
Finding
requests.post( prepare_url, headers=platform_headers, json=

External Transmission

Medium
Category
Data Exfiltration
Content
raise RuntimeError(f"[Step 2] Upload to OSS failed ({resp2.status_code}): {resp2.text[:200]}")

    # Step 3: save the file record
    resp3 = _requests.post(
        save_url,
        headers=platform_headers,
        json={"fileHash": file_hash, "originalName": file_name},
Confidence
90% confidence
Finding
requests.post( save_url, headers=platform_headers, json=

Unvalidated Output Injection

High
Category
Output Handling
Content
output_path,
            ]
        )
        result = subprocess.run(cmd, capture_output=True)
        if result.returncode != 0:
            stderr = result.stderr.decode(errors='replace')
            print(f"❌ Encoding failed: {stderr[-500:]}")
Confidence
82% confidence
Finding
subprocess.run(cmd, capture_output
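
The finding text is truncated, so the exact concern is not stated. One plausible reading is that raw encoder stderr is echoed into the agent's context. Under that assumption, a hypothetical helper could strip control characters before the tail is printed:

```python
import re

# Control characters except tab and newline; this includes ESC (0x1b).
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")


def safe_stderr_tail(raw: bytes, limit: int = 500) -> str:
    """Decode stderr, drop control characters, and keep the last `limit` characters."""
    text = raw.decode(errors="replace")
    return _CONTROL_CHARS.sub("", text)[-limit:]
```

This only neutralizes terminal escape sequences; it does not address prompt-like text in the output, which would need separate handling by the host.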

VirusTotal

64/64 vendors flagged this skill as clean.
