Doubao All-in-One (Video and Image Generation)

Security checks across malware telemetry and agentic risk

Overview

This skill appears to do what it claims: generate Doubao/Volcengine images and videos, save results locally, and optionally manage video tasks/webhooks.

Install only if you are comfortable sending prompts and any supplied images or URLs to Volcengine Ark. Set OUTPUT_ROOT to a directory you control, review or clean outputs/logs if prompts or media are sensitive, and run the webhook server only when you need async callbacks.
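One way to keep generated files inside a directory you control is to resolve every output path against OUTPUT_ROOT and reject anything that escapes it. The sketch below is illustrative only: it assumes OUTPUT_ROOT is read from the environment, and both the `resolve_output_path` helper and the `outputs` fallback are hypothetical names, not part of the skill.

```python
import os
from pathlib import Path


def resolve_output_path(filename: str, default_root: str = "outputs") -> Path:
    """Resolve filename under OUTPUT_ROOT and refuse paths that escape it.

    Assumption: OUTPUT_ROOT is an environment variable; "outputs" is a
    hypothetical fallback, not necessarily the skill's real default.
    """
    root = Path(os.environ.get("OUTPUT_ROOT", default_root)).expanduser().resolve()
    target = (root / filename).resolve()
    # The root must be an ancestor of the target, otherwise the name
    # contained "..", an absolute path, or a symlink pointing elsewhere.
    if root not in target.parents:
        raise ValueError(f"{filename!r} resolves outside {root}")
    return target
```

A caller would set OUTPUT_ROOT before running the skill and pass only bare file names or relative subpaths; names such as `../escape.png` raise `ValueError` instead of writing outside the root.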

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (8)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 91%
Finding: The skill declares no permissions while its documented behavior requires environment access, local file reads/writes, and network operations. This is dangerous because operators and policy systems may not understand the actual capability surface, reducing informed consent and weakening sandboxing or review controls for a skill that can write files, access API keys, and contact external services.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 96%
Finding: The description says the skill generates images or videos and saves results locally, but the instructions also expose broader capabilities: task querying/listing/deletion, downloading from arbitrary URLs, starting a webhook HTTP server, and persisting callback data in SQLite. This mismatch is dangerous because users may invoke a seemingly simple media-generation skill without realizing it can open local services, store task metadata long-term, and fetch remote content beyond the core advertised purpose.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The helper exposes list and delete operations for video-generation tasks even though the stated skill purpose is generating media and saving results locally. In a shared utility module, these extra management capabilities expand the skill's authority and could let downstream code enumerate metadata about prior tasks or cancel/delete tasks without clear user intent, violating least privilege.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The script can enable external web search via the `--web-search` flag, but the skill description only presents it as a local image/video generation tool. This creates a capability mismatch: prompts or related data may be sent to an additional network-backed search tool without the user's informed expectation, increasing privacy, data-governance, and prompt-leakage risk.
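A possible mitigation is to keep web search strictly opt-in and to say so at the point of use. The following is a minimal sketch, not the skill's actual CLI: only the `--web-search` flag name comes from the finding, while `parse_args`, the parser description, and the notice text are assumptions.

```python
import argparse
import sys


def parse_args(argv=None):
    """Parse a hypothetical CLI where web search is off unless requested."""
    parser = argparse.ArgumentParser(description="Hypothetical wrapper CLI")
    parser.add_argument(
        "--web-search",
        action="store_true",  # defaults to False, so search is opt-in
        help="Opt in to web search; prompt data may reach a search backend.",
    )
    args = parser.parse_args(argv)
    if args.web_search:
        # Disclose the extra data flow on stderr so it is visible in any run.
        print("notice: web search enabled; prompt data may leave "
              "the image/video generation endpoint", file=sys.stderr)
    return args
```

With this shape, a run without the flag never enables search, and a run with it leaves a visible trace of the additional data flow.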

Missing User Warnings

Medium

Confidence: 83%
Finding: The skill documentation notes output paths later, but it does not present a clear upfront warning that generated media, callback data, and batch-mode copies will be written to local disk and possibly into project directories. This can lead to unintentional persistence of sensitive or copyrighted content, especially in automated workflows where users may not expect files to be retained or copied.

Missing User Warnings

Medium

Confidence: 91%
Finding: log_params serializes arbitrary keyword arguments and writes them to persistent log files under OUTPUT_ROOT without any filtering or disclosure. If prompts, URLs, local paths, API responses, or other sensitive user-supplied data are passed in, the skill may silently retain private data on disk and increase exposure through log access or later collection.
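Filtering parameters before they reach the log file would reduce this exposure. The sketch below shows one approach under stated assumptions: `redact_params` and the `SENSITIVE_FRAGMENTS` list are hypothetical, and the real signature and call sites of `log_params` are not known from this report.

```python
# Hypothetical key fragments treated as sensitive; adjust to the real
# parameter names used by the skill.
SENSITIVE_FRAGMENTS = ("key", "token", "secret", "prompt", "url", "path")


def redact_params(params: dict, max_len: int = 200) -> dict:
    """Return a copy of params that is safer to write to a log file."""
    safe = {}
    for name, value in params.items():
        if any(fragment in name.lower() for fragment in SENSITIVE_FRAGMENTS):
            # Drop the value entirely but keep the key for debugging.
            safe[name] = "[REDACTED]"
        elif isinstance(value, str) and len(value) > max_len:
            # Long free-form strings are truncated rather than stored whole.
            safe[name] = value[:max_len] + "...[truncated]"
        else:
            safe[name] = value
    return safe
```

For example, `redact_params({"api_key": "abc", "size": "1024x1024"})` keeps `size` and replaces the `api_key` value with `[REDACTED]`. A deny-list like this is a starting point only; an allow-list of known-safe keys would be stricter.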

Missing User Warnings

Medium

Confidence: 92%
Finding: The script accepts image paths or URLs and resolves them before sending content to a third-party image-generation service, but provides no warning, validation, or restriction around remote/network sources. This can lead to unintended transmission of sensitive internal URLs or private images to external services, and may also enable SSRF-like behavior depending on how resolve_image_source is implemented.
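A pre-flight check on remote sources could address the SSRF-like concern. The sketch below is an assumption-laden illustration: `is_public_http_url` is a hypothetical helper, and the report does not show how `resolve_image_source` actually handles URLs.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_public_http_url(url: str) -> bool:
    """Return True only for http(s) URLs whose host resolves to public IPs."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parsed.hostname, parsed.port or 443)
    except socket.gaierror:
        return False
    for info in infos:
        # Strip any IPv6 scope suffix ("%eth0") before parsing the address.
        address = ipaddress.ip_address(info[4][0].split("%")[0])
        # is_global is False for loopback, private, link-local, and
        # other reserved ranges.
        if not address.is_global:
            return False
    return bool(infos)
```

A caller would reject the source when this returns `False`, for instance for `http://127.0.0.1/a.png` or `http://169.254.169.254/latest`. The check resolves the host once, so it does not by itself defend against DNS rebinding between the check and the later fetch.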

Missing User Warnings

Medium

Confidence: 91%
Finding: The webhook token is a bearer secret used to authorize callback submission, and printing it to stdout increases the chance it is exposed via terminal history, shell logging, process supervisors, CI logs, or shared console sessions. Although the app restricts requests to localhost, local disclosure still matters because any local user, container sidecar, or log consumer that sees the token could forge callback requests and tamper with task state.
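Instead of printing the token, the server could write it to an owner-only file and compare presented tokens in constant time. This is a sketch under assumptions: `write_token_file` and `token_matches` are hypothetical helpers, and the report does not describe how the skill generates or checks its token.

```python
import hmac
import os
import secrets


def write_token_file(path: str) -> str:
    """Generate a token and store it in a file readable only by the owner."""
    token = secrets.token_urlsafe(32)
    # Create the file with mode 0600 instead of echoing the token to stdout.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(token)
    # Tighten permissions in case the file already existed with a wider mode.
    os.chmod(path, 0o600)
    return token


def token_matches(expected: str, presented: str) -> bool:
    """Compare tokens in constant time to avoid timing side channels."""
    return hmac.compare_digest(expected.encode(), presented.encode())
```

The operator then reads the token from the file when configuring callbacks, so it never appears in terminal scrollback or CI logs. POSIX permission bits are assumed; on Windows the mode argument has limited effect.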

VirusTotal

67/67 vendors flagged this skill as clean.
