Short Video Parser

Security checks across malware telemetry and agentic risk

Overview

The skill mostly does what it says, but it sends audio extracted from videos to a hardcoded external transcription API and accepts arbitrary caller-supplied URLs with little restriction on which network destinations it contacts.

Install only if you are comfortable sending extracted audio from processed videos to SiliconFlow and allowing the skill to fetch video URLs from supported platforms. Avoid using it on private, confidential, or copyrighted media without authorization, enable auto_cleanup if you do not want local video/audio files retained, and do not paste account cookies into source code.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (11)

Tainted flow: 'files' from open (line 188, file read) → requests.post (network output)

High

Category: Data Flow
Content:
    data = { "model": model }
    response = requests.post(DEFAULT_API_BASE_URL, headers=headers, files=files, data=data, timeout=600)
    response.raise_for_status()
    result = response.json()
Confidence: 93%
Finding: response = requests.post(DEFAULT_API_BASE_URL, headers=headers, files=files, data=data, timeout=600)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 89%
Finding: The skill declares no explicit permissions, yet its documented behavior includes network access, local file reads/writes, and invoking shell-accessible binaries like ffmpeg. This creates a transparency and policy-enforcement gap: users or hosting platforms cannot accurately scope or restrict what the skill can do before execution.

Description-Behavior Mismatch

Medium

Confidence: 96%
Finding: The skill re-exports `load_env`, giving callers a generic environment-file reading capability that is broader than the declared video parsing/transcription purpose. In an agent skill context, this unnecessarily expands the attack surface and can expose secrets from arbitrary `.env` files if another component invokes it with a sensitive path.

Context-Inappropriate Capability

Medium

Confidence: 98%
Finding: `load_env(env_path)` reads any file path supplied by the caller and parses key/value pairs, which is effectively an arbitrary local file read primitive aimed at secret-bearing files. Because `.env` files commonly contain API keys, tokens, and database credentials, this creates a direct credential exposure risk unrelated to the skill's stated user-facing function.
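One possible mitigation for this finding is to restrict the loader to an explicit allowlist of files. The sketch below is illustrative only: `load_env_restricted` and `ALLOWED_ENV_FILES` are hypothetical names, and the KEY=VALUE parsing is an assumption about the file format rather than the skill's actual `load_env` implementation.

```python
from pathlib import Path

# Hypothetical allowlist: only the skill's own .env file may be read.
ALLOWED_ENV_FILES = {Path(".env").resolve()}


def load_env_restricted(env_path, allowed=None):
    """Parse KEY=VALUE pairs, but only from an explicitly allowed file."""
    allowed = ALLOWED_ENV_FILES if allowed is None else allowed
    resolved = Path(env_path).resolve()
    if resolved not in allowed:
        raise PermissionError(f"refusing to read env file outside allowlist: {resolved}")

    values = {}
    for line in resolved.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without a key/value separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values
```

Resolving the path before the check means relative paths and `..` segments are compared in their canonical form, so a caller cannot reach a different file by spelling the path differently.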

Missing User Warnings

Medium

Confidence: 92%
Finding: The README documents uploading video/audio content to an external transcription provider but does not clearly disclose the privacy and data-handling implications to users. This can lead operators to unknowingly transmit potentially sensitive media, metadata, or personal information to a third party, creating compliance and privacy risk.

Missing User Warnings

Medium

Confidence: 89%
Finding: The README documents a workflow that downloads/parses third-party video content and uses an external SiliconFlow transcription service, but it does not clearly disclose that user-supplied URLs, media, metadata, and possibly extracted audio may be transmitted to external endpoints. In a tool explicitly designed for bulk parsing and transcription across many platforms, that omission can lead users to unknowingly process sensitive or copyrighted content through third-party services, creating privacy, compliance, and data-handling risk.

Missing User Warnings

Medium

Confidence: 87%
Finding: The comments explicitly encourage operators to paste live Bilibili account cookies into source code to obtain higher-quality video access, but provide no warning that these values are sensitive session credentials. If reused or committed, those cookies could enable account hijacking or unauthorized access to the operator's Bilibili account.
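A common alternative to pasting cookies into source code is to read them from the process environment at run time. The following is a minimal sketch under stated assumptions: the variable name `BILIBILI_COOKIE`, the helper names, and the placeholder `User-Agent` value are all hypothetical and do not come from the skill.

```python
import os


def get_bilibili_cookie(env=None):
    """Return the cookie string from the environment, or None if unset."""
    env = os.environ if env is None else env
    # BILIBILI_COOKIE is a hypothetical variable name chosen for this sketch.
    cookie = env.get("BILIBILI_COOKIE", "").strip()
    return cookie or None


def build_headers(env=None):
    """Build request headers, attaching the cookie only when one is configured."""
    headers = {"User-Agent": "Mozilla/5.0"}  # placeholder value
    cookie = get_bilibili_cookie(env)
    if cookie:
        headers["Cookie"] = cookie
    return headers
```

With this shape the session credential never appears in a file that could be committed or shared, and the code falls back to anonymous access when the variable is not set.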

Missing User Warnings

Low

Confidence: 87%
Finding: The parser performs an outbound HTTP request to a caller-controlled `share_url` with no validation of domain, scheme, or destination. In a skill designed to ingest arbitrary short-video links, this can enable SSRF-style behavior against internal services or unexpected network access if an attacker supplies crafted URLs instead of legitimate Meipai links.
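The validation this finding says is missing could look roughly like the sketch below, which checks scheme and host before any request is made. The `ALLOWED_HOSTS` set and the function name `is_allowed_share_url` are assumptions for illustration; the report does not list the hosts the parser actually supports.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical allowlist; the real set of supported hosts is not given in the report.
ALLOWED_HOSTS = {"meipai.com", "www.meipai.com"}


def is_allowed_share_url(share_url, allowed_hosts=ALLOWED_HOSTS):
    """Return True only for http(s) URLs whose host is on the allowlist."""
    parsed = urlparse(share_url)
    if parsed.scheme not in ("http", "https"):
        return False

    host = (parsed.hostname or "").lower()
    if not host:
        return False

    # Reject literal IP addresses outright (loopback, link-local, private ranges, ...).
    try:
        ipaddress.ip_address(host)
        return False
    except ValueError:
        pass

    # Exact match, so "meipai.com.evil.example" does not pass.
    return host in allowed_hosts
```

This check alone does not cover redirects or DNS names that resolve to internal addresses; a fuller defense would also validate each redirect target and the resolved IP before connecting.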

Missing User Warnings

Medium

Confidence: 95%
Finding: Audio extracted from videos is sent to an external transcription provider without any explicit warning, consent gate, or privacy notice in the code path. Since this skill processes user-supplied media, the lack of transparent disclosure increases the risk of unauthorized sharing of sensitive content.
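A consent gate of the kind this finding describes can be sketched as a small wrapper around the upload step. Everything here is hypothetical: `confirm_upload`, `transcribe_with_consent`, and the injected `upload` callable stand in for whatever function in the skill actually performs the `requests.post` call.

```python
def confirm_upload(audio_path, endpoint, ask=input):
    """Ask the operator before any audio leaves the machine; default is 'no'."""
    answer = ask(f"Send audio from {audio_path} to {endpoint}? [y/N] ")
    return answer.strip().lower() in ("y", "yes")


def transcribe_with_consent(audio_path, endpoint, upload, ask=input):
    """Call upload(audio_path, endpoint) only after explicit confirmation."""
    if not confirm_upload(audio_path, endpoint, ask):
        return None  # nothing is transmitted without consent
    return upload(audio_path, endpoint)
```

Passing `ask` and `upload` as parameters keeps the gate testable without a terminal or network access; in a non-interactive agent context the prompt could be replaced by a configuration flag that must be set explicitly.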

External Transmission

Medium

Category: Data Exfiltration
Content:
    # Example: parse_api_url=http://ip:8000/video/share/url/parse?url=
    parse_api_url=
    # SiliconFlow ASR API address (optional, default: https://api.siliconflow.cn/v1/audio/transcriptions)
    siliconflow_api_url=https://api.siliconflow.cn/v1/audio/transcriptions
    # Whether to automatically clean up temporary files (optional, default: false)
Confidence: 80%
Finding: https://api.siliconflow.cn/

External Transmission

Medium

Category: Data Exfiltration
Content:
    parse_api_url=
    # SiliconFlow ASR API address (optional, default: https://api.siliconflow.cn/v1/audio/transcriptions)
    siliconflow_api_url=https://api.siliconflow.cn/v1/audio/transcriptions
    # Whether to automatically clean up temporary files (optional, default: false)
    # true: automatically delete temporary files (video and audio)
Confidence: 91%
Finding: https://api.siliconflow.cn/

VirusTotal

64/64 vendors flagged this skill as clean.
