Short Video Parser

Security checks across malware telemetry and agentic risk

Overview

The skill mostly does what it says, but it sends video-derived audio to a hardcoded external transcription API and accepts broadly supplied URLs with limited network scoping.

Install only if you are comfortable sending extracted audio from processed videos to SiliconFlow and allowing the skill to fetch video URLs from supported platforms. Avoid using it on private, confidential, or copyrighted media without authorization, enable auto_cleanup if you do not want local video/audio files retained, and do not paste account cookies into source code.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Findings (11)

Tainted flow: 'files' from open (line 188, file read) → requests.post (network output)

High
Category
Data Flow
Content
data = {
    "model": model
}
response = requests.post(DEFAULT_API_BASE_URL, headers=headers, files=files, data=data, timeout=600)
response.raise_for_status()

result = response.json()
Confidence
93% confidence
Finding
response = requests.post(DEFAULT_API_BASE_URL, headers=headers, files=files, data=data, timeout=600)

Lp3

Medium
Category
MCP Least Privilege
Confidence
89% confidence
Finding
The skill declares no explicit permissions, yet its documented behavior includes network access, local file reads/writes, and invoking shell-accessible binaries like ffmpeg. This creates a transparency and policy-enforcement gap: users or hosting platforms cannot accurately scope or restrict what the skill can do before execution.

Description-Behavior Mismatch

Medium
Confidence
96% confidence
Finding
The skill re-exports `load_env`, giving callers a generic environment-file reading capability that is broader than the declared video parsing/transcription purpose. In an agent skill context, this unnecessarily expands the attack surface and can expose secrets from arbitrary `.env` files if another component invokes it with a sensitive path.

Context-Inappropriate Capability

Medium
Confidence
98% confidence
Finding
`load_env(env_path)` reads any file path supplied by the caller and parses key/value pairs, which is effectively an arbitrary local file read primitive aimed at secret-bearing files. Because `.env` files commonly contain API keys, tokens, and database credentials, this creates a direct credential exposure risk unrelated to the skill's stated user-facing function.
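
One way to narrow this capability is to accept only a file named `.env` inside a single directory the skill owns. The sketch below is an assumption about how that could look, not the skill's actual code: `load_env_scoped` and `allowed_dir` are hypothetical names, and the KEY=VALUE parsing is a simplified stand-in for whatever `load_env` really does.

```python
from pathlib import Path


def load_env_scoped(env_path, allowed_dir):
    """Parse KEY=VALUE pairs, but only from a file named .env directly inside allowed_dir."""
    base = Path(allowed_dir).resolve()
    target = Path(env_path).resolve()  # resolving collapses ../ segments and symlinks
    if target.name != ".env" or target.parent != base:
        raise PermissionError(f"refusing to read {target}: not the .env file in {base}")

    values = {}
    for raw in target.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without a key/value separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values
```

With this shape, a caller that passes a path to some other project's `.env` or to an unrelated file gets a `PermissionError` instead of parsed secrets.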

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The README documents uploading video/audio content to an external transcription provider but does not clearly disclose the privacy and data-handling implications to users. This can lead operators to unknowingly transmit potentially sensitive media, metadata, or personal information to a third party, creating compliance and privacy risk.

Missing User Warnings

Medium
Confidence
89% confidence
Finding
The README documents a workflow that downloads/parses third-party video content and uses an external SiliconFlow transcription service, but it does not clearly disclose that user-supplied URLs, media, metadata, and possibly extracted audio may be transmitted to external endpoints. In a tool explicitly designed for bulk parsing and transcription across many platforms, that omission can lead users to unknowingly process sensitive or copyrighted content through third-party services, creating privacy, compliance, and data-handling risk.

Missing User Warnings

Medium
Confidence
87% confidence
Finding
The comments explicitly encourage operators to paste live Bilibili account cookies into source code to obtain higher-quality video access, but provide no warning that these values are sensitive session credentials. If reused or committed, those cookies could enable account hijacking or unauthorized access to the operator's Bilibili account.
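
A common alternative to pasting cookies into source is to read them from the environment at runtime. The following is a minimal sketch under stated assumptions: the variable name `BILIBILI_COOKIE` and both helper functions are hypothetical and do not come from the skill.

```python
import os


def get_bilibili_cookie(env_var="BILIBILI_COOKIE"):
    """Return the cookie string from the environment, or None if it is unset or empty."""
    cookie = os.environ.get(env_var, "").strip()
    return cookie or None


def build_headers(cookie=None):
    """Build request headers, attaching the Cookie header only when a cookie is provided."""
    headers = {"User-Agent": "Mozilla/5.0"}
    if cookie:
        headers["Cookie"] = cookie
    return headers
```

Keeping the value out of the file means it cannot be committed by accident along with the parser code.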

Missing User Warnings

Low
Confidence
87% confidence
Finding
The parser performs an outbound HTTP request to a caller-controlled `share_url` with no validation of domain, scheme, or destination. In a skill designed to ingest arbitrary short-video links, this can enable SSRF-style behavior against internal services or unexpected network access if an attacker supplies crafted URLs instead of legitimate Meipai links.
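
A scheme and host allowlist applied before the request is the usual first mitigation for this class of issue. The sketch below assumes a hypothetical `ALLOWED_HOSTS` set and helper name; the Meipai hostnames are illustrative guesses, not values taken from the skill. It does not cover redirects or DNS rebinding, which would need separate handling.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; the real set of hosts depends on the platform being parsed.
ALLOWED_HOSTS = {"meipai.com", "www.meipai.com"}


def is_allowed_share_url(share_url):
    """Accept only http(s) URLs whose hostname is exactly one of the allowlisted hosts."""
    parsed = urlparse(share_url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    return host in ALLOWED_HOSTS
```

Exact hostname matching rejects lookalikes such as `meipai.com.evil.example` as well as literal IP addresses and non-HTTP schemes.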

Missing User Warnings

Medium
Confidence
95% confidence
Finding
Audio extracted from videos is sent to an external transcription provider without any explicit warning, consent gate, or privacy notice in the code path. Since this skill processes user-supplied media, the lack of transparent disclosure increases the risk of unauthorized sharing of sensitive content.
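
A consent gate in the code path could be as small as a confirmation step that runs before the upload call. This is a sketch only: `confirm_upload`, its `allow_upload` flag, and the injectable `ask` callable are hypothetical names, and how the skill would wire them in is an assumption.

```python
def confirm_upload(audio_path, endpoint, allow_upload=False, ask=input):
    """Return True only if uploads were pre-approved or the operator answers yes."""
    if allow_upload:
        # Explicit opt-in, e.g. from a config value set by the operator.
        return True
    answer = ask(f"Send {audio_path} to {endpoint} for transcription? [y/N] ")
    return answer.strip().lower() in ("y", "yes")
```

The transcription step would then call this before posting the audio and skip the upload when it returns `False`; defaulting to "no" on an empty answer keeps unattended runs from sending data silently.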

External Transmission

Medium
Category
Data Exfiltration
Content
# Example: parse_api_url=http://ip:8000/video/share/url/parse?url=
parse_api_url=

# SiliconFlow ASR API URL (optional, default: https://api.siliconflow.cn/v1/audio/transcriptions)
siliconflow_api_url=https://api.siliconflow.cn/v1/audio/transcriptions

# Whether to automatically clean up temporary files (optional, default: false)
Confidence
80% confidence
Finding
https://api.siliconflow.cn/

External Transmission

Medium
Category
Data Exfiltration
Content
parse_api_url=

# SiliconFlow ASR API URL (optional, default: https://api.siliconflow.cn/v1/audio/transcriptions)
siliconflow_api_url=https://api.siliconflow.cn/v1/audio/transcriptions

# Whether to automatically clean up temporary files (optional, default: false)
# true: automatically delete temporary files (video and audio)
Confidence
91% confidence
Finding
https://api.siliconflow.cn/

VirusTotal

64/64 vendors flagged this skill as clean.