Security audit

Video Dubbing

Security checks across malware telemetry and agentic risk

Overview

This is mostly a coherent video dubbing skill, but it includes an under-documented Bilibili upload script that can use local account credentials to publish videos.

Review before installing. Use it only if you are comfortable sending transcript text and at least one extracted video frame to the configured model APIs. Remove or ignore scripts/upload_bilibili.py unless you intentionally want Bilibili publishing, and do not place Bilibili session credentials at the hard-coded path unless you accept that running that script can post under that account. Use trusted endpoints and scoped API keys, and use only reference voices that you have permission to use.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (23)

Tainted flow: 'cmd' from os.environ.get (line 512, credential/environment) → subprocess.run (code execution)

Medium

Category: Data Flow
Content:
    frame_path = str(temp_dir / "subtitle_check.jpg")
    ffmpeg = config['ffmpeg_path']
    cmd = [ffmpeg, "-y", "-ss", "30", "-i", video_path, "-vframes", "1", "-q:v", "2", frame_path]
    result = subprocess.run(cmd, capture_output=True)
    if not os.path.exists(frame_path):
        print(" [!] 无法提取视频帧，默认不覆盖")  # "Unable to extract a video frame; defaulting to no cover"
Confidence: 85%
Finding: result = subprocess.run(cmd, capture_output=True)

Tainted flow: 'cmd' from os.environ.get (line 512, credential/environment) → subprocess.run (code execution)

Medium

Category: Data Flow
Content:
    def run_ffmpeg(cmd):
        result = subprocess.run(cmd, capture_output=True, text=True, encoding='utf-8', errors='replace')
        return result.returncode == 0
Confidence: 83%
Finding: result = subprocess.run(cmd, capture_output=True, text=True, encoding='utf-8', errors='replace')

Tainted flow: 'cmd' from os.environ.get (line 512, credential/environment) → subprocess.run (code execution)

Medium

Category: Data Flow
Content:
    "-vf", vf, "-map", "0:v", "-map", "1:a", "-c:v", codec, "-preset", "default", "-c:a", "aac", output_path]
    result = subprocess.run(cmd, capture_output=True, text=True, encoding='utf-8', errors='replace')
    if result.returncode == 0:
        print(f"[OK] {output_path}")
        return True
Confidence: 88%
Finding: result = subprocess.run(cmd, capture_output=True, text=True, encoding='utf-8', errors='replace')

Tainted flow: 'cmd_simple' from os.environ.get (line 522, credential/environment) → subprocess.run (code execution)

Medium

Category: Data Flow
Content:
    "-vf", f"subtitles='{srt_escaped}':force_style='{style_str}'", "-map", "0:v", "-map", "1:a", "-c:v", codec, "-preset", "default", "-c:a", "aac", output_path]
    result2 = subprocess.run(cmd_simple, capture_output=True, text=True, encoding='utf-8', errors='replace')
    return result2.returncode == 0
Confidence: 88%
Finding: result2 = subprocess.run(cmd_simple, capture_output=True, text=True, encoding='utf-8', errors='replace')
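
The four subprocess findings above share one root cause: the executable at the front of the argument list comes from configuration or the environment and is run without validation. The following is a minimal sketch of one possible mitigation, not the skill's actual code. The function name resolve_ffmpeg and the ALLOWED_FFMPEG_NAMES allow-list are hypothetical; only shutil.which and os.path calls from the Python standard library are used.

```python
import os
import shutil

# Hypothetical allow-list of acceptable executable names (assumption, not from the skill).
ALLOWED_FFMPEG_NAMES = {"ffmpeg", "ffmpeg.exe"}


def resolve_ffmpeg(configured: str) -> str:
    """Return an absolute path to ffmpeg, or raise if the configured value looks wrong."""
    # shutil.which returns None when nothing executable matches the name or path.
    resolved = shutil.which(configured)
    if resolved is None:
        raise FileNotFoundError(f"ffmpeg not found: {configured!r}")
    # Reject binaries whose file name is not on the allow-list.
    if os.path.basename(resolved).lower() not in ALLOWED_FFMPEG_NAMES:
        raise ValueError(f"refusing to run unexpected binary: {resolved}")
    return os.path.abspath(resolved)
```

A caller would resolve the path once at startup and build every argument list from the returned value. Passing the command as a list, as the quoted code already does, avoids shell interpretation of the remaining arguments.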

Tainted flow: 'vision_url' from os.environ.get (line 122, credential/environment) → requests.post (network output)

Critical

Category: Data Flow
Content:
    vision_model = config.get('vision', {}).get('model', 'Qwen/Qwen2.5-VL-72B-Instruct')
    try:
        resp = requests.post(
            vision_url,
            headers={"Authorization": f"Bearer {translate_key}", "Content-Type": "application/json"},
            json={
Confidence: 98%
Finding: resp = requests.post( vision_url, headers={"Authorization": f"Bearer {translate_key}", "Content-Type": "application/json"}, json={ "model":

Tainted flow: 'api_url' from os.environ.get (line 214, credential/environment) → requests.post (network output)

Critical

Category: Data Flow
Content:
    }
    try:
        resp = requests.post(api_url, headers=headers, json=data, timeout=60)
        result = resp.json()
        if 'choices' in result:
Confidence: 98%
Finding: resp = requests.post(api_url, headers=headers, json=data, timeout=60)

Tainted flow: 'api_url' from os.environ.get (line 214, credential/environment) → requests.post (network output)

Critical

Category: Data Flow
Content:
    for text in batch:
        for _ in range(3):
            try:
                resp = requests.post(
                    api_url,
                    headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
                    json={
Confidence: 98%
Finding: resp = requests.post( api_url, headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
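
The three network findings above flag an endpoint URL taken from the environment and then sent a bearer token. One way to narrow that risk is to check the URL against an allow-list before any credential is attached. The sketch below is illustrative only: check_endpoint and TRUSTED_HOSTS are hypothetical names, and the single host listed is the default endpoint quoted elsewhere in this report.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list (assumption); api.siliconflow.cn is the default host
# shown in the configuration excerpts of this report.
TRUSTED_HOSTS = frozenset({"api.siliconflow.cn"})


def check_endpoint(url: str, trusted_hosts=TRUSTED_HOSTS) -> str:
    """Return the URL unchanged if it is HTTPS and its host is trusted, else raise."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("endpoint must use https")
    # .hostname excludes any userinfo, so "https://trusted@evil.example/" is rejected.
    if parts.hostname not in trusted_hosts:
        raise ValueError(f"untrusted endpoint host: {parts.hostname!r}")
    return url
```

A caller would wrap the environment-derived value, for example check_endpoint(api_url), immediately before the requests.post call so that the key is never sent to an unexpected host.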

Lp3

Medium

Category: MCP Least Privilege
Confidence: 92%
Finding: The skill declares no permissions, yet its documentation clearly indicates capabilities involving environment variables, file I/O, network access, and shell execution. This is dangerous because operators and users cannot accurately assess the trust boundary or consent to the actual access the skill requires, increasing the chance of unintended data exposure or command execution.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: The skill is presented as a dubbing/localization tool, but the documented file structure includes a Bilibili upload script and the static analysis indicates handling of local platform credentials and third-party publishing. Hidden or under-disclosed publishing behavior is dangerous because a user may authorize media processing without realizing the skill can access account credentials and publish content externally.

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: Including an upload_bilibili.py script in a skill marketed as a dubbing tool expands the operational scope beyond what users are told to expect. Even without proof of automatic execution, bundling account-facing upload functionality increases the risk of accidental or unauthorized publication if invoked by the agent or a user who did not understand the full scope.

Description-Behavior Mismatch

Low

Confidence: 82%
Finding: The documentation claims local operation and data safety, but other sections explicitly send content to external translation and vision APIs. Misrepresenting remote processing is dangerous because users may expose video frames, audio, transcripts, or subtitles to third parties without informed consent, even if the transfer itself is part of normal functionality.

Intent-Code Divergence

Low

Confidence: 84%
Finding: The skill states it runs locally with safe data handling while also documenting remote API calls, creating a misleading security posture. This can cause users to trust the tool with sensitive media under false assumptions about confidentiality and processing location.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The skill exports a video frame to an external vision API for subtitle detection, but that network capability is not evident from the declared skill purpose. Hidden or under-disclosed data egress is dangerous in agent ecosystems because users may expect local media processing while their content is actually uploaded off-box.

Intent-Code Divergence

Low

Confidence: 84%
Finding: The configuration comments imply endpoint and credential flexibility, but the implementation reuses the translation key for vision requests and lacks a distinct vision credential path. This increases accidental over-sharing of privileges and makes credential scope broader than necessary.
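
A distinct credential path for vision requests could look like the sketch below. It is an assumption-laden illustration rather than the skill's implementation: the environment variable names TRANSLATE_API_KEY and VISION_API_KEY and the function load_api_keys are hypothetical, since the report does not show which variables the skill reads for keys.

```python
import os

# Hypothetical variable names (assumption); the report does not show the real ones.
TRANSLATE_KEY_VAR = "TRANSLATE_API_KEY"
VISION_KEY_VAR = "VISION_API_KEY"


def load_api_keys(env=None):
    """Return separate keys per service, failing closed instead of reusing a key."""
    env = os.environ if env is None else env
    keys = {}
    for purpose, var in (("translate", TRANSLATE_KEY_VAR), ("vision", VISION_KEY_VAR)):
        value = env.get(var, "").strip()
        if not value:
            raise KeyError(f"{var} is not set; refusing to reuse another service's key")
        keys[purpose] = value
    return keys
```

Failing when the vision key is absent, rather than falling back to the translation key, keeps each key scoped to the one service it was issued for.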

Context-Inappropriate Capability

High

Confidence: 97%
Finding: The script loads persistent Bilibili account credentials from a local file and uses them to upload content to a third-party platform, which exceeds the stated dubbing/translation purpose. In an agent skill context, this creates unauthorized external-action capability and could publish user-derived or attacker-chosen content without clear user consent or account scoping.
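
One way to address the missing consent step is an explicit confirmation gate that must pass before any upload call runs. The sketch below is hypothetical and does not reflect the contents of upload_bilibili.py: confirm_publish and its parameters are invented for illustration, and the ask parameter defaults to the built-in input so that it can be replaced in automated tests.

```python
def confirm_publish(account: str, title: str, ask=input) -> bool:
    """Return True only if the operator retypes the target account name exactly."""
    if not account:
        # Never treat an empty reply as confirmation for an unknown account.
        return False
    prompt = (
        f"About to publish {title!r} to Bilibili as {account!r}.\n"
        "Type the account name to continue: "
    )
    return ask(prompt).strip() == account
```

Requiring the account name to be retyped, instead of a yes/no answer, makes it harder for an agent or a hurried user to approve publication under the wrong account by accident.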

Intent-Code Divergence

Medium

Confidence: 90%
Finding: The file is explicitly a Bilibili upload script, which materially conflicts with the skill's declared purpose as a Chinese dubbing tool. In a skill package, this mismatch is dangerous because it conceals outbound publishing behavior that users and reviewers may not expect, increasing the risk of covert account actions or unwanted content distribution.

Missing User Warnings

Medium

Confidence: 90%
Finding: The documentation does not clearly warn users that content may be sent to external APIs during processing. For a media-processing skill, this is significant because uploaded video/audio-derived data can contain personal, proprietary, or copyrighted material, and users need notice before such transmission occurs.

Missing User Warnings

Medium

Confidence: 91%
Finding: The README instructs users to supply a reference voice recording and matching transcript for TTS voice cloning, but it does not warn about consent, impersonation, or privacy risks. In the context of a Chinese dubbing skill built around reference-audio-driven synthesis, this omission can facilitate unauthorized use of another person's voice or exposure of sensitive biometric data.

Missing User Warnings

Medium

Confidence: 96%
Finding: The code uploads video frames and transcribed content to remote AI services without any visible user warning, consent gate, or disclosure. For media-processing skills, that omission materially increases privacy risk because the uploaded content may contain sensitive personal, corporate, or copyrighted information.
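
A disclosure and opt-in step for remote processing might be sketched as follows. Everything here is an assumption for illustration: EgressPlan, describe, require_consent, and the allow_remote flag are hypothetical names, not part of the audited skill. The idea is to state what would leave the machine and where it would go, and to refuse unless the operator has opted in.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class EgressPlan:
    """What a run intends to send off-box (hypothetical structure)."""
    endpoint: str
    frame_count: int
    transcript_chars: int


def describe(plan: EgressPlan) -> str:
    """Summarize the planned transmission in one human-readable sentence."""
    host = urlsplit(plan.endpoint).hostname
    return (
        f"{plan.frame_count} video frame(s) and {plan.transcript_chars} "
        f"characters of transcript would be sent to {host}"
    )


def require_consent(plan: EgressPlan, allow_remote: bool) -> str:
    """Raise unless remote processing was explicitly enabled; return the summary."""
    summary = describe(plan)
    if not allow_remote:
        raise PermissionError(f"remote processing not enabled: {summary}")
    return summary
```

The allow_remote flag would come from an explicit configuration setting that defaults to off, so that a first run surfaces the disclosure instead of silently uploading content.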

External Transmission

Medium

Category: Data Exfiltration
Content:
    endpoints:
      - name: TRANSLATE_API_URL
        description: 翻译API端点  # Translation API endpoint
        default: https://api.siliconflow.cn/v1/chat/completions
      - name: VISION_API_URL
        description: 硬字幕检测API端点（支持Vision模型）  # Hard-subtitle detection API endpoint (supports Vision models)
        default: https://api.siliconflow.cn/v1/chat/completions
Confidence: 87%
Finding: https://api.siliconflow.cn/

External Transmission

Medium

Category: Data Exfiltration
Content:
        default: https://api.siliconflow.cn/v1/chat/completions
      - name: VISION_API_URL
        description: 硬字幕检测API端点（支持Vision模型）  # Hard-subtitle detection API endpoint (supports Vision models)
        default: https://api.siliconflow.cn/v1/chat/completions
    paths:
      - name: WORK_DIR
        description: 工作目录  # Working directory
Confidence: 87%
Finding: https://api.siliconflow.cn/

External Transmission

Medium

Category: Data Exfiltration
Content:
    "voxcpm_dir": "./VoxCPM",
    "ffmpeg_path": "ffmpeg",
    "translate": {
        "api_url": "https://api.siliconflow.cn/v1/chat/completions",
        "api_key": "YOUR_API_KEY",
        "model": "tencent/Hunyuan-MT-7B"
    },
Confidence: 86%
Finding: https://api.siliconflow.cn/

External Transmission

Medium

Category: Data Exfiltration
Content:
        "model": "tencent/Hunyuan-MT-7B"
    },
    "vision": {
        "api_url": "https://api.siliconflow.cn/v1/chat/completions",
        "model": "Qwen/Qwen2.5-VL-72B-Instruct"
    },
    "tts": {
Confidence: 86%
Finding: https://api.siliconflow.cn/

VirusTotal

62/62 vendors flagged this skill as clean.


Static analysis

No suspicious patterns detected.