Security audit

Video Dubbing

Security checks across malware telemetry and agentic risk

Overview

This is mostly a coherent video dubbing skill, but it includes an under-documented Bilibili upload script that can use local account credentials to publish videos.

Review before installing. Use it only if you are comfortable sending transcript text and at least one extracted video frame to the configured model APIs. Remove or ignore scripts/upload_bilibili.py unless you intentionally want Bilibili publishing, and do not place Bilibili session credentials at the hard-coded path unless you accept that running that script can post under that account. Use trusted endpoints, scoped API keys, and only reference voices you have permission to use.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Findings (23)

Tainted flow: 'cmd' from os.environ.get (line 512, credential/environment) → subprocess.run (code execution)

Medium
Category
Data Flow
Content
frame_path = str(temp_dir / "subtitle_check.jpg")
    ffmpeg = config['ffmpeg_path']
    cmd = [ffmpeg, "-y", "-ss", "30", "-i", video_path, "-vframes", "1", "-q:v", "2", frame_path]
    result = subprocess.run(cmd, capture_output=True)
    
    if not os.path.exists(frame_path):
        print("  [!] 无法提取视频帧,默认不覆盖")  # "Could not extract a video frame; defaulting to not covering"
Confidence
85% confidence
Finding
result = subprocess.run(cmd, capture_output=True)

Tainted flow: 'cmd' from os.environ.get (line 512, credential/environment) → subprocess.run (code execution)

Medium
Category
Data Flow
Content
def run_ffmpeg(cmd):
    result = subprocess.run(cmd, capture_output=True, text=True, encoding='utf-8', errors='replace')
    return result.returncode == 0
Confidence
83% confidence
Finding
result = subprocess.run(cmd, capture_output=True, text=True, encoding='utf-8', errors='replace')

Tainted flow: 'cmd' from os.environ.get (line 512, credential/environment) → subprocess.run (code execution)

Medium
Category
Data Flow
Content
"-vf", vf, "-map", "0:v", "-map", "1:a",
           "-c:v", codec, "-preset", "default", "-c:a", "aac", output_path]
    
    result = subprocess.run(cmd, capture_output=True, text=True, encoding='utf-8', errors='replace')
    if result.returncode == 0:
        print(f"[OK] {output_path}")
        return True
Confidence
88% confidence
Finding
result = subprocess.run(cmd, capture_output=True, text=True, encoding='utf-8', errors='replace')

Tainted flow: 'cmd_simple' from os.environ.get (line 522, credential/environment) → subprocess.run (code execution)

Medium
Category
Data Flow
Content
"-vf", f"subtitles='{srt_escaped}':force_style='{style_str}'",
                  "-map", "0:v", "-map", "1:a",
                  "-c:v", codec, "-preset", "default", "-c:a", "aac", output_path]
    result2 = subprocess.run(cmd_simple, capture_output=True, text=True, encoding='utf-8', errors='replace')
    return result2.returncode == 0
Confidence
88% confidence
Finding
result2 = subprocess.run(cmd_simple, capture_output=True, text=True, encoding='utf-8', errors='replace')
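
The four findings above share one pattern: a configured ffmpeg value, which the scanner traces back to the environment, becomes the first element of a command passed to subprocess.run. A minimal sketch of one way to narrow that flow is shown below. It is not the skill's code; ALLOWED_NAMES, is_allowed_executable, and resolve_ffmpeg are hypothetical names introduced for illustration, and the check assumes POSIX-style paths.

```python
import os
import shutil

# Hypothetical allowlist of executable names; not part of the audited skill.
ALLOWED_NAMES = {"ffmpeg", "ffmpeg.exe"}


def is_allowed_executable(configured: str) -> bool:
    """Return True only if the configured value names an ffmpeg binary."""
    name = os.path.basename(configured.strip()).lower()
    return name in ALLOWED_NAMES


def resolve_ffmpeg(configured: str) -> str:
    """Resolve the configured value to an absolute path, or raise."""
    if not is_allowed_executable(configured):
        raise ValueError(f"refusing to run unexpected executable: {configured!r}")
    resolved = shutil.which(configured)
    if resolved is None:
        raise FileNotFoundError(f"ffmpeg not found: {configured!r}")
    return resolved
```

A caller could build the command as [resolve_ffmpeg(config['ffmpeg_path']), "-y", ...] so that an unexpected executable is rejected before anything runs. Because the commands in these findings are passed as lists rather than shell strings, the remaining arguments are not interpreted by a shell.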

Tainted flow: 'vision_url' from os.environ.get (line 122, credential/environment) → requests.post (network output)

Critical
Category
Data Flow
Content
vision_model = config.get('vision', {}).get('model', 'Qwen/Qwen2.5-VL-72B-Instruct')
    
    try:
        resp = requests.post(
            vision_url,
            headers={"Authorization": f"Bearer {translate_key}", "Content-Type": "application/json"},
            json={
Confidence
98% confidence
Finding
resp = requests.post( vision_url, headers={"Authorization": f"Bearer {translate_key}", "Content-Type": "application/json"}, json={ "model":

Tainted flow: 'api_url' from os.environ.get (line 214, credential/environment) → requests.post (network output)

Critical
Category
Data Flow
Content
}
            
            try:
                resp = requests.post(api_url, headers=headers, json=data, timeout=60)
                result = resp.json()
                
                if 'choices' in result:
Confidence
98% confidence
Finding
resp = requests.post(api_url, headers=headers, json=data, timeout=60)

Tainted flow: 'api_url' from os.environ.get (line 214, credential/environment) → requests.post (network output)

Critical
Category
Data Flow
Content
for text in batch:
                for _ in range(3):
                    try:
                        resp = requests.post(
                            api_url,
                            headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
                            json={
Confidence
98% confidence
Finding
resp = requests.post( api_url, headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
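
The three network findings above flag endpoint URLs that come from the environment and are then used as the destination of requests.post along with a bearer token. One possible guard, sketched below under stated assumptions, is to check the URL against an allowlist before sending credentials. TRUSTED_HOSTS and is_trusted_endpoint are hypothetical names; the only host listed is the default endpoint that appears in the skill's own configuration.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; the host is the default endpoint from the skill's config.
TRUSTED_HOSTS = frozenset({"api.siliconflow.cn"})


def is_trusted_endpoint(url: str, trusted_hosts=TRUSTED_HOSTS) -> bool:
    """Accept only HTTPS URLs whose exact hostname is on the allowlist."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in trusted_hosts
```

The comparison uses the exact parsed hostname rather than a substring match, so a lookalike such as api.siliconflow.cn.evil.example is rejected.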

Lp3

Medium
Category
MCP Least Privilege
Confidence
92% confidence
Finding
The skill declares no permissions, yet its documentation clearly indicates capabilities involving environment variables, file I/O, network access, and shell execution. This is dangerous because operators and users cannot accurately assess the trust boundary or consent to the actual access the skill requires, increasing the chance of unintended data exposure or command execution.

Tp4

High
Category
MCP Tool Poisoning
Confidence
95% confidence
Finding
The skill is presented as a dubbing/localization tool, but the documented file structure includes a Bilibili upload script and the static analysis indicates handling of local platform credentials and third-party publishing. Hidden or under-disclosed publishing behavior is dangerous because a user may authorize media processing without realizing the skill can access account credentials and publish content externally.

Description-Behavior Mismatch

Medium
Confidence
88% confidence
Finding
Including an upload_bilibili.py script in a skill marketed as a dubbing tool expands the operational scope beyond what users are told to expect. Even without proof of automatic execution, bundling account-facing upload functionality increases the risk of accidental or unauthorized publication if invoked by the agent or a user who did not understand the full scope.

Description-Behavior Mismatch

Low
Confidence
82% confidence
Finding
The documentation claims local operation and data safety, but other sections explicitly send content to external translation and vision APIs. Misrepresenting remote processing is dangerous because users may expose video frames, audio, transcripts, or subtitles to third parties without informed consent, even if the transfer itself is part of normal functionality.

Intent-Code Divergence

Low
Confidence
84% confidence
Finding
The skill states it runs locally with safe data handling while also documenting remote API calls, creating a misleading security posture. This can cause users to trust the tool with sensitive media under false assumptions about confidentiality and processing location.

Context-Inappropriate Capability

Medium
Confidence
95% confidence
Finding
The skill exports a video frame to an external vision API for subtitle detection, but that network capability is not evident from the declared skill purpose. Hidden or under-disclosed data egress is dangerous in agent ecosystems because users may expect local media processing while their content is actually uploaded off-box.

Intent-Code Divergence

Low
Confidence
84% confidence
Finding
The configuration comments imply endpoint and credential flexibility, but the implementation reuses the translation key for vision requests and lacks a distinct vision credential path. This increases accidental over-sharing of privileges and makes credential scope broader than necessary.
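
To illustrate what a distinct vision credential path might look like, the sketch below reads a separate key for vision requests and refuses to fall back to the translation key. The vision.api_key field is an assumption made for this example; the configuration excerpts in this report show no such field, and resolve_vision_key is a hypothetical helper.

```python
def resolve_vision_key(config: dict) -> str:
    """Return the vision-specific API key, never the translation key."""
    # "api_key" under "vision" is a hypothetical field, not one the skill defines.
    key = config.get("vision", {}).get("api_key")
    if not key:
        raise KeyError("vision.api_key is not set; refusing to reuse translate.api_key")
    return key
```

Failing loudly instead of reusing translate.api_key keeps each credential scoped to the service it was issued for.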

Context-Inappropriate Capability

High
Confidence
97% confidence
Finding
The script loads persistent Bilibili account credentials from a local file and uses them to upload content to a third-party platform, which exceeds the stated dubbing/translation purpose. In an agent skill context, this creates unauthorized external-action capability and could publish user-derived or attacker-chosen content without clear user consent or account scoping.
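
One way to keep a bundled upload capability from running without a deliberate decision is an explicit opt-in check, sketched below. This is an illustration only, not something the skill ships: the ALLOW_BILIBILI_UPLOAD variable, upload_allowed, and guarded_upload are all hypothetical names.

```python
import os
from typing import Callable, Mapping

# Hypothetical opt-in variable; the audited skill does not define it.
OPT_IN_VAR = "ALLOW_BILIBILI_UPLOAD"


def upload_allowed(env: Mapping[str, str] = os.environ) -> bool:
    """Publishing is allowed only when the opt-in variable is exactly "1"."""
    return env.get(OPT_IN_VAR, "") == "1"


def guarded_upload(do_upload: Callable[[], object],
                   env: Mapping[str, str] = os.environ) -> object:
    """Run the upload callable only after the opt-in check passes."""
    if not upload_allowed(env):
        raise PermissionError(f"set {OPT_IN_VAR}=1 to enable publishing")
    return do_upload()
```

With a guard like this, placing session credentials on disk would not by itself be enough for the script to post under the account.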

Intent-Code Divergence

Medium
Confidence
90% confidence
Finding
The file is explicitly a Bilibili upload script, which materially conflicts with the skill's declared purpose as a Chinese dubbing tool. In a skill package, this mismatch is dangerous because it conceals outbound publishing behavior that users and reviewers may not expect, increasing the risk of covert account actions or unwanted content distribution.

Missing User Warnings

Medium
Confidence
90% confidence
Finding
The documentation does not clearly warn users that content may be sent to external APIs during processing. For a media-processing skill, this is significant because uploaded video/audio-derived data can contain personal, proprietary, or copyrighted material, and users need notice before such transmission occurs.

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The README instructs users to supply a reference voice recording and matching transcript for TTS voice cloning, but it does not warn about consent, impersonation, or privacy risks. In the context of a Chinese dubbing skill built around reference-audio-driven synthesis, this omission can facilitate unauthorized use of another person's voice or exposure of sensitive biometric data.

Missing User Warnings

Medium
Confidence
96% confidence
Finding
The code uploads video frames and transcribed content to remote AI services without any visible user warning, consent gate, or disclosure. For media-processing skills, that omission materially increases privacy risk because the uploaded content may contain sensitive personal, corporate, or copyrighted information.
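
A consent gate of the kind these warnings call for can be quite small. The sketch below asks before any frame or transcript leaves the machine and treats anything other than an explicit yes as a refusal. confirm_remote_send and its parameters are hypothetical; the ask parameter exists so the prompt can be replaced in non-interactive runs or tests.

```python
from typing import Callable


def confirm_remote_send(what: str, endpoint: str,
                        ask: Callable[[str], str] = input) -> bool:
    """Return True only if the user explicitly agrees to the transmission."""
    prompt = f"Send {what} to {endpoint}? [y/N] "
    return ask(prompt).strip().lower() in {"y", "yes"}
```

For example, a pipeline could call confirm_remote_send("one extracted video frame", vision_url) and skip the hard-subtitle check when it returns False.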

External Transmission

Medium
Category
Data Exfiltration
Content
endpoints:
  - name: TRANSLATE_API_URL
    description: 翻译API端点  # Translation API endpoint
    default: https://api.siliconflow.cn/v1/chat/completions
  - name: VISION_API_URL
    description: 硬字幕检测API端点(支持Vision模型)  # Hard-subtitle detection API endpoint (supports vision models)
    default: https://api.siliconflow.cn/v1/chat/completions
Confidence
87% confidence
Finding
https://api.siliconflow.cn/

External Transmission

Medium
Category
Data Exfiltration
Content
default: https://api.siliconflow.cn/v1/chat/completions
  - name: VISION_API_URL
    description: 硬字幕检测API端点(支持Vision模型)  # Hard-subtitle detection API endpoint (supports vision models)
    default: https://api.siliconflow.cn/v1/chat/completions
paths:
  - name: WORK_DIR
    description: 工作目录  # Working directory
Confidence
87% confidence
Finding
https://api.siliconflow.cn/

External Transmission

Medium
Category
Data Exfiltration
Content
"voxcpm_dir": "./VoxCPM",
  "ffmpeg_path": "ffmpeg",
  "translate": {
    "api_url": "https://api.siliconflow.cn/v1/chat/completions",
    "api_key": "YOUR_API_KEY",
    "model": "tencent/Hunyuan-MT-7B"
  },
Confidence
86% confidence
Finding
https://api.siliconflow.cn/

External Transmission

Medium
Category
Data Exfiltration
Content
"model": "tencent/Hunyuan-MT-7B"
  },
  "vision": {
    "api_url": "https://api.siliconflow.cn/v1/chat/completions",
    "model": "Qwen/Qwen2.5-VL-72B-Instruct"
  },
  "tts": {
Confidence
86% confidence
Finding
https://api.siliconflow.cn/

VirusTotal

62/62 vendors flagged this skill as clean.

Static analysis

No suspicious patterns detected.