douyin-analyse-batch

Security checks across malware telemetry and agentic risk

Overview

This skill should be reviewed because it sets up recurring email automation but also contains hard-coded recipients and bundled download, proxy, Telegram, MCP, and transcription features that are broader than the advertised daily report.

Install only after reviewing and changing all recipient settings, cron entries, SMTP credentials, and API keys. Treat the bundled downloader, transcription, MCP server, WebUI proxy, and Telegram artifacts as additional capabilities, not just dependencies for a simple daily email report.
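
One way to start that review is to search the skill's files for hard-coded recipients before installing. The following is a minimal sketch, not part of the scanner output: the function names and both regular expressions are assumptions made for illustration, and they will miss recipients that are encoded, split across lines, or assembled at runtime.

```python
import re
from pathlib import Path

# Illustrative patterns (assumptions): plain email addresses and numeric Telegram chat IDs.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CHAT_ID_RE = re.compile(r"chat_id[\s\"':=]{0,4}(-?\d{5,})")


def find_hardcoded_recipients(text):
    """Return (kind, value) pairs for email addresses and chat IDs found in text."""
    hits = [("email", m.group(0)) for m in EMAIL_RE.finditer(text)]
    hits += [("chat_id", m.group(1)) for m in CHAT_ID_RE.finditer(text)]
    return hits


def scan_tree(root):
    """Yield (path, kind, value) for every hit in UTF-8 text files under root."""
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binaries and unreadable files
        for kind, value in find_hardcoded_recipients(text):
            yield path, kind, value
```

Calling scan_tree on the unpacked skill directory lists candidate recipients to replace; every hit still needs manual confirmation.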

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Behavioral AST: exec() Call, eval() Call, Dynamic Import
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (65)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: docx_path = None try: # Run md_to_docx with the venv Python (requires python-docx) p = subprocess.run( [VENV_PY, str(SCRIPT_DIR / 'helpers' / 'md_to_docx.py'), str(file_path)], capture_output=True, text=True, timeout=60 )
Confidence: 88%
Finding: p = subprocess.run( [VENV_PY, str(SCRIPT_DIR / 'helpers' / 'md_to_docx.py'), str(file_path)], capture_output=True, text=True, timeout=60 )

Tainted flow: 'video_info' from os.getenv (line 359, credential/environment) → requests.get (network output)

Critical

Category: Data Flow
Content: if show_progress: print(f"正在下载视频: {video_info['title']}") response = requests.get(video_info['url'], headers=HEADERS, stream=True) response.raise_for_status() # Get the file size
Confidence: 88%
Finding: response = requests.get(video_info['url'], headers=HEADERS, stream=True)
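
A common mitigation for this kind of flow is to check the URL against a host allowlist before calling requests.get. The sketch below is illustrative only: is_allowed_url and ALLOWED_SUFFIXES are hypothetical names, and the suffix list is an assumption (iesdouyin.com appears in the scanned code; the hosts that actually serve video files may differ and would have to be confirmed).

```python
from urllib.parse import urlparse

# Assumed allowlist; adjust to the hosts the skill genuinely needs.
ALLOWED_SUFFIXES = ("douyin.com", "iesdouyin.com")


def is_allowed_url(url, allowed_suffixes=ALLOWED_SUFFIXES):
    """Return True only for https URLs whose host is an allowed domain or a subdomain of one."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False
    host = (parsed.hostname or "").lower()
    return any(host == s or host.endswith("." + s) for s in allowed_suffixes)
```

The suffix comparison is anchored on a leading dot so that a host such as iesdouyin.com.evil.example does not pass.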

Tainted flow: 'files' from open (line 251, file read) → requests.post (network output)

High

Category: Data Flow
Content: } try: response = requests.post(self.api_base_url, files=files, headers=headers) response.raise_for_status() result = response.json()
Confidence: 97%
Finding: response = requests.post(self.api_base_url, files=files, headers=headers)

Tainted flow: 'share_url' from requests.get (line 69, network input) → requests.get (network output)

Medium

Category: Data Flow
Content: raise ValueError("未找到有效的分享链接") share_url = urls[0] share_response = requests.get(share_url, headers=HEADERS) video_id = share_response.url.split("?")[0].strip("/").split("/")[-1] share_url = f'https://www.iesdouyin.com/share/video/{video_id}'
Confidence: 88%
Finding: share_response = requests.get(share_url, headers=HEADERS)

Tainted flow: 'video_info' from requests.get (line 314, network input) → requests.get (network output)

Medium

Category: Data Flow
Content: ctx.info(f"正在下载视频: {video_info['title']}") response = requests.get(video_info['url'], headers=HEADERS, stream=True) response.raise_for_status() # Get the file size
Confidence: 83%
Finding: response = requests.get(video_info['url'], headers=HEADERS, stream=True)

Tainted flow: 'cmd' from os.environ.get (line 26, credential/environment) → subprocess.run (code execution)

Medium

Category: Data Flow
Content: def run(cmd, capture=True, check=True): if isinstance(cmd, str): cmd = cmd.split() p = subprocess.run(cmd, capture_output=capture, text=True) if check and p.returncode != 0: raise RuntimeError(p.stderr.strip() or f"command failed: {cmd}") return p.stdout.strip() if capture else ""
Confidence: 87%
Finding: p = subprocess.run(cmd, capture_output=capture, text=True)
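
The helper quoted above splits command strings with str.split(), which breaks quoted arguments, and it runs whatever executable appears first. A hedged sketch of a stricter variant follows. ALLOWED_BINARIES, its contents, and build_argv are assumptions for illustration; the scan output does not say which programs the skill actually needs.

```python
import shlex
import subprocess

# Hypothetical allowlist of executables this helper may launch.
ALLOWED_BINARIES = {"yt-dlp", "ffmpeg"}


def build_argv(cmd):
    """Turn a command string or list into an argv list and reject unknown executables."""
    argv = shlex.split(cmd) if isinstance(cmd, str) else list(cmd)
    if not argv:
        raise ValueError("empty command")
    if argv[0] not in ALLOWED_BINARIES:
        raise ValueError(f"executable not allowed: {argv[0]}")
    return argv


def run(cmd, capture=True, check=True, timeout=120):
    """Run an allowlisted command without a shell and return its stdout."""
    argv = build_argv(cmd)
    p = subprocess.run(argv, capture_output=capture, text=True, timeout=timeout)
    if check and p.returncode != 0:
        # stderr is None when output is not captured
        raise RuntimeError((p.stderr or "").strip() or f"command failed: {argv}")
    return p.stdout.strip() if capture else ""
```

shlex.split keeps a quoted argument such as "my file.mp4" as a single element, and the allowlist check runs before anything is executed.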

Tainted flow: 'cmd' from os.environ.get (line 26, credential/environment) → subprocess.run (code execution)

Medium

Category: Data Flow
Content: '-o', str(out_path.with_suffix('.%(ext)s')), url ] p = subprocess.run(cmd, capture_output=True, text=True, timeout=120) if p.returncode == 0: downloaded = list(out_path.parent.glob(f"{out_path.stem}.*")) if downloaded:
Confidence: 84%
Finding: p = subprocess.run(cmd, capture_output=True, text=True, timeout=120)

Tainted flow: 'VENV_PY' from os.environ.get (line 18, credential/environment) → subprocess.run (code execution)

Medium

Category: Data Flow
Content: print(text) _run() """ p = subprocess.run([VENV_PY, '-c', code], capture_output=True, text=True, timeout=300) if p.returncode != 0: raise RuntimeError(p.stderr.strip() or 'transcription failed') return p.stdout.strip()
Confidence: 89%
Finding: p = subprocess.run([VENV_PY, '-c', code], capture_output=True, text=True, timeout=300)

Tainted flow: 'VENV_PY' from os.environ.get (line 16, credential/environment) → subprocess.run (code execution)

Medium

Category: Data Flow
Content: print(text) _run() """ p = subprocess.run([VENV_PY, '-c', code], capture_output=True, text=True, timeout=600) if p.returncode != 0: raise RuntimeError(p.stderr.strip() or p.stdout.strip() or 'transcription failed') return p.stdout.strip()
Confidence: 90%
Finding: p = subprocess.run([VENV_PY, '-c', code], capture_output=True, text=True, timeout=600)

Tainted flow: 'VENV_PY' from os.environ.get (line 23, credential/environment) → subprocess.run (code execution)

Medium

Category: Data Flow
Content: docx_path = None try: # Run md_to_docx with the venv Python (requires python-docx) p = subprocess.run( [VENV_PY, str(SCRIPT_DIR / 'helpers' / 'md_to_docx.py'), str(file_path)], capture_output=True, text=True, timeout=60 )
Confidence: 98%
Finding: p = subprocess.run( [VENV_PY, str(SCRIPT_DIR / 'helpers' / 'md_to_docx.py'), str(file_path)], capture_output=True, text=True, timeout=60 )
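
Because VENV_PY is read from the environment, anyone who controls that variable chooses the interpreter that gets executed. One possible hardening, sketched here under assumptions, is to accept the value only when it points inside the skill's own virtual environment. resolve_venv_python and the example paths are hypothetical. The check deliberately normalizes the path without following symlinks, since a venv interpreter is usually a symlink to a system Python.

```python
import os
from pathlib import Path


def resolve_venv_python(env_value, venv_root):
    """Return the normalized interpreter path if it lies under venv_root, else raise ValueError."""
    root = Path(os.path.abspath(venv_root))
    candidate = Path(os.path.abspath(env_value))  # collapses ".." without resolving symlinks
    if root not in candidate.parents:
        raise ValueError(f"interpreter outside expected venv: {candidate}")
    return candidate
```

This narrows the choice of interpreter but does not verify what that interpreter is. File ownership and permissions on the venv still matter.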

Lp3

Medium

Category: MCP Least Privilege
Confidence: 92%
Finding: The skill documentation describes capabilities including shell execution, file read/write, network access, environment-variable handling, cron installation, and email delivery, but no explicit permissions are declared. This creates a transparency and consent problem: users may install or invoke a skill with materially broader powers than they were informed about, increasing the chance of unintended system changes or data exposure.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 97%
Finding: The documented purpose is a Douyin daily report and email sender, but the referenced dependency set appears to include materially broader behaviors such as Telegram pushing, local JSON/TXT exports, video parsing/downloading, transcription, MCP server exposure, and a FastAPI WebUI/proxy. This mismatch is dangerous because users may authorize a seemingly narrow reporting skill that actually enables data exfiltration channels, remote interfaces, and media-processing features well beyond the stated use case.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The installation flow runs a host-level setup script that creates environments, writes configuration, and modifies cron, which goes beyond simple report generation. Documentation that encourages broad system management through a single install command increases the risk of persistent changes, difficult rollback, and abuse if the script is altered or misunderstood.

Context-Inappropriate Capability

Low

Confidence: 80%
Finding: The skill includes cron-based persistence for twice-daily execution, which is a host-level capability rather than a simple on-demand analysis action. While plausibly related to a 'daily report' feature, persistence still increases attack surface because the skill continues to run and access credentials/network without an active user request.
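
To see what the installer actually scheduled, the current crontab can be filtered for entries that mention the skill. This is a minimal sketch. find_cron_entries is a hypothetical helper, and the path used in the usage note is a placeholder rather than the skill's real install location.

```python
def find_cron_entries(crontab_text, needle):
    """Return active (non-comment, non-blank) crontab lines that contain needle."""
    entries = []
    for line in crontab_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if needle in stripped:
            entries.append(stripped)
    return entries
```

The text to inspect can be obtained with subprocess.run(["crontab", "-l"], capture_output=True, text=True).stdout and passed in together with the skill's install path, for example find_cron_entries(text, "/path/to/skill").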

Description-Behavior Mismatch

High

Confidence: 93%
Finding: The skill metadata promises email-based daily report generation and delivery, but this file is clearly built around Telegram output and even embeds a specific Telegram chat_id. That mismatch is dangerous because users and operators may authorize or deploy the skill expecting email handling while data is actually prepared for a different channel, creating a risk of unintended disclosure, misrouting, and deceptive behavior in an automation context.

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The JSON includes Telegram-specific routing metadata (`chat_id`, `channel`) and a fully formatted Telegram message even though the skill description emphasizes Word report generation and email delivery. This creates an unintended cross-channel data exposure risk: downstream components or logs may reveal recipient identifiers or send content to Telegram paths that users did not request or consent to.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The bundled dependency advertises capabilities far beyond the parent skill’s declared purpose, including general short-video downloading and transcription. This scope mismatch increases attack surface and creates a supply-chain trust problem: users may install a skill for benign reporting while inheriting unrelated media-processing features that can later be invoked or abused.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The README explicitly documents watermark-free video downloading, which is unrelated to generating and emailing daily Douyin trend reports. In this skill context, that capability is more dangerous because it enables bulk content acquisition outside the declared use case, raising legal, policy, and misuse risks while expanding what the installed skill can do.

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The embedded downloader/transcriber functionality exceeds the stated purpose of generating a Douyin daily hot-list report and email summary. This hidden scope expansion increases attack surface, introduces unexpected network and file-processing behavior, and makes data handling less transparent to users and reviewers.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The code provides general-purpose arbitrary Douyin video download capability, which is broader than necessary for hot-list report generation. Extra capability increases misuse potential, licensing/compliance risk, and exposure to untrusted media parsing and storage paths without clear necessity.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The script uses environment-backed API credentials to send audio to a third-party speech-to-text service outside the declared skill scope. In this context, that is more dangerous because users expecting report generation may not realize their media-derived content is being transmitted externally and billed against their credentials.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: This dependency provides watermark-free video downloading and transcript extraction, which is materially broader than the parent skill's stated purpose of generating daily hot-list reports and sending email summaries. Hidden or unnecessary capabilities expand attack surface and can surprise users by enabling content acquisition and processing that was not disclosed.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The server exposes a tool specifically for obtaining watermark-free download links, a capability unrelated to routine report generation and email automation. Such undisclosed functionality can be abused for unauthorized media retrieval and increases legal, policy, and trust risk for anyone installing the skill.

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: The `/api/video/download` endpoint accepts an arbitrary `url` parameter and server-side fetches it with `requests.get(...)`, then streams the response back to the caller. This is effectively an open fetch/proxy primitive and can be abused for SSRF, access to internal services, and bandwidth/resource abuse; in the context of a Douyin report/transcript UI, this capability is broader than necessary and therefore more suspicious and dangerous.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: This endpoint provides arbitrary external HTTP proxying unrelated to the stated purpose of generating Douyin daily reports and transcript extraction. Because any caller can cause the server to request attacker-chosen URLs and relay the response, it materially expands the attack surface and can be used for SSRF, network pivoting, and unauthorized retrieval of resources reachable from the server but not from the attacker.
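
A typical guard for an endpoint like this is to refuse any target that resolves to a non-public address before fetching it. The sketch below illustrates the idea and is not the skill's code: is_public_address, resolve_host, and is_safe_fetch_target are hypothetical names, and the resolver is injectable so the check can be exercised without network access.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_public_address(ip_text):
    """Return True if ip_text parses as a globally routable IP address."""
    try:
        return ipaddress.ip_address(ip_text).is_global
    except ValueError:
        return False


def resolve_host(host):
    """Resolve host to a sorted list of IP address strings."""
    infos = socket.getaddrinfo(host, None)
    return sorted({info[4][0] for info in infos})


def is_safe_fetch_target(url, resolver=resolve_host):
    """Allow only http(s) URLs whose host resolves exclusively to public addresses."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        addresses = resolver(parsed.hostname)
    except OSError:
        return False  # unresolvable hosts are rejected
    return bool(addresses) and all(is_public_address(a) for a in addresses)
```

A check of this kind is only a first layer. The name can resolve differently between the check and the fetch, and redirects can lead to internal hosts, so a real deployment would also pin the resolved address or disable redirects, and ideally restrict the endpoint to an allowlist of expected hosts.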

VirusTotal

65/65 vendors flagged this skill as clean.
