Douyin Scraper

Security checks across malware telemetry and agentic risk

Overview

This appears to be a Douyin scraping helper, but it needs review for three reasons: broad phrases can trigger browser scraping, setup changes the local environment, and some of the advertised outputs may be mock data.

Install only if you expect a scraping tool that opens a browser, reaches Douyin, installs browser dependencies, and may write result files. Prefer explicit commands over broad natural-language triggering, avoid using a personal logged-in Douyin session unless you understand the risks, and treat results as unverified or mock unless the output clearly proves they came from live page data.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (6)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 92%
Finding: The skill instructs the agent to invoke local shell commands and references file-writing outputs, but the manifest does not declare corresponding permissions. This creates a capability/visibility gap: a host may load the skill assuming low privilege while the documented workflow actually performs command execution, environment use, and output file creation.

Vague Triggers

Medium

Confidence: 94%
Finding: The README explicitly advertises very broad natural-language trigger phrases such as “搜索一下xxx” ("search for xxx"), “帮我搜xxx” ("help me search for xxx"), and “看看xxx相关内容” ("take a look at content related to xxx"). These overlap with ordinary conversation and can cause the agent to invoke the scraping skill unexpectedly. Because the skill performs external network access and can collect and save third-party platform data, accidental invocation can lead to unintended browsing, scraping, and local file creation without sufficiently explicit user consent.
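One way to narrow activation is to accept only an explicit command rather than conversational phrases. The sketch below is illustrative only: the `/douyin-scrape` prefix and the `extract_query` helper are hypothetical names, not part of the skill, and a real host would apply its own trigger mechanism.

```python
from typing import Optional

# Hypothetical explicit command prefix; the skill itself does not define one.
EXPLICIT_PREFIX = "/douyin-scrape"


def extract_query(message: str) -> Optional[str]:
    """Return the search query only when the message is an explicit command.

    Conversational phrases such as "帮我搜xxx" return None, so they never
    activate the scraper on their own.
    """
    text = message.strip()
    if not text.startswith(EXPLICIT_PREFIX):
        return None
    rest = text[len(EXPLICIT_PREFIX):]
    # Require whitespace after the prefix so look-alike commands do not match.
    if rest and not rest[0].isspace():
        return None
    query = rest.strip()
    # An empty query is treated as "no request" rather than a wildcard search.
    return query or None
```

With this gate, `extract_query("/douyin-scrape 猫咪")` yields the query while `extract_query("帮我搜猫咪")` yields `None`, so ordinary conversation does not start a browser session.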

Missing User Warnings

Medium

Confidence: 91%
Finding: The skill description states that it uses Playwright to scrape Douyin content and output JSON/CSV/Markdown, but the README does not prominently warn that invocation causes outbound network requests and may write collected data to disk. In an agent environment, missing disclosure of these side effects increases the chance of users triggering web automation and persistence behavior they did not intend or fully understand.

Vague Triggers

Medium

Confidence: 88%
Finding: The trigger phrases are very broad and overlap with normal user conversation, such as '帮我找…' ("help me find…"), '看看…相关内容' ("take a look at content related to…"), or any mention of '抖音热榜/爆款' ("Douyin hot list / viral hits"). In an agentic system, this can cause accidental or context-inappropriate activation of a scraping skill, leading to unexpected browser automation, network access, and data collection.

Vague Triggers

Medium

Confidence: 93%
Finding: The documentation explicitly says a single natural-language sentence should trigger keyword extraction and direct execution of `scripts/scraper.py nl`, with the user's original text passed through. This removes activation boundaries and encourages immediate tool execution from ambiguous input, increasing the risk of unintended scraping, shell invocation, and downstream side effects such as file output or authenticated browsing against Douyin.
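A confirmation step between the user's sentence and the shell invocation would restore the missing activation boundary. The following is a minimal sketch under stated assumptions: `build_scraper_argv` is a hypothetical helper, and passing the text as a single argument after `nl` is an assumed calling convention, since the report only states that the original text is passed through.

```python
import sys
from typing import List, Optional

# Path taken from the finding above; the argument layout after "nl" is assumed.
SCRAPER = "scripts/scraper.py"


def build_scraper_argv(user_text: str, confirmed: bool) -> Optional[List[str]]:
    """Build the scraper command only after explicit user confirmation.

    Returns None when the user has not confirmed or the text is empty,
    so the caller has nothing to execute.
    """
    if not confirmed:
        return None
    text = user_text.strip()
    if not text:
        return None
    # A list of arguments (no shell string) keeps the user's text from being
    # interpreted by a shell when the command is eventually run.
    return [sys.executable, SCRAPER, "nl", text]
```

A caller would hand a non-`None` result to `subprocess.run(argv, check=True)` without `shell=True`, and would first tell the user that the command opens a browser, reaches Douyin, and may write result files.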

Vague Triggers

Medium

Confidence: 91%
Finding: The example triggers are broad everyday Chinese phrases like “搜索一下…” ("search for…"), “帮我找…” ("help me find…"), and “看看…相关内容” ("take a look at content related to…"), which can cause the skill to activate from ordinary conversation rather than an explicit user request to scrape Douyin. In an agent setting, this overbroad matching increases the chance of unintended browsing or scraping actions, surprise data collection, and policy violations.

VirusTotal

66/66 vendors flagged this skill as clean.
