Douyin Full-Site Data Collection (抖音全站数据采集)

Security checks across malware telemetry and agentic risk

Overview

This skill is not clearly malware, but it handles sensitive Douyin session cookies and exposes broad scraping/protocol tools that should be reviewed before use.

Install only if you trust MaxHub/aconfig.cn with Douyin-related queries and any credentials you provide. Avoid using primary-account cookies: use a separate test account, rotate cookies after use, and protect the MAXHUB_API_KEY. Treat the protocol/signature/device-registration and high-quality video extraction endpoints as high-risk features that may violate platform terms or content rights if misused.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (22)

Intent-Code Divergence

High

Confidence: 97%
Finding: The skill explicitly claims it performs only read-only queries, but elsewhere documents non-read-only capabilities such as app interaction triggers and protocol utilities. This mismatch can mislead downstream agents, reviewers, or users into granting trust to capabilities they would otherwise scrutinize more carefully.

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The top-level description frames the skill as a data-query assistant, while the body also includes cookie handling, signature/token/fingerprint generation, and device registration. That broader capability set materially changes the risk profile and may cause users or systems to under-classify the skill's sensitivity.

Context-Inappropriate Capability

High

Confidence: 95%
Finding: Device registration and signature/fingerprint/token generation are high-risk protocol capabilities that can support account automation, session emulation, or evasive request construction. In the context of a Douyin data assistant, these functions are out of scope for ordinary analytics and increase the chance of misuse or policy circumvention.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The documented endpoint generates a deep link that can open the Douyin app to a search page, which is a non-read-only side effect inside a file otherwise framed as search/data-query APIs. In an agent context, this can enable unexpected client-side actions or UI redirection if tools are invoked automatically from user-controlled input.

Description-Behavior Mismatch

Medium

Confidence: 98%
Finding: This documentation explicitly instructs users to provide a full Douyin creator-platform session Cookie to a third-party service for processing. A live session cookie is equivalent to account authentication, so collecting and forwarding it creates direct account-takeover and privacy exposure risk if mishandled, intercepted, logged, or reused beyond the stated purpose.

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The file documents access to account-private capabilities such as creator live history and user collections, which goes beyond a generic public-data query assistant and expands the skill into authenticated private-account access. That broader scope increases the chance that users will disclose sensitive credentials and that the skill will process private data without clear least-privilege boundaries.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: Handling full browser/session cookies is a high-risk credential pattern because such cookies commonly grant the same privileges as a logged-in user. The documentation normalizes transmission of that credential to an external API service, which materially increases the blast radius of compromise compared with standard API-key or token use.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: This section documents anti-bot/signature generators, session/verification token generators, and device registration capabilities that materially expand the skill from passive data querying into infrastructure for emulating platform clients and bypassing access controls. In context, these capabilities can enable unauthorized scraping, automated abuse, or evasion of platform protections, especially when combined with cookie-accepting and content-extraction endpoints elsewhere in the skill.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The high-quality/original video URL extraction endpoints are explicitly positioned for downloading, archiving, and even model-training workflows, which goes beyond a normal query assistant and facilitates bulk content acquisition. That materially increases the risk of copyright infringement, mass exfiltration of creator content, and misuse at scale when paired with batch ID lookup and metadata APIs.

Context-Inappropriate Capability

Medium

Confidence: 85%
Finding: The private-message deep-link capability is not aligned with the declared data-query purpose and introduces an interaction trigger that could be used for spam, harassment, or social-engineering workflows. Even if it only generates a URL, embedding this capability in the skill lowers the barrier to directing users into unsolicited contact flows.

Intent-Code Divergence

Medium

Confidence: 93%
Finding: Describing non-standard anti-bot signature generation as 'equivalent to OAuth' and a 'standard API authentication flow' is misleading and normalizes tooling that is actually meant to reproduce proprietary request-signing behavior. That framing can cause users or downstream agents to treat risky protocol-evasion capabilities as sanctioned authentication, increasing the chance of misuse.

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The X-Bogus documentation similarly presents a proprietary signature generator as a standard OAuth-like flow, obscuring the true security and compliance implications of the feature. This misleading representation reduces operator caution around a capability that can support automated scraping and platform-control evasion.

Intent-Code Divergence

Medium

Confidence: 91%
Finding: The session/verification token generators are framed as part of a standard authentication flow despite being protocol tools for constructing platform session identifiers and verification parameters. That contradiction can encourage unsafe use of session-emulation features and understate the risk that they help automate access in ways users may assume are officially supported.

Vague Triggers

Medium

Confidence: 94%
Finding: The example prompt triggers are very generic terms such as '视频' (video), '用户' (user), '搜索' (search), and '热榜' (trending list), which are common words likely to appear in normal user conversations. In agent systems that use prompt-based routing, this can cause the skill to activate unintentionally and access external data sources when the user did not explicitly request this skill, creating confusion, unnecessary external API calls, and possible data exposure through over-broad invocation.

Vague Triggers

Medium

Confidence: 90%
Finding: The example trigger words are extremely broad everyday terms such as '视频' (video), '用户' (user), '搜索' (search), and '热榜' (trending list), which are likely to overlap with normal conversation and cause unintended skill activation. In a skill that can query external data and use authenticated API-backed tools, accidental invocation can lead to unnecessary external requests, user confusion, and unintended data access or disclosure in multi-skill environments.
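The over-broad trigger problem can be illustrated with a minimal sketch. The router below is hypothetical (the report does not describe how any particular agent runtime matches triggers); it assumes simple substring matching, and the `scoped_match` helper and its scope terms are illustrative names, not part of the skill.

```python
# Hypothetical keyword router, assuming plain substring matching on the user message.
GENERIC_TRIGGERS = ["视频", "用户", "搜索", "热榜"]  # video, user, search, trending list


def naive_match(message: str, triggers: list[str]) -> bool:
    """Activate the skill if any trigger word appears anywhere in the message."""
    return any(t in message for t in triggers)


def scoped_match(message: str, triggers: list[str],
                 scope_terms: tuple[str, ...] = ("抖音", "douyin")) -> bool:
    """Activate only if the message also names the platform explicitly (assumed mitigation)."""
    lowered = message.lower()
    has_scope = any(s in lowered for s in scope_terms)
    return has_scope and any(t in message for t in triggers)


unrelated = "帮我剪辑这个视频"   # "Help me edit this video": no Douyin intent
explicit = "抖音热榜有什么"      # "What is on the Douyin trending list"

print(naive_match(unrelated, GENERIC_TRIGGERS))   # fires on an unrelated request
print(scoped_match(unrelated, GENERIC_TRIGGERS))  # does not fire
print(scoped_match(explicit, GENERIC_TRIGGERS))   # fires on an explicit request
```

Requiring a platform-specific term alongside the generic keyword is one possible way to narrow activation; real routers may use embeddings or descriptions rather than substrings, in which case the same concern applies to how the skill describes itself.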

Missing User Warnings

Medium

Confidence: 91%
Finding: The file documents authenticated external API calls using a bearer token and multiple query endpoints, but provides no warning that user-supplied keywords, brand names, IDs, and date ranges will be transmitted to a third-party service. In an agent skill context, this can cause unintended disclosure of user data or business-sensitive queries because the agent may forward inputs off-platform without meaningful user awareness or consent.
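One way to address the missing disclosure is a confirmation gate in front of any outbound call. The sketch below is an assumption about how an agent wrapper might do this; `describe_outbound`, `confirm_and_send`, the `confirm`/`send` callbacks, and the host name `third-party.example` are all hypothetical and do not come from the skill.

```python
from typing import Any, Callable, Optional


def describe_outbound(params: dict[str, Any], host: str) -> str:
    """Build a human-readable notice listing which fields would leave the platform."""
    fields = ", ".join(sorted(params))
    return f"These fields will be sent to {host}: {fields}"


def confirm_and_send(params: dict[str, Any], host: str,
                     confirm: Callable[[str], bool],
                     send: Callable[[dict[str, Any]], Any]) -> Optional[Any]:
    """Call send(params) only after confirm() accepts the disclosure notice."""
    if not confirm(describe_outbound(params, host)):
        return None  # nothing is transmitted without explicit consent
    return send(params)


# Usage: a declined confirmation means the (stubbed) sender is never invoked.
sent: list[dict[str, Any]] = []
result = confirm_and_send({"keyword": "brand-x", "date_range": "7d"},
                          "third-party.example",
                          confirm=lambda notice: False,
                          send=sent.append)
print(result, sent)
```

Passing `confirm` and `send` as callbacks keeps the gate independent of any specific UI or HTTP client.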

Missing User Warnings

Low

Confidence: 78%
Finding: The file instructs use of an API key in an Authorization header but provides no guidance about secret handling, redaction, or storage. In agent/tool ecosystems, this increases the chance that credentials are logged, echoed back to users, or hardcoded into prompts and examples.

Missing User Warnings

Medium

Confidence: 78%
Finding: The document describes bearer-authenticated endpoints that operate on user/account identifiers such as kolId, sec_uid, and author_id, but it gives no privacy, consent, retention, or handling guidance. In an agent setting, that omission increases the chance that callers will collect or process personal/profile-linked data without adequate safeguards or user awareness.

Missing User Warnings

Medium

Confidence: 95%
Finding: This endpoint requires a user web Cookie but lacks the strong sensitivity and third-party handling warning present elsewhere in the file. Inconsistent warnings are dangerous because users may provide authentication material without understanding that it can expose private account data or grant session-level access.

Missing User Warnings

Medium

Confidence: 88%
Finding: Some cookie-accepting endpoints in this range request user Douyin web cookies without the same prominent security warnings and transmission disclosures present on other endpoints. This inconsistency increases the likelihood that users will provide live session credentials to a third-party service without understanding that the cookie is effectively a login token.

Missing User Warnings

Low

Confidence: 84%
Finding: The document explicitly instructs use of a bearer token (`Authorization: Bearer $MAXHUB_API_KEY`) but provides no guidance on protecting, redacting, or avoiding accidental disclosure of that credential. In an agent-skill context, documented credentials are often copied into logs, prompts, examples, or shared configs, which can lead to token leakage and unauthorized API use.
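The kind of handling guidance this finding asks for might look like the following sketch. It assumes the key is read from the `MAXHUB_API_KEY` environment variable named in the skill; the helper names `build_headers` and `redact_bearer` are hypothetical, and the redaction pattern only covers the `Bearer <token>` form.

```python
import os
import re
from typing import Mapping

# Matches "Bearer <token>" and captures the prefix so only the token is replaced.
_BEARER = re.compile(r"(Bearer\s+)[^\s'\"]+")


def build_headers(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Read the key from the environment instead of hardcoding it in prompts or examples."""
    key = env.get("MAXHUB_API_KEY")
    if not key:
        raise RuntimeError("MAXHUB_API_KEY is not set")
    return {"Authorization": f"Bearer {key}"}


def redact_bearer(text: str) -> str:
    """Mask bearer tokens before text is logged or echoed back to a user."""
    return _BEARER.sub(r"\1[REDACTED]", text)


headers = build_headers({"MAXHUB_API_KEY": "example-key"})
print(redact_bearer(str(headers)))
```

Redacting at the logging boundary is a complement to, not a substitute for, keeping the key out of prompts and shared configs in the first place.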

Missing User Warnings

Medium

Confidence: 94%
Finding: The document explicitly defines parameters that accept user-provided cookies and browser user-agent values, and repeatedly references default guest cookies or user-supplied web cookies without any warning about sensitive credential handling, retention, or misuse risk. In this skill’s context, those fields can enable authenticated scraping, account-linked queries, or session reuse against Douyin-related services, which increases privacy and account-compromise risk if operators log, store, or forward them insecurely.

VirusTotal

63/63 vendors flagged this skill as clean.
