Zhihu Knowledge Data Collection (知乎知识数据采集)

Security checks across malware telemetry and agentic risk

Overview

This appears to be a read-only Zhihu data skill, but it needs review: the instructions include non-Zhihu fallback endpoints and give weak guidance on handling the sensitive query data and session tokens that are sent to a third-party API.

Install only if you are comfortable sending Zhihu search terms, user IDs, article/comment IDs, and any optional session token to MaxHub at www.aconfig.cn. Do not provide private session tokens or sensitive personal data unless you understand why they are needed, and review the non-Zhihu fallback instructions before relying on this as a strictly Zhihu-scoped skill.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (10)

Intent-Code Divergence

High

Confidence: 98%
Finding: The skill claims to be a Zhihu-only, read-only data assistant, but elsewhere documents fallback behavior and routes for Douyin/Xiaohongshu. This mismatch broadens the effective capability surface and can mislead users, reviewers, and downstream agents into invoking unintended third-party endpoints under a narrower trust assumption.

Description-Behavior Mismatch

High

Confidence: 97%
Finding: The top-level description presents the skill as Zhihu-only, yet the body includes operational guidance for other platforms' API paths. This deceptive or inconsistent scoping can cause inappropriate permissioning and unsafe reliance on the stated purpose when the skill's instructions support a wider set of actions.
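
One way a reviewer or integrator might enforce the stated Zhihu-only scope is an allowlist check on every outbound URL. The sketch below is illustrative only: the host comes from this review, but the `/zhihu/` path prefix and the function name `is_in_scope` are assumptions, since the skill's real routes are not shown here.

```python
from urllib.parse import urlparse

# Host named in this review; the path prefix is a hypothetical placeholder.
ALLOWED_HOST = "www.aconfig.cn"
ALLOWED_PATH_PREFIXES = ("/zhihu/",)  # assumption: Zhihu routes share one prefix


def is_in_scope(url: str) -> bool:
    """Return True only for HTTPS URLs on the allowed host under an allowed prefix."""
    parts = urlparse(url)
    return (
        parts.scheme == "https"
        and parts.hostname == ALLOWED_HOST
        and parts.path.startswith(ALLOWED_PATH_PREFIXES)
    )
```

A caller would reject any request for which `is_in_scope` returns False, so fallback routes for other platforms fail closed instead of being invoked silently.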

Intent-Code Divergence

Medium

Confidence: 86%
Finding: The security declaration asserts purely read-only data querying, but explicitly normalizes endpoint classes such as encrypt, decrypt, generate, signature, fingerprint, and token. Those capabilities are not equivalent to simple content retrieval and may enable authentication material handling or protocol utility functions that expand risk beyond read-only analytics.
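
To make this divergence easy to spot during review, endpoint names could be screened for the utility classes listed above. This is a rough sketch using substring matching; the function name `sensitive_classes_in` and the example endpoint `generate_signature` are hypothetical, and a real audit would work from the skill's actual endpoint list.

```python
# Endpoint classes named in the finding as going beyond read-only retrieval.
SENSITIVE_CLASSES = ("encrypt", "decrypt", "generate", "signature", "fingerprint", "token")


def sensitive_classes_in(endpoint_name: str) -> list[str]:
    """Return the sensitive class keywords that appear in an endpoint name."""
    name = endpoint_name.lower()
    return [cls for cls in SENSITIVE_CLASSES if cls in name]


# Hypothetical endpoint name, used only to illustrate a match.
print(sensitive_classes_in("generate_signature"))
```

An empty result does not prove an endpoint is read-only; it only means the name contains none of the flagged keywords.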

Vague Triggers

Medium

Confidence: 91%
Finding: The README advertises extremely generic trigger examples such as '用户' (user), '资料' (profile), '搜索' (search), '热榜' (hot list), '文章' (article), and '专栏' (column), all of which are common words in normal conversation. In systems that activate skills based on prompt matching, these broad phrases can cause unintended invocation of this skill during unrelated chats, potentially sending user queries to an external Zhihu data service and exposing context the user did not intend to share.

Vague Triggers

Medium

Confidence: 83%
Finding: The routing logic uses very broad trigger words like 'user', 'search', 'topic', and 'analyze', which can match ordinary conversation and unintentionally activate networked data retrieval. Ambiguous activation increases the chance of unprompted external calls or over-collection beyond what the user clearly requested.
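
A narrower activation rule could require an explicit platform mention in addition to a generic action word. The sketch below assumes simple substring matching and is not the skill's actual routing logic; `should_activate` and both term lists are illustrative.

```python
# Assumption: activation should need BOTH a platform mention and an action word.
PLATFORM_TERMS = ("zhihu", "知乎")
ACTION_TERMS = ("user", "search", "topic", "analyze", "用户", "搜索", "热榜", "文章", "专栏")


def should_activate(prompt: str) -> bool:
    """Activate only when the prompt names the platform and a supported action."""
    text = prompt.lower()
    mentions_platform = any(term in text for term in PLATFORM_TERMS)
    mentions_action = any(term in text for term in ACTION_TERMS)
    return mentions_platform and mentions_action
```

With this rule, "search my files for the user guide" no longer activates the skill, while "Search Zhihu for hot topics" still does.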

Missing User Warnings

Medium

Confidence: 91%
Finding: The documentation states that requests go to an external domain and require a bearer token, but it does not warn users that article IDs, user identifiers, search keywords, and other query data will be transmitted to a third-party service. In an agent setting, that omission can cause unintentional disclosure of user-derived data or secrets through tool calls because operators may assume the API is local or privacy-neutral.
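
One possible mitigation is to show the user exactly which fields will leave the machine before any call is made. The following sketch assumes the caller supplies its own `confirm` and `send` callables; all function and parameter names are hypothetical, and only the host name is taken from this review.

```python
from typing import Any, Callable, Optional

THIRD_PARTY_HOST = "www.aconfig.cn"  # host named in this review


def disclosure_notice(params: dict) -> str:
    """Describe which request fields would be sent to the third-party host."""
    fields = ", ".join(sorted(params)) or "no parameters"
    return f"This request will send the following fields to {THIRD_PARTY_HOST}: {fields}"


def send_if_confirmed(
    params: dict,
    confirm: Callable[[str], bool],
    send: Callable[[dict], Any],
) -> Optional[Any]:
    """Call send(params) only if confirm() approves the disclosure notice."""
    if not confirm(disclosure_notice(params)):
        return None
    return send(params)
```

The notice lists field names rather than values, so the prompt itself does not echo search terms or identifiers back into logs or chat history.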

Missing User Warnings

Medium

Confidence: 86%
Finding: The document instructs use of a bearer token and an external HTTPS endpoint but provides no guidance on protecting API keys or on the privacy implications of sending search/query data to a third-party service. In agent ecosystems, this omission can lead to unsafe credential handling, accidental logging of secrets, or transmission of sensitive user prompts and search terms without informed consent.

Missing User Warnings

Medium

Confidence: 90%
Finding: The `session_token` parameter is likely sensitive because it can identify or resume a user session, yet the documentation gives no warning about secrecy, reuse, logging, or privacy impact. In a skill context, this increases the chance that agents or developers treat it as ordinary data and expose session material in logs, prompts, or analytics systems.
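
A common safeguard is to redact secret-bearing parameters before they reach logs, prompts, or analytics. The sketch below is a minimal illustration: `session_token` is the parameter named in the finding, while `authorization` and `api_key` are assumed key names added for completeness.

```python
# `session_token` comes from the finding; the other key names are assumptions.
SECRET_KEYS = {"session_token", "authorization", "api_key"}


def redact(params: dict) -> dict:
    """Return a copy of params that is safe to log, with secret values masked."""
    return {
        key: ("[REDACTED]" if key.lower() in SECRET_KEYS else value)
        for key, value in params.items()
    }
```

Logging `redact(params)` instead of `params` keeps the request shape visible for debugging without persisting the session material itself.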

Missing User Warnings

Medium

Confidence: 95%
Finding: This documentation exposes endpoints for retrieving user profile data and social-graph information such as followers, followees, followed topics, and collections, but provides no privacy, consent, or acceptable-use warning. In an agent skill context, that omission increases the likelihood that downstream agents or users will automate bulk profiling, scraping, or deanonymization without considering legal or platform-policy constraints.

Missing User Warnings

Medium

Confidence: 93%
Finding: The documentation exposes a `session_token` parameter for `fetch_hot_recommend` without any warning that it may contain sensitive authentication material or guidance on secure handling. In an agent skill context, this increases the chance that callers pass, log, persist, or echo live session credentials, which could enable account/session misuse if leaked.

VirusTotal

64/64 vendors flagged this skill as clean.
