xhs-comment-scraper

Security checks across malware telemetry and agentic risk

Overview

This skill is openly a Xiaohongshu comment scraper. Its main risks are privacy exposure, files written to local disk, and use of a logged-in browser session, not hidden malware.

Install only if you intentionally want to scrape the submitted Xiaohongshu profile and save commenter data locally. Confirm the exact profile before running, avoid elevated privileges, use a dedicated browser/profile if possible, respect platform rules and privacy laws, and delete the generated JSON/HTML files and browser session when no longer needed.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (8)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 87%
Finding: The skill instructs the agent to write scraped comment data and generated reports to local disk, but the metadata declares no corresponding permissions. Undeclared file I/O weakens user/admin visibility into what the skill can persist locally and increases the risk of silent data collection or unintended storage of sensitive content.

Context-Inappropriate Capability

Medium

Confidence: 82%
Finding: The skill goes beyond scraping by directing creation and execution of local analysis scripts and launching local programs to open reports. Executing generated code and spawning local processes expands the attack surface: scraped or user-controlled content could be embedded into scripts or HTML reports, leading to unsafe local execution or misleading file behavior.
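One way to reduce the report-injection risk described above is to escape every scraped field before it is placed into generated HTML. The sketch below is illustrative only: `render_comment_row` and its parameters are hypothetical names, not part of the skill, and it assumes the report is built from plain strings.

```python
import html


def render_comment_row(username: str, text: str) -> str:
    """Return one HTML table row with both scraped fields escaped.

    Hypothetical helper: the skill's real report code is not shown in
    this scan, so the row layout here is an assumption.
    """
    # quote=True also escapes " and ', which matters inside attributes.
    safe_user = html.escape(username, quote=True)
    safe_text = html.escape(text, quote=True)
    return f"<tr><td>{safe_user}</td><td>{safe_text}</td></tr>"
```

Escaping at the point of rendering keeps the stored JSON unchanged while preventing comment text from being interpreted as markup or script when the report is opened in a browser.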

Missing User Warnings

Medium

Confidence: 90%
Finding: The README encourages automatic bulk scraping of all notes and comments from a profile and saving the results locally, but it does not clearly warn users about the scale of collection, the sensitivity of comment data, or the fact that local files will be written automatically. This creates a meaningful transparency and consent problem: users may trigger broad collection and persistent storage without understanding the privacy, compliance, and data-handling implications.

Missing User Warnings

Medium

Confidence: 96%
Finding: The README describes scraping and saving all comments from a Xiaohongshu profile to local JSON files and generating analysis reports, but it does not warn users that this collects, stores, and processes third-party data. That omission can lead users to unknowingly perform privacy-invasive collection or mishandle personal data, especially since the skill targets bulk extraction across all notes rather than a single page.

Missing User Warnings

Medium

Confidence: 91%
Finding: The skill explicitly automates bulk collection of comments from all notes on a Xiaohongshu profile and stores the results locally as JSON, including usernames, comment text, timestamps, and engagement data. Even though this is framed as a scraping utility rather than overt exfiltration, it creates privacy and compliance risk because it facilitates large-scale retention of third-party user content without clear consent, minimization, retention limits, or warnings to the operator.
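A retention limit of the kind this finding says is missing could be approximated by purging generated files after a fixed age. This is a minimal sketch under stated assumptions: `purge_old_outputs`, the output directory, and the age threshold are all hypothetical, and it assumes the scraper's outputs are the `.json` and `.html` files in a single folder.

```python
import time
from pathlib import Path
from typing import List, Optional

# Assumption: the scraper's outputs are JSON data files and HTML reports.
OUTPUT_SUFFIXES = {".json", ".html"}


def purge_old_outputs(out_dir: str, max_age_days: float,
                      now: Optional[float] = None) -> List[str]:
    """Delete JSON/HTML files in out_dir older than max_age_days.

    Returns the sorted names of the files that were removed.
    """
    current = time.time() if now is None else now
    cutoff = current - max_age_days * 86400  # seconds per day
    removed = []
    for path in Path(out_dir).iterdir():
        if not path.is_file() or path.suffix.lower() not in OUTPUT_SUFFIXES:
            continue  # leave directories and unrelated files alone
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Pointing this at a dedicated output directory, rather than a shared folder, avoids deleting unrelated JSON or HTML files.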

Vague Triggers

Medium

Confidence: 80%
Finding: The trigger condition uses the vague phrase 'or similar blogger homepage links,' which can cause the skill to activate on unintended URLs. In a browser-automation scraper, overbroad triggering is risky because it may initiate login prompts, navigation, scraping, and local data storage without sufficiently precise user intent.
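A stricter trigger could accept only URLs that match an explicit profile pattern instead of "similar" links. The sketch below assumes profile pages live at `https://www.xiaohongshu.com/user/profile/<id>` with an alphanumeric ID. That path shape, the host list, and the function name `is_profile_url` are assumptions for illustration and should be checked against the platform before use.

```python
import re
from urllib.parse import urlsplit

# Assumptions: allowed hosts and the profile path layout are illustrative.
_ALLOWED_HOSTS = {"www.xiaohongshu.com", "xiaohongshu.com"}
_PROFILE_PATH = re.compile(r"/user/profile/[0-9A-Za-z]+/?")


def is_profile_url(url: str) -> bool:
    """Return True only for HTTPS profile URLs on an allowed host."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, e.g. an unbalanced IPv6 bracket
    return (
        parts.scheme == "https"
        and parts.hostname in _ALLOWED_HOSTS
        and _PROFILE_PATH.fullmatch(parts.path) is not None
    )
```

With an allowlist like this, a note link, a search page, or a look-alike domain would not activate the scraper, and the agent can ask the user to confirm the exact profile before any navigation starts.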

Missing User Warnings

Medium

Confidence: 90%
Finding: The skill is designed to collect large volumes of user comments, store them locally as JSON, and generate analysis reports, but it does not provide a clear privacy warning or data-handling disclosure. Even if the comments are publicly visible, bulk aggregation, local retention, and analysis materially increase privacy and compliance risk.

Missing User Warnings

Medium

Confidence: 87%
Finding: The script writes all scraped comment data directly to a local JSON file, likely including usernames, comment text, and other potentially sensitive personal data, without any user warning, minimization, or retention controls. In the context of a bulk scraper targeting all comments from a creator's notes, this increases privacy and compliance risk because large amounts of third-party data are persistently stored on disk by default.
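The minimization gap noted here could be narrowed by dropping unneeded fields and pseudonymizing usernames before anything reaches disk. This is a sketch, not the skill's actual code: the field names (`username`, `text`, `timestamp`, `likes`), the salt handling, and both helper functions are hypothetical.

```python
import hashlib
import json
import os

# Hypothetical field names; the scraper's real JSON schema is not shown.
KEEP_FIELDS = ("text", "timestamp", "likes")


def minimize(comment: dict, salt: bytes) -> dict:
    """Keep only allowlisted fields and replace the username with a salted hash."""
    record = {key: comment[key] for key in KEEP_FIELDS if key in comment}
    if "username" in comment:
        digest = hashlib.sha256(salt + comment["username"].encode("utf-8"))
        record["author"] = digest.hexdigest()[:16]  # stable pseudonym per salt
    return record


def write_private_json(path: str, records: list) -> None:
    """Write records to path, requesting owner-only permissions (0o600)."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(records, fh, ensure_ascii=False)
```

The salted hash still lets an analysis group comments by the same author without storing the handle itself. The `0o600` mode applies only when the file is newly created and is largely ignored on Windows, so it complements rather than replaces a dedicated output directory.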

VirusTotal

66/66 vendors flagged this skill as clean.
