xhs-creator-copilot

Security checks across malware telemetry and agentic risk

Overview

This is a local-only creator drafting skill, but it includes detailed guidance for making AI-written social media content look less AI-generated, along with some privacy-sensitive local retention workflows.

Review this skill carefully before installing. It does not appear to contain hidden code, credential theft, platform automation, or destructive behavior, but you should not use its humanizer to hide AI authorship or invent personal trading experience. Treat all generated finance content as needing human review, required AI disclosure, and investment-risk disclaimers. Be cautious when pasting DMs, screenshots, or notification pages, because the skill may save outputs and optional FAQ examples locally.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (16)

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The README gives concrete post-ban operational advice such as changing device, IP, and browser profile to 'cold start' a new account. Even if framed as compliance guidance, this meaningfully assists platform enforcement evasion and is unrelated to the stated local drafting-only purpose of the skill. In context, that makes the skill more dangerous because it moves beyond content assistance into account-survival tactics after sanctions.

Intent-Code Divergence

Medium

Confidence: 91%
Finding: The document explicitly tells the operator that the reply does not need AI-content labeling, while the skill metadata claims strict adherence to AI-content marking rules. Even if the author believes private messages/comments are outside a mandatory scope, embedding a categorical 'no labeling needed' instruction can lead users to rely on a potentially incorrect compliance interpretation and systematically omit disclosures.

Intent-Code Divergence

Medium

Confidence: 91%
Finding: Although the document explicitly says the module is not for bypassing AI detectors, it repeatedly ties rewriting to reduced detection risk, reduced 'AI味' ('AI flavor'), and more human-looking output. That creates a practical detector-evasion playbook and may encourage users to conceal AI authorship despite the stated compliance language.

Intent-Code Divergence

Medium

Confidence: 86%
Finding: The file claims final outputs automatically append an AI disclosure and that users must not remove it, but the later workflow and examples omit that enforcement and present copy-ready rewritten text. This inconsistency can lead to undeclared AI-assisted content being produced or copied without the promised disclosure safeguard.
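
One way to close the gap described in this finding is to check every copy-ready output for the disclosure before it is shown to the user. The following is a minimal sketch only: the label text `AI_DISCLOSURE` and both function names are hypothetical, since the report does not show the wording or code the skill actually uses.

```python
# Hypothetical disclosure label; the skill's real wording is not shown in this report.
AI_DISCLOSURE = "[AI-assisted content]"


def ensure_disclosure(text: str, label: str = AI_DISCLOSURE) -> str:
    """Return the text with the disclosure label appended exactly once."""
    body = text.rstrip()
    if body.endswith(label):
        return body
    return f"{body}\n\n{label}"


def assert_disclosed(text: str, label: str = AI_DISCLOSURE) -> None:
    """Raise if a copy-ready output is missing the disclosure label."""
    if not text.rstrip().endswith(label):
        raise ValueError("output is missing the required AI disclosure")
```

Running `assert_disclosed` as the last step of every output path, including the examples and rewritten text, would make the documented safeguard and the actual workflow agree.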

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The broader skill positioning emphasizes compliant local drafting, but this module focuses on removing AI-writing signals and reducing platform detection-related risk. In context, that shifts the tool from neutral editing toward concealment of AI assistance, which undermines the claimed compliance model.

Natural-Language Policy Violations

Medium

Confidence: 85%
Finding: The feature description explicitly advertises making AI-written text 'more like a human wrote it.' Even though the README later disclaims bypass intent, this framing promotes concealment-oriented transformation that can be used to weaken provenance cues and frustrate platform or user expectations around AI-generated content. In a social-platform creator tool, that context increases risk because the rewritten output is intended for public posting.

Natural-Language Policy Violations

Medium

Confidence: 93%
Finding: The README cites an 'AI trace score' dropping from 78 to 32 with a target threshold, which operationalizes reduction of machine-generated indicators. That creates a measurable optimization target for concealment, encouraging users to tune content against detection heuristics even if they are told to disclose AI use elsewhere. This makes the capability more dangerous because it turns a vague editing aid into a repeatable evasion workflow.

Missing User Warnings

Medium

Confidence: 92%
Finding: The file explicitly instructs the skill to append users' final sent replies and question/answer pairs into a persistent local archive, which can retain sensitive conversational content over time. Although the document mentions some automatic desensitization later, it does not clearly warn users at the point of collection about privacy risks, retention scope, or the possibility that pasted custom replies may still contain personal or confidential information.

Vague Triggers

Medium

Confidence: 82%
Finding: The trigger phrases for persisting content to the FAQ library include broad everyday expressions like '记下来' ('note this down'), which can be said conversationally without informed intent to store data. This creates a risk of unintended persistence of private fan messages or generated replies, especially in a DM/comment workflow where content may contain personal or sensitive information.
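
A narrower trigger could require an exact, explicit command plus a separate confirmation, rather than matching an everyday phrase anywhere in a message. The sketch below illustrates that idea under stated assumptions: the command string `/save-to-faq` and the function `should_persist` are hypothetical and do not come from the skill.

```python
# Hypothetical explicit command, assumed for illustration only.
SAVE_COMMAND = "/save-to-faq"


def should_persist(message: str, confirmed: bool) -> bool:
    """Persist only on the exact explicit command plus a separate confirmation."""
    return message.strip() == SAVE_COMMAND and confirmed
```

With this gate, a conversational phrase such as '记下来' does not cause anything to be stored, even if the user has confirmed an unrelated prompt.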

Missing User Warnings

Medium

Confidence: 87%
Finding: The workflow specifies automatic file output and optional appending to a persistent FAQ library, but the document does not clearly warn users at the time of action that conversation content will be written to disk. In a tool handling fan comments, private messages, and screenshots, silent persistence increases privacy and data-retention risk because users may unknowingly store personal data locally.
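
The missing point-of-action warning could take the form of a notice shown before anything is appended to disk, with the write skipped unless the user agrees. This is a rough sketch, not the skill's implementation: `append_with_notice`, its parameters, and the notice wording are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable


def append_with_notice(path: Path, content: str, confirm: Callable[[str], bool]) -> bool:
    """Append content to a local file only after the user accepts a notice.

    `confirm` is a hypothetical callback that shows the notice and returns
    True when the user agrees to the write.
    """
    notice = (
        f"This will save {len(content)} characters to {path} on local disk. "
        "The text may contain personal data. Continue?"
    )
    if not confirm(notice):
        return False  # Nothing is written without explicit consent.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(content)
    return True
```

Passing the confirmation step in as a callback keeps the consent check separate from the file write, so the same gate can sit in front of both the automatic file output and the FAQ library append.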

Vague Triggers

Medium

Confidence: 80%
Finding: The humanizer is configured to run automatically across multiple content-generation paths without narrow gating, which can apply detector-evasion-style rewriting more broadly than the user explicitly requested. That increases the chance of non-transparent AI-assisted content being systematically transformed into apparently human-authored text.

Ssd 2

Medium

Confidence: 88%
Finding: The README tells users that the tool performs '去 AI 化润色' ('de-AI polishing'), i.e., paraphrasing to reduce signs of AI authorship and make text appear more naturally human. This is a classic evasion-adjacent pattern: even without direct platform automation, it helps obscure the origin of generated content and may undermine policy enforcement or audience transparency. Because the skill is specifically for social-media publishing, the misuse path is straightforward.

Ssd 2

Medium

Confidence: 94%
Finding: The example workflow describes identifying '24 classes of AI writing traces,' rewriting each hit, and driving an AI-trace score below a target. That is effectively a recipe for systematically removing detection features from generated text. The surrounding compliance language reduces but does not eliminate the risk; the concrete workflow still teaches users how to conceal machine-generation characteristics.

Ssd 4

Medium

Confidence: 88%
Finding: The narrative says the goal is not to beat detectors, yet it normalizes rewriting because unhumanized content may be limited, downgraded, or reported as AI. That framing still instructs users to adapt content specifically to avoid AI-identification outcomes, which is a form of evasion guidance.

Ssd 4

Medium

Confidence: 94%
Finding: The document provides detailed, operational instructions for systematically removing recognized AI-writing markers, adding human-like imperfections, injecting personal-emotion cues, and scoring outputs by perceived humanness. Those steps materially enable concealment of AI-generated origin and can be misused to bypass platform policy enforcement or mislead readers about authorship.

Ssd 3

Medium

Confidence: 84%
Finding: The text-mode fallback instructs the user to copy and paste an entire notification page into the skill, causing bulk ingestion of all visible commenter data, including nicknames, timestamps, and comment content. Even in a local-only tool, this increases unnecessary collection of third-party personal data and widens the blast radius for retention, logging, or downstream reuse, especially because the same document later supports persistence into a FAQ library.

VirusTotal

65/65 vendors flagged this skill as clean.
