Security audit

嘴替 Skill

Security checks across malware telemetry and agentic risk

Overview

This prompt-refinement skill is not clearly malicious, but it tries to become a persistent front door for nearly all user requests, and it builds a user profile without documenting consent, retention, or privacy controls.

Install only if you want this skill to influence nearly every conversation. Review and approve any AGENTS.md or SOUL.md changes manually, keep a rollback path, and leave the learning profile disabled unless users explicitly consent to storing their query-pattern preferences.
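One way to follow the manual-review and rollback advice above is to preview any proposed AGENTS.md or SOUL.md change as a diff and keep a backup copy before writing. The sketch below is illustrative only: the function names and the `.bak` suffix are assumptions, not part of the skill or of any agent framework.

```python
import difflib
import shutil
from pathlib import Path


def preview_change(path: Path, proposed: str) -> str:
    """Return a unified diff between the file on disk and the proposed text."""
    current = path.read_text(encoding="utf-8") if path.exists() else ""
    diff = difflib.unified_diff(
        current.splitlines(keepends=True),
        proposed.splitlines(keepends=True),
        fromfile=f"{path.name} (current)",
        tofile=f"{path.name} (proposed)",
    )
    return "".join(diff)


def apply_with_backup(path: Path, proposed: str) -> Path:
    """Write the proposed text and keep the previous version as <name>.bak."""
    backup = path.with_name(path.name + ".bak")
    if path.exists():
        shutil.copy2(path, backup)
    else:
        # No prior file: record an empty backup so rollback restores an empty file.
        backup.write_text("", encoding="utf-8")
    path.write_text(proposed, encoding="utf-8")
    return backup


def rollback(path: Path, backup: Path) -> None:
    """Restore the file from its backup copy."""
    shutil.copy2(backup, path)
```

A reviewer would read the output of `preview_change` first, call `apply_with_backup` only after approving it, and use `rollback` if the new routing rule misbehaves.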

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration

Findings (16)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 88%
Finding: The skill declares no permissions, yet its metadata and installation instructions indicate it writes to local files such as AGENTS.md and SOUL.md. Undeclared file-write capability is dangerous because it can silently alter agent routing or persistence behavior, reducing auditability and enabling policy changes outside the expected permission model.
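The gap this finding describes can be expressed as a comparison between the write targets a skill declares and the ones its instructions actually touch. The sketch below assumes a hypothetical manifest shape (`permissions.write` as a list of file names); the real skill metadata format is not shown in this report and may differ.

```python
def undeclared_writes(manifest: dict, observed_writes: list[str]) -> list[str]:
    """Return write targets that were observed but never declared.

    `manifest` uses an assumed layout: {"permissions": {"write": [...]}}.
    A missing or empty permissions section means nothing is declared.
    """
    permissions = manifest.get("permissions") or {}
    declared = set(permissions.get("write") or [])
    return sorted(set(observed_writes) - declared)
```

For a skill that declares no permissions but instructs the agent to edit AGENTS.md and SOUL.md, both file names would be returned, which is the condition flagged here.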

Tp4

High

Category: MCP Tool Poisoning
Confidence: 91%
Finding: The skill claims to be a query-refinement layer, but its described behavior extends into profile accumulation, local asset loading, test generation, and broad routing logic that can influence downstream agent behavior. Description-behavior mismatch is risky because reviewers and operators may approve it under a narrower trust assumption than what it actually does, allowing hidden state changes or control-plane influence.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The activation guide contradicts the skill metadata: the metadata says requests to skip clarification must still go through this preprocessing layer, while the guide says such cases usually should not activate. This inconsistency can cause unreliable routing and policy bypass, letting users evade the intended preprocessing and produce unreviewed or malformed downstream inputs.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The file explicitly allows persistent per-user profiling by recording historical query patterns and preferences, which expands the skill beyond prompt preprocessing into behavioral data retention. In the context of a universal preprocessor that handles all user queries, this creates broad privacy and scope-creep risk because sensitive inferences can accumulate across sessions without clear consent, minimization, or retention limits.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The tracked fields such as top domains, tool preferences, output preferences, and clarification tolerance are not necessary to perform one-shot query enhancement and therefore represent unjustified collection of user history. Because this skill is positioned as a mandatory front layer for any user question, unnecessary profiling affects all users and increases the blast radius of misuse or leakage.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: This script constructs and emits a user preference profile keyed by user_id, including domains, tool preferences, and output preferences, which introduces behavioral profiling beyond a narrow query-preprocessing function. In the context of a skill that is triggered for essentially all user questions, this broadens data collection scope and creates privacy and purpose-limitation risk, especially if downstream components persist or aggregate the output.
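A narrower alternative to the profiling described in these findings would build a profile only after explicit consent and keep only fields from a small allowlist. The following is a minimal sketch under stated assumptions: the field keys (`output_preferences`, `top_domains`), the event shape, and the allowlist are hypothetical and are not taken from the skill's script.

```python
from collections import Counter
from typing import Optional

# Hypothetical allowlist: keep only what is needed to format answers.
ALLOWED_FIELDS = ("output_preferences",)


def build_profile(user_id: str, events: list[dict], consented: bool) -> Optional[dict]:
    """Build a minimized profile, or return None when the user has not consented."""
    if not consented:
        return None
    profile: dict = {"user_id": user_id}
    for field in ALLOWED_FIELDS:
        # Count only values for allowlisted fields; everything else is ignored.
        counts = Counter(e[field] for e in events if e.get(field) is not None)
        if counts:
            profile[field] = counts.most_common(1)[0][0]
    return profile
```

With this shape, fields such as domains or tool preferences never enter the profile unless someone deliberately adds them to the allowlist, which makes the collection scope reviewable in one place.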

Vague Triggers

High

Confidence: 94%
Finding: The activation condition of 'any user question' is effectively universal interception, causing the skill to process all conversations regardless of context. A globally triggered preprocessor can override user intent, capture sensitive inputs, and become a chokepoint that rewrites or delays downstream requests, increasing the blast radius of any logic flaw or malicious modification.

Vague Triggers

High

Confidence: 95%
Finding: The hard requirement that all user questions must pass through this skill, combined with instructions to update AGENTS.md with the rule, creates persistent global control over agent input handling. This is dangerous in context because the skill is positioned as an always-on mediation layer, so any bug, prompt injection, or biased rewrite can systematically affect every subsequent task and potentially suppress direct user requests.
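For contrast with an always-on rule, an activation check can be scoped to users who opted in and who explicitly invoke the skill. This is only a sketch of the idea: the `/refine` prefix and the `opted_in` flag are hypothetical and do not appear in the skill.

```python
def should_activate(message: str, opted_in: bool, prefix: str = "/refine") -> bool:
    """Activate only for opted-in users who explicitly invoke the skill."""
    if not opted_in:
        return False
    # Any message that does not start with the explicit prefix bypasses the skill.
    return message.strip().lower().startswith(prefix)
```

Under this rule the default path is direct handling of the user's request, and the preprocessor runs only when asked for, which limits the effect of a bug or a biased rewrite to the requests that invoked it.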

Natural-Language Policy Violations

Medium

Confidence: 89%
Finding: The YAML embeds all clarification prompts and the rewrite template in Chinese, which effectively forces a Chinese-language interaction path without any language negotiation or documented locale restriction. This can exclude users, cause misunderstanding of transformed prompts, and lead downstream agents to operate on incorrectly interpreted requirements, though it is not a direct code-execution or data-exfiltration issue.

Natural-Language Policy Violations

Medium

Confidence: 96%
Finding: The file hard-codes user-facing clarify prompts and a rewrite template entirely in Chinese, which can force or strongly bias interaction into Chinese regardless of the user's preferred language. In a universal query preprocessor that intercepts all user prompts, this creates a real safety and usability issue because it can degrade understanding, cause incorrect task reformulation, and exclude users who did not opt into Chinese-language processing.

Natural-Language Policy Violations

Medium

Confidence: 94%
Finding: The template is written entirely in Chinese and appears to steer all rewritten prompts into Chinese output without any user opt-in, locale detection, or documented business justification. Because this skill is a mandatory preprocessing layer for any user question, it can silently change the language of downstream prompts, causing misinterpretation, degraded safety handling, and incorrect task execution for non-Chinese users or downstream systems expecting another language.
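The language findings above share one remedy: select the clarification or rewrite template from the user's locale instead of hard-coding a single language. The sketch below assumes templates are stored in a dictionary keyed by language code and that a default language exists; both are illustrative assumptions, not the skill's actual layout.

```python
def pick_template(templates: dict[str, str], user_locale: str, default: str = "en") -> str:
    """Pick a template for the user's locale.

    Tries the full locale (e.g. "zh-CN"), then its base language ("zh"),
    then the default language.
    """
    for key in (user_locale, user_locale.split("-")[0], default):
        if key in templates:
            return templates[key]
    raise KeyError(f"no template for {user_locale!r} and no {default!r} fallback")
```

A Chinese-speaking user would still receive the existing Chinese templates, while other users would get a template in their own language or in the documented default.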

Vague Triggers

Medium

Confidence: 88%
Finding: The activation conditions are broad enough to match many normal user requests, especially because they include vague triggers like short, colloquial, or professional-domain questions. In a universal preprocessor that runs before downstream routing, overbroad activation can unintentionally intercept benign requests, alter user intent, and expand the skill's control surface across unrelated workflows.

Natural-Language Policy Violations

Medium

Confidence: 91%
Finding: This domain file is entirely written in Chinese and provides only Chinese follow-up and rewrite templates, with no indication that the skill should preserve or adapt to the user's preferred language. In a universal query preprocessor that is supposed to handle any user question, forcing one language can degrade usability, cause misunderstanding of user intent, and create unsafe downstream behavior if prompts are rewritten into a language the user did not request.

Natural-Language Policy Violations

Medium

Confidence: 93%
Finding: The file is entirely authored in Chinese and provides user-facing prompts/templates without any indication that language should follow user preference or explicit opt-in. In a universal query preprocessor, this can override or bias downstream interactions into Chinese, causing loss of user intent fidelity, degraded usability, and potentially unsafe misunderstanding in technical support contexts where precise error details matter.

Missing User Warnings

Medium

Confidence: 94%
Finding: The document describes recording user history and preference data but does not disclose privacy implications, retention, consent requirements, or how stored data is protected. In a skill that may process every user query, silent collection of behavioral preferences can mislead users and create privacy, compliance, and trust risks if the data is exposed or reused beyond user expectations.
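A documented retention limit is one of the disclosures this finding says is missing. As a minimal sketch of what enforcing such a limit could look like, the function below drops stored preference records older than a retention window. The record shape (`stored_at` as a timezone-aware datetime) and the 30-day default are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone


def purge_expired(records: list[dict], now: datetime, retention_days: int = 30) -> list[dict]:
    """Keep only records stored within the retention window ending at `now`."""
    cutoff = now - timedelta(days=retention_days)
    return [r for r in records if r["stored_at"] >= cutoff]
```

Running a purge like this on every profile read or write would let the skill state a concrete retention period to users rather than keeping history indefinitely.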

Natural-Language Policy Violations

Medium

Confidence: 94%
Finding: The output contract hard-codes Chinese-only response templates for both clarification and approval flows, without indicating that language should follow user preference or task context. In a universal query preprocessor that sits in front of all user requests, this can cause downstream misunderstanding, incorrect handoff data, degraded accessibility, and failure when downstream systems or users expect another language.

VirusTotal

59/59 vendors flagged this skill as clean.
