蜂兵虾将

Security checks across malware telemetry and agentic risk

Overview

This skill is not clearly malicious, but it stores and reuses user history, workflow data, and personal profile inferences more broadly than its safeguards explain.

Review before installing, especially if you may use it with confidential business data, healthcare or financial topics, personal goals, or workflow records. Disable or constrain persistent memory, shared sync, background preparation, and confirmation-skipping where possible, and treat stored profile and goal data as sensitive.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (33)

Tp4

High

Category: MCP Tool Poisoning
Confidence: 92%
Finding: The skill markets itself as a content/trend automation assistant, but the documented behavior expands into persistent memory, profiling, workflow retention, and references to external scripts and local storage behaviors that are not clearly disclosed in the top-level description. This mismatch is dangerous because users may consent to a lightweight reporting skill while unknowingly enabling broader data retention or execution-related capabilities, undermining informed consent and increasing the attack surface.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The skill metadata promises broad automated monitoring, content generation, trend analysis, scheduled reporting, and automatic execution across many industries, but this file only demonstrates local console logging and hard-coded agent routing. That mismatch is dangerous because users may overtrust the system, deploy it in sensitive domains, or assume monitoring/execution/reporting is occurring when it is not, leading to missed events, bad decisions, or unsafe operational reliance.

Intent-Code Divergence

Low

Confidence: 83%
Finding: The comments and UI text describe autonomy, dynamic collaboration, and continuous learning, but the implementation uses fixed branching and stores experiences without using them to influence future behavior. While this poses no code-execution risk by itself, it can mislead users about the system's capability, reliability, and level of oversight, especially in contexts where people may defer to purportedly intelligent behavior.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The code analyzes user motivations across psychological dimensions and persists the results to memory, creating a form of personal psychological profiling unrelated to the stated business-monitoring purpose. In this context, undisclosed collection and retention of sensitive inferred attributes increases privacy risk, enables secondary misuse, and violates user expectations about what the skill should do.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The module generates blind-spot assessments, mirror letters, and future-self predictions, which are highly personal evaluative outputs outside the declared scope of industry monitoring and content automation. These features can manipulate user decision-making, involve sensitive inferences about the user's mental state, and store intimate reflections without clear necessity or disclosure.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The code queries goal and value data from memory and synchronizes it into other systems ('signal' and 'workflow') without any visible access control, consent check, or purpose restriction. Cross-system propagation of sensitive personal data materially expands exposure, increases the chance of unauthorized downstream use, and is especially concerning because the skill is marketed for broad industry automation rather than intimate profiling.
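
As a rough illustration of the control this finding says is missing, the sketch below gates propagation on recorded consent per target system. It is not the skill's code: `MemoryRecord`, `records_allowed_to_sync`, `SENSITIVE_KINDS`, and the sample data are hypothetical names chosen for the example; only the target names 'signal' and 'workflow' come from the finding.

```python
from dataclasses import dataclass

# Hypothetical names throughout: the scanned skill's real API is not shown in the report.
SENSITIVE_KINDS = {"goal", "value"}


@dataclass(frozen=True)
class MemoryRecord:
    kind: str      # e.g. "goal", "value", "hotspot"
    content: str


def records_allowed_to_sync(records, target, consented_targets):
    """Return only the records that may be propagated to `target`.

    Sensitive kinds are released only if the user consented to that target;
    every other record passes through unchanged.
    """
    allowed = []
    for record in records:
        if record.kind in SENSITIVE_KINDS and target not in consented_targets:
            continue  # no consent on file for this target: do not propagate
        allowed.append(record)
    return allowed


records = [
    MemoryRecord("goal", "change careers within a year"),
    MemoryRecord("hotspot", "industry trend summary"),
]
to_workflow = records_allowed_to_sync(records, "workflow", consented_targets={"signal"})
to_signal = records_allowed_to_sync(records, "signal", consented_targets={"signal"})
```

Here the goal record reaches 'signal' but is withheld from 'workflow', because consent was recorded for only one of the two targets.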

Intent-Code Divergence

Medium

Confidence: 88%
Finding: The comments claim the example only demonstrates usage and does not change the original system, but the code later writes monitored hotspot data and duty-person names into the shared memory store. This mismatch can cause operators to run the example under false assumptions, leading to unintended persistence of potentially sensitive operational data.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The file is effectively a static demo that prints marketing-style claims about a memory system rather than implementing the advertised capabilities. In a skill promoted as capable of automated monitoring, trend analysis, and execution across sensitive industries, this misrepresentation can cause operators to rely on nonexistent functionality, leading to missed events, bad decisions, or unsafe automation assumptions.

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The code claims 'semantic search' and 'domain constraints' but only performs simple credibility/success-rate filtering and score sorting, without query matching or scenario enforcement. In a system marketed for operational monitoring and decision support, this can produce irrelevant or overbroad results while giving users unjustified confidence in the retrieval logic.
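
The gap can be made concrete with a small sketch. `filter_and_sort` mirrors the behavior the finding describes (thresholds plus score ordering), while `search_with_query` adds a domain constraint and a crude keyword overlap. All names, thresholds, and data are assumptions for illustration, and keyword overlap only stands in for real semantic matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    text: str
    domain: str
    credibility: float
    success_rate: float
    score: float


def filter_and_sort(entries, min_credibility=0.5, min_success_rate=0.5):
    """The behavior the finding describes: thresholds plus score ordering, no query."""
    kept = [
        e for e in entries
        if e.credibility >= min_credibility and e.success_rate >= min_success_rate
    ]
    return sorted(kept, key=lambda e: e.score, reverse=True)


def search_with_query(entries, query, domain):
    """Adds the two missing steps: a domain constraint and a keyword match."""
    terms = set(query.lower().split())
    scoped = [e for e in entries if e.domain == domain]
    matched = [e for e in scoped if terms & set(e.text.lower().split())]
    return filter_and_sort(matched)


entries = [
    Entry("retail pricing trend report", "retail", 0.9, 0.8, 0.95),
    Entry("hospital staffing alert", "healthcare", 0.8, 0.9, 0.70),
]
unscoped = filter_and_sort(entries)
scoped = search_with_query(entries, "staffing alert", "healthcare")
```

With no query or domain in play, the retail entry ranks first for a healthcare question simply because its score is higher; the scoped version returns only the healthcare entry.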

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The file states that it supports bidirectional validation, feedback loops, and dynamic importance adjustment, but no state mutation or recording logic exists; the file contains only printed examples. This is dangerous because users may believe the system learns from outcomes and self-corrects, when in reality failed actions or bad guidance will not reduce future risk.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The file documents a '状态洞察模块' (state insight module) that analyzes energy allocation, growth trajectory, emotional state, and forward-looking insights, then stores those outputs in L3/L4 memory for long-term tracking. For a skill advertised primarily as hotspot monitoring, content creation, trend insight, and work logging, this is an unjustified expansion into sensitive behavioral profiling and creates privacy and misuse risks well beyond the stated purpose.

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: The retention table specifies 90-day L3 storage, 365-day L4 'wisdom memory,' and permanent L3 storage for workflow knowledge. Persisting user history and derived long-term profiles far exceeds the skill's described reporting/automation function and increases the blast radius of data leakage, secondary use, and unauthorized profiling.
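
To show what these windows mean in practice, the sketch below encodes them as a lookup and checks whether a record has expired. The 90-day, 365-day, and permanent values come from the finding; the `RETENTION` table layout, the `"default"` and `"workflow_knowledge"` labels, and `is_expired` are hypothetical and not taken from the skill.

```python
from datetime import datetime, timedelta, timezone

# Windows taken from the finding; the tier and kind labels are illustrative.
RETENTION = {
    ("L3", "default"): timedelta(days=90),
    ("L4", "default"): timedelta(days=365),
    ("L3", "workflow_knowledge"): None,  # None means kept permanently
}


def is_expired(tier, kind, stored_at, now):
    """Return True once a record has outlived its retention window."""
    key = (tier, kind) if (tier, kind) in RETENTION else (tier, "default")
    window = RETENTION[key]  # raises KeyError for a tier with no policy
    if window is None:
        return False  # permanent entries never expire
    return now - stored_at > window


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
stored_at = now - timedelta(days=100)

l3_expired = is_expired("L3", "default", stored_at, now)
l4_expired = is_expired("L4", "default", stored_at, now)
workflow_expired = is_expired("L3", "workflow_knowledge", stored_at, now)
```

A 100-day-old record has expired under the 90-day window, remains in L4, and would never be purged if it is classified as workflow knowledge, which is the case that widens exposure the most.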

Context-Inappropriate Capability

High

Confidence: 98%
Finding: Emotional-state analysis and growth-trajectory modeling are sensitive profiling capabilities that are not necessary for content/workflow automation. In this skill context, they enable covert inference about a user's mental state and behavior patterns, which can be used to manipulate recommendations or expose highly personal information if mishandled.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: The documented query of L3-L4 user history for goals and insights creates a long-term profiling mechanism unrelated to the stated business automation purpose. Because the skill claims broad applicability across many industries, including potentially sensitive ones like healthcare and finance, this unjustified retrieval materially increases privacy, compliance, and misuse risk.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The file is presented as a demo of time-decay logic, but it also performs a state-changing write into the system's long-term memory layer via ai.memory.addToL3(). In an agent skill context, hidden persistence is security-relevant because running an example script can silently alter future agent behavior, poison retained configuration, or create hard-to-trace side effects beyond the current session.

Missing User Warnings

Medium

Confidence: 92%
Finding: The prompt explicitly directs broad collection of information from multiple public platforms and sectors, but it does not disclose privacy, consent, terms-of-service, or data-handling boundaries to the user. Even if sources are public, large-scale aggregation and profiling can expose users and third parties to privacy and compliance risks, especially when paired with relevance scoring and automated reporting.

Missing User Warnings

High

Confidence: 97%
Finding: This module instructs the system to record execution details, user decisions, outputs, and to retain and reuse them in templates and a knowledge base, but provides no notice, consent, retention limit, or access control requirements. That creates a real risk of sensitive user inputs, proprietary workflows, or personal data being stored and repurposed beyond the user's expectations.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill explicitly describes building user profiles, learning preferences, and predicting next actions, but it does not provide a clear privacy notice, consent flow, or explanation of what data is collected, how long it is retained, and how it is used. In a skill intended for broad industry use, this creates privacy and compliance risk because users may be profiled across sessions without meaningful awareness or control.

Missing User Warnings

Medium

Confidence: 91%
Finding: The skill advertises scheduled automatic reports and elsewhere describes background preparation, but does not clearly warn users that it may act proactively without an immediate request. This can surprise users, lead to unintended data processing, and cause actions to occur outside the user's active review window.

Missing User Warnings

Medium

Confidence: 94%
Finding: The document explicitly describes a user-adaptation behavior to 'skip confirmations' based on learned completion patterns, but provides no guardrails, scope limits, or warning that high-risk actions must still require explicit consent. In the context of an 'automatic execution' multi-agent skill intended for broad industry use, this can normalize silent execution of consequential actions and increase the chance of unsafe or unauthorized operations.
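
One possible shape for the missing guardrail is sketched below: learned behavior may suppress prompts for routine actions, but never for high-risk ones. The action labels, the `learned_skip_rate` input, and the 0.9 threshold are assumptions made for the example, not values documented by the skill.

```python
HIGH_RISK_ACTIONS = {"delete", "publish", "send_external", "payment"}  # hypothetical labels


def needs_confirmation(action, learned_skip_rate, skip_threshold=0.9):
    """Decide whether to prompt the user before running `action`.

    High-risk actions always prompt, whatever has been learned about the user.
    Other actions skip the prompt only when the user has approved them without
    changes at least `skip_threshold` of the time.
    """
    if action in HIGH_RISK_ACTIONS:
        return True
    return learned_skip_rate < skip_threshold


prompt_for_delete = needs_confirmation("delete", learned_skip_rate=1.0)
prompt_for_report = needs_confirmation("format_report", learned_skip_rate=0.95)
prompt_for_new_habit = needs_confirmation("format_report", learned_skip_rate=0.5)
```

The point of the design is that the high-risk check runs first, so no amount of learned history can remove the prompt for a consequential action.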

Missing User Warnings

Medium

Confidence: 92%
Finding: The module persistently writes memory contents to several local files (L1/L2/L3/L4 and shared sync files) without any consent, minimization, encryption, or access-control mechanism. Because the stored values are arbitrary and may include prompts, user inputs, business data, or sensitive context, local persistence creates a real confidentiality and privacy risk if the host, logs, backups, or neighboring components are accessed.
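
A minimal sketch of two of the missing controls, minimization and access control, might look like the following. It assumes a POSIX host; `ALLOWED_FIELDS`, `write_memory_file`, and the file name are hypothetical. It does not cover encryption or consent, which would need separate handling.

```python
import json
import os
import stat
import tempfile

ALLOWED_FIELDS = {"topic", "summary"}  # hypothetical allow-list for minimization


def write_memory_file(path, record):
    """Persist only allow-listed fields, in a file readable by its owner alone."""
    minimized = {k: v for k, v in record.items() if k in ALLOWED_FIELDS}
    # Mode 0o600 applies when the file is newly created (POSIX); an existing
    # file keeps whatever permissions it already had.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(minimized, handle)
    return minimized


workdir = tempfile.mkdtemp()
memory_path = os.path.join(workdir, "l3_memory.json")
stored = write_memory_file(
    memory_path,
    {"topic": "pricing", "summary": "weekly digest", "raw_prompt": "confidential text"},
)
mode = stat.S_IMODE(os.stat(memory_path).st_mode)
```

The raw prompt never reaches disk, and on a POSIX system the resulting file carries no group or world permission bits.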

Missing User Warnings

Medium

Confidence: 95%
Finding: Sensitive goal, motivation, blind-spot, and predictive self-assessment data is persisted across multiple memory tiers and later synchronized, yet this file shows no notice, consent flow, retention boundary, or user control. Because the skill's stated purpose does not prepare users for this depth of personal data handling, the undisclosed persistence creates a meaningful privacy and trust violation with potential downstream harm if reused or exposed.

Missing User Warnings

Medium

Confidence: 97%
Finding: The reset instructions tell the user to delete the memory directory but do not clearly warn that this action irreversibly destroys stored data. In an agent/automation context, such guidance can normalize destructive operations and lead to accidental loss of user memory, logs, or historical records.
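
A safer reset flow could list what would be lost and delete only after explicit confirmation, as in the sketch below. `reset_memory`, its `confirmed` flag, and the demo directory are hypothetical; the report does not show the skill's actual reset instructions beyond deleting the memory directory.

```python
import shutil
import tempfile
from pathlib import Path


def reset_memory(memory_dir, confirmed=False):
    """Delete the memory directory only after an explicit confirmation.

    Returns the files that were (or would be) removed so the caller can show
    them to the user first.
    """
    root = Path(memory_dir)
    files = sorted(str(p.relative_to(root)) for p in root.rglob("*") if p.is_file())
    if confirmed:
        shutil.rmtree(root)  # irreversible: nothing is moved to a trash folder
    return files


demo_dir = Path(tempfile.mkdtemp()) / "memory"
demo_dir.mkdir()
(demo_dir / "l3.json").write_text("{}", encoding="utf-8")

preview = reset_memory(demo_dir)                  # dry run, nothing deleted
still_there = demo_dir.exists()
removed = reset_memory(demo_dir, confirmed=True)  # actual deletion
```

The default call is a dry run, so a user who copies the command without reading further sees what would be lost instead of losing it.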

Missing User Warnings

Medium

Confidence: 95%
Finding: The script stores evaluated hotspots, statistics, and duty-person names in shared memory without any notice, consent, retention policy, or access-scope explanation. In operational contexts, monitored topics and staffing identities may be sensitive, so silent persistence increases the risk of privacy leakage, cross-user data exposure, or inappropriate reuse by other agents.

Missing User Warnings

Medium

Confidence: 90%
Finding: The document describes long-term storage and tracking of user state data but provides no user-facing warning, consent flow, or retention explanation. This lack of transparency prevents informed user choice and makes sensitive profiling more dangerous because users may not realize their emotional or historical data is being retained.

VirusTotal

66/66 vendors flagged this skill as clean.
