Workbuddy Add Memory

Security checks across malware telemetry and agentic risk

Overview

The skill is a real memory-management tool, but it asks for and uses broader local access, installation, persistence, and agent-behavior control than its description clearly discloses.

Install only if you are comfortable with a memory skill recursively reading broad WorkBuddy directories, caching indexes, writing local reports, and influencing the agent's workflow. Review and narrow memory_sources before use, avoid pointing it at folders containing credentials or private project files, and treat install_and_test.sh and fix_imports.py as executable maintenance scripts that should be run only after manual review.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration

Findings (47)

os.system() or os exec-family call

High

Category: Dangerous Code Execution
Content: with open(test_file, 'w', encoding='utf-8') as f: f.write(test_script) os.system(f"cd {skill_dir} && python3 test_fix_result.py") # 清理测试文件 [clean up the test file] if os.path.exists(test_file):
Confidence: 95%
Finding: os.system(f"cd {skill_dir} && python3 test_fix_result.py")
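
A lower-risk way to launch the same script is sketched below. It is an illustration, not the package's code: the helper name run_test_script and the 60-second timeout are assumptions. It passes an argument list and a working directory to subprocess.run, so the contents of skill_dir are never interpreted by a shell.

```python
import subprocess
import sys
from pathlib import Path


def run_test_script(skill_dir: str, script_name: str = "test_fix_result.py") -> int:
    """Run a script located in skill_dir without going through a shell.

    Hypothetical helper: the name and the timeout are illustrative only.
    """
    workdir = Path(skill_dir).resolve()
    script = workdir / script_name
    if not script.is_file():
        raise FileNotFoundError(f"expected script not found: {script}")
    # An argument list plus cwd replaces the "cd ... && python3 ..." string,
    # so spaces or metacharacters in skill_dir cannot change the command.
    completed = subprocess.run(
        [sys.executable, str(script)],
        cwd=workdir,
        check=False,
        timeout=60,
    )
    return completed.returncode
```

The return code is handed back to the caller, so a failing test script can be reported as a failure instead of being ignored.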

Lp3

Medium

Category: MCP Least Privilege
Confidence: 90%
Finding: The skill declares no permissions, yet the broader package reportedly uses environment access, filesystem read/write, network, and shell capabilities. This is dangerous because users and platforms may grant trust based on the manifest, while the actual behavior can modify user data, execute commands, or fetch/install code without informed consent. The in-file claims such as '已通过安全审计' ("passed a security audit") and '无系统命令执行/无网络请求' ("no system command execution / no network requests") increase concern because they conflict with the reported capabilities rather than reducing risk.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 97%
Finding: The documented purpose is limited to memory distillation, retrieval, and pre-work recall, but the analyzed package reportedly also installs dependencies, runs shell setup steps, scans user directories, rewrites source imports, enforces usage behavior, hooks normal conversation, and writes reports/reminder files. This mismatch is dangerous because it conceals materially broader authority and side effects, preventing meaningful user consent and increasing the chance of privacy violations, persistence, or unintended workspace modification.

Description-Behavior Mismatch

Medium

Confidence: 84%
Finding: The hidden 'skill_reminder' compliance mode introduces undisclosed agent self-monitoring and workflow enforcement outside the stated memory-management purpose. In this context, that is dangerous because it covertly alters agent behavior and can pressure the system into following rigid tool-invocation paths, increasing susceptibility to prompt-injection-like control through user phrasing such as '@skill' reminders.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The code performs self-policing and emits prescriptive commitments like '绝不忘记' ("never forget") and '立即调用use_skill' ("immediately call use_skill"), which are unrelated to memory retrieval and effectively act as behavior-shaping instructions. This is risky because a skill should not embed hidden policy overrides or coercive workflow controls; doing so can manipulate downstream agent decisions and reduce adherence to higher-priority safety or system policies.

Intent-Code Divergence

Medium

Confidence: 98%
Finding: The script prints a success summary stating all checks passed even though earlier checks can fail, only warn, or raise exceptions without changing the final verdict. In a verification script, this creates a dangerous integrity gap: operators or downstream automation may trust a compromised, incomplete, or failed installation because the final output falsely attests success.
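
One way to close this gap is to derive the final verdict from the recorded check results. The sketch below is an assumption about how such a script could be organized; run_checks and summarize are hypothetical names, not functions from the analyzed package. A check that raises is counted as failed, and an empty run is not reported as success.

```python
from typing import Callable, Dict


def run_checks(checks: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run each named check and record whether it passed."""
    results: Dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception as exc:
            # An exception counts as a failure instead of being swallowed.
            print(f"[FAIL] {name}: raised {exc!r}")
            results[name] = False
            continue
        print(f"[{'PASS' if results[name] else 'FAIL'}] {name}")
    return results


def summarize(results: Dict[str, bool]) -> int:
    """Print a verdict based on the results and return an exit code."""
    if not results:
        print("No checks ran")
        return 1
    failed = [name for name, ok in results.items() if not ok]
    if failed:
        print(f"{len(failed)} of {len(results)} checks failed: {', '.join(failed)}")
        return 1
    print(f"All {len(results)} checks passed")
    return 0
```

Passing the value returned by summarize to sys.exit lets downstream automation detect a failed installation from the process exit code.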

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: This file performs bulk source rewriting and maintenance actions that are outside the declared scope of a WorkBuddy memory-management skill. Scope mismatch matters because hidden file-modification capability can alter other skill behavior, complicate review, and create an unexpected persistence/modification surface inside an ostensibly benign package.

Context-Inappropriate Capability

High

Confidence: 94%
Finding: The file writes a new Python script and then executes it, providing code-generation-plus-execution capability that is not justified by the advertised feature set. In the context of an agent skill, unexpected code execution is especially risky because it can be repurposed to run arbitrary logic during installation, debugging, or maintenance flows.

Description-Behavior Mismatch

Medium

Confidence: 83%
Finding: The retriever recursively ingests every supported file under configured directories, which can unintentionally pull in unrelated sensitive documents, notes, or configuration artifacts into the searchable memory corpus. In the context of a memory skill, that broad collection scope increases the chance of over-collection and later disclosure through search results, especially because file contents are returned directly to callers.
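
A narrower collection policy might look like the following sketch. Everything in it is an assumption made for illustration: the function name iter_memory_files, the suffix allowlist, the blocked directory names, and the filename hints do not come from the analyzed package. It limits recursion depth, prunes sensitive directories, and skips files whose names suggest credentials.

```python
import os
from pathlib import Path
from typing import Iterable, Iterator

# Hypothetical policy values; tune them to the actual memory layout.
ALLOWED_SUFFIXES = {".md", ".txt"}
BLOCKED_DIRS = {".ssh", ".git", ".aws", "node_modules"}
BLOCKED_NAME_HINTS = ("credential", "secret", "token", ".env")


def iter_memory_files(sources: Iterable[str], max_depth: int = 1) -> Iterator[Path]:
    """Yield allowlisted files at most max_depth directory levels below each source."""
    for source in sources:
        root = Path(source).resolve()
        if not root.is_dir():
            continue
        # os.walk does not follow symlinked directories by default.
        for dirpath, dirnames, filenames in os.walk(root):
            depth = len(Path(dirpath).relative_to(root).parts)
            if depth >= max_depth:
                dirnames[:] = []  # stop descending any further
            else:
                # Prune blocked directories in place so they are never entered.
                dirnames[:] = sorted(d for d in dirnames if d not in BLOCKED_DIRS)
            for name in sorted(filenames):
                path = Path(dirpath) / name
                lowered = name.lower()
                if any(hint in lowered for hint in BLOCKED_NAME_HINTS):
                    continue
                if path.is_symlink() or path.suffix.lower() not in ALLOWED_SUFFIXES:
                    continue
                yield path
```

Name-based filtering is only a heuristic. It reduces accidental over-collection but does not replace pointing memory_sources at a dedicated directory.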

Context-Inappropriate Capability

Low

Confidence: 89%
Finding: The parser captures full source paths and file timestamps, and later exposes them in returned results. That can leak filesystem layout, filenames, and activity timing to downstream components or users who only needed memory content, creating unnecessary information disclosure.
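
The disclosure could be reduced by trimming each result before it is returned. The sketch below assumes a result is a dictionary with hypothetical keys source_path, modified_at, and created_at; the real field names in the package are not known from this report.

```python
from pathlib import Path
from typing import Any, Dict

# Hypothetical timestamp fields to drop from returned results.
TIMESTAMP_FIELDS = ("modified_at", "created_at")


def redact_result(result: Dict[str, Any], source_root: str) -> Dict[str, Any]:
    """Return a copy of result without timestamps and with a relative path."""
    redacted = {k: v for k, v in result.items() if k not in TIMESTAMP_FIELDS}
    path = result.get("source_path")
    if path is not None:
        try:
            # Expose only the location relative to the configured source.
            redacted["source_path"] = Path(path).relative_to(source_root).as_posix()
        except ValueError:
            # Path lies outside the source root: keep the file name only.
            redacted["source_path"] = Path(path).name
    return redacted
```

The original dictionary is left unmodified, so internal code can still use the full path while callers only see the reduced view.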

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The memory record shows retrieval from broad local skill backup directories and unrelated operational documents such as GitHub, upload, and SSH setup guides, which exceeds the stated purpose of memory management. This creates an unnecessary data exposure channel where sensitive local content can be indexed, surfaced, or reused in later agent behavior without clear scoping or user approval.

Context-Inappropriate Capability

Low

Confidence: 84%
Finding: The record exposes absolute local filesystem paths, workspace structure, and environment metadata that are not clearly necessary for the advertised functionality. Even if not directly secret, this information can aid profiling of the host environment, reveal user names and directory layouts, and increase the blast radius of any downstream prompt leakage or logging issue.

Description-Behavior Mismatch

Low

Confidence: 85%
Finding: The script writes a local markdown report containing workspace path details, memory source listings, and excerpts of retrieved memory content. In a memory-management skill, this creates an unnecessary secondary copy of potentially sensitive information, increasing exposure if the filesystem is shared, synced, or later accessed by other tools or users.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The script writes standalone JSON and Markdown files containing workspace paths, memory source information, and previews of retrieved memory content. In a memory-management skill, exporting retrieved content to broad report files increases the chance of unintended data disclosure, especially if memories contain sensitive notes, credentials, internal project details, or personal data.

Description-Behavior Mismatch

Medium

Confidence: 85%
Finding: This module's behavior exceeds the stated memory-management purpose by performing broad workspace inspection and environment probing. Scope expansion is dangerous in an agent skill because users may grant trust based on the manifest, while the code gathers extra system context that can expose sensitive local information and enlarge the attack surface.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The module generates and persists detailed work plans, reports, and execution artifacts that are unrelated to the advertised memory feature set. In an agent context, undisclosed capability expansion increases risk because it can store sensitive task and memory content on disk without users expecting such retention.

Context-Inappropriate Capability

Medium

Confidence: 84%
Finding: The orchestration flow includes generalized task analysis, workspace checks, environment checks, resource preparation, and reporting beyond a narrow memory-management role. In a skill ecosystem, this mismatch is risky because it normalizes hidden capabilities that can collect, infer, and persist unrelated user or system data.

Vague Triggers

Medium

Confidence: 80%
Finding: The skill description promises broad automatic handling of shared memory and unified management of all memories, but it does not define precise trigger conditions, exclusions, or consent boundaries. In an assistant context, vague auto-activation can cause the skill to process content unexpectedly, capture unrelated conversations, or act on tasks the user did not intend to delegate to it.

Vague Triggers

Medium

Confidence: 84%
Finding: The task-detection language is very broad, covering questions, instructions, and task requests across multiple patterns, which can overlap heavily with ordinary conversation. That makes accidental activation more likely, especially when combined with conversation hooks, leading to unanticipated processing, retrieval, or file-writing actions triggered by casual chat.
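
For contrast, a narrow trigger can require an explicit command instead of matching general conversational patterns. The sketch below is hypothetical: the /memory command and its recall and save actions are invented for illustration and are not part of the analyzed skill.

```python
import re
from typing import Optional, Tuple

# Hypothetical explicit command; ordinary chat never matches it.
EXPLICIT_TRIGGER = re.compile(r"^\s*/memory\s+(recall|save)\b\s*(.*)$", re.I | re.S)


def parse_trigger(message: str) -> Optional[Tuple[str, str]]:
    """Return (action, argument) for an explicit command, else None."""
    match = EXPLICIT_TRIGGER.match(message)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2).strip()
```

With this shape, a question such as "can you remember what I said?" returns None and no retrieval or file-writing path is entered.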

Missing User Warnings

Medium

Confidence: 87%
Finding: The hook loads, indexes, and searches memory sources automatically, but this file provides no user-facing disclosure or consent path. In a memory-management skill, automatic access to historical data is expected functionally, yet it still creates a privacy risk because users may not realize their prior content is being searched and surfaced opportunistically.

Missing User Warnings

Medium

Confidence: 93%
Finding: User messages, user IDs, timestamps, and context are stored in conversation history without any notice, retention policy, or consent mechanism. That is a real privacy vulnerability because it accumulates potentially sensitive conversation data and makes later disclosure or misuse more likely, especially when combined with summary and retrieval features.
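
A basic retention policy could be sketched as follows. The entry layout is an assumption: each history entry is taken to be a dictionary with a timezone-aware timestamp and an optional user_id, and the 30-day default is arbitrary. The function prune_history is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Optional


def prune_history(
    entries: List[Dict[str, Any]],
    max_age_days: int = 30,
    now: Optional[datetime] = None,
) -> List[Dict[str, Any]]:
    """Drop entries older than max_age_days and strip the user identifier."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    kept = []
    for entry in entries:
        if entry["timestamp"] < cutoff:
            continue  # past the retention window
        kept.append({k: v for k, v in entry.items() if k != "user_id"})
    return kept
```

Running such a step whenever history is loaded or saved bounds how long conversation data stays on disk. It does not by itself provide user notice or consent.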

Missing User Warnings

Medium

Confidence: 91%
Finding: The script executes a shell command without explicit warning, confirmation, or safe-mode controls. Silent execution is dangerous in agent/tooling contexts because users may believe they are only fixing imports, while the script is also launching code, which increases the chance of unintended side effects or abuse.
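
A maintenance script can make execution explicit by showing the command and asking before it runs. The following sketch is an assumption about how that might look; confirm_and_run and its dry_run flag are hypothetical and do not exist in the analyzed package.

```python
import shlex
import subprocess
from typing import Callable, Sequence


def confirm_and_run(
    argv: Sequence[str],
    ask: Callable[[str], str] = input,
    dry_run: bool = False,
) -> bool:
    """Show the command, ask for confirmation, and run it only on 'y' or 'yes'."""
    print(f"About to run: {shlex.join(argv)}")
    if dry_run:
        print("Dry run: command not executed")
        return False
    answer = ask("Proceed? [y/N] ").strip().lower()
    if answer not in ("y", "yes"):
        print("Skipped")
        return False
    # Argument list, no shell; check=True raises if the command fails.
    subprocess.run(list(argv), check=True)
    return True
```

The default answer is "no", so pressing Enter skips execution. The ask parameter exists so that non-interactive callers and tests can supply their own confirmation source.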

Missing User Warnings

Medium

Confidence: 93%
Finding: The workflow explicitly instructs the agent to run local Python scripts and use local tooling, but provides no warning that these actions may execute arbitrary code, access local files, or transmit/store task content. In an agent-skill context, normalizing direct script execution without safety boundaries increases the chance of unintended code execution and data exposure.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill directs the agent to store errors, lessons, and successful experiences in memory via update_memory, but omits any privacy, sensitivity, consent, or retention guidance. This creates a real risk that secrets, personal data, or confidential work information will be persisted indefinitely or inappropriately, especially because the workflow frames memory storage as mandatory.
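
A sensitivity filter applied before anything is persisted could be sketched as below. The patterns are illustrative assumptions and far from exhaustive, and redact_secrets is a hypothetical helper. The signature of update_memory is not known from this report, so the sketch only prepares the text and does not call it.

```python
import re
from typing import Tuple

# Illustrative patterns only; real secret detection needs a broader rule set.
_PATTERNS = [
    (
        re.compile(
            r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
            re.S,
        ),
        "[REDACTED PRIVATE KEY]",
    ),
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED AWS KEY ID]"),
    (
        re.compile(r"(?i)\b(password|passwd|secret|token|api[_-]?key)\b(\s*[:=]\s*)(\S+)"),
        r"\1\2[REDACTED]",
    ),
]


def redact_secrets(text: str) -> Tuple[str, int]:
    """Return the text with likely secrets masked and the number of replacements."""
    total = 0
    for pattern, replacement in _PATTERNS:
        text, count = pattern.subn(replacement, text)
        total += count
    return text, total
```

A caller could refuse to store the entry, or ask the user first, whenever the returned count is greater than zero.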

Missing User Warnings

Medium

Confidence: 81%
Finding: The retriever writes index/state artifacts to disk without any explicit consent or sensitivity filtering, and those artifacts include metadata and potentially content-derived structures from user memories. In a memory-management skill, silent persistence is more dangerous because the data being processed is likely personal or work-sensitive, and local disk caches can outlive the session and be accessed by other local processes or users.

VirusTotal

56/56 vendors flagged this skill as clean.
