Mulch

Security checks across malware telemetry and agentic risk

Overview

The main Mulch memory workflow is coherent, but the package also includes broader, under-disclosed behaviors that should be reviewed before installation.

Install only if you are comfortable with a persistent, git-tracked project memory workflow and you can review or remove the bundled visual-explainer directory. Prefer project-local hooks with restrictive matchers, avoid global always-on hook setup, disable or ignore Telegram notification configuration unless you explicitly want it, and review generated learnings before committing them to a repository.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (35)

Tp4

High

Category: MCP Tool Poisoning
Confidence: 92%
Finding: The skill is presented as a simple self-improvement/reminder mechanism, but the described behavior extends into auto-detection, hook-based inspection of tool output, skill extraction, benchmarking, and external notifications. That gap reduces informed consent and can cause operators to enable broader data collection or execution pathways than they expected, increasing the chance of privacy leaks or unsafe automation.

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The preloaded domains significantly expand the scope of data the skill is prepared to capture, including customer, marketing, sales, competitors, crypto, and system-specific categories that go beyond a narrow self-improvement/coding-assistance purpose. In a memory- or learning-oriented skill, this broader taxonomy can encourage retention and organization of sensitive business, personal, or profiling data without clear minimization boundaries, increasing privacy and misuse risk across sessions.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: Enabling Telegram notifications introduces outbound communication behavior that is not described in the skill metadata, creating a hidden data egress path. For a self-improvement skill that captures learnings across sessions, notifications could expose prompts, errors, corrections, or other sensitive operational details to an external channel, making the undeclared integration especially risky.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The document's security section materially misrepresents behavior by claiming the scripts only output text and do not run commands, even though the configured hook type is explicitly `command` and the guide also instructs users to invoke an extraction script that creates files. Misstating execution semantics can cause operators to underestimate trust and permission boundaries, increasing the chance they enable hooks in sensitive environments without proper review.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The prompt explicitly instructs the agent to mine conversation history and local cross-session memory files for rationale, even though that data may contain sensitive project context, prior user inputs, or unrelated secrets. In this skill context, that broad data access is not necessary for producing a diff review and increases the chance of unintended disclosure into the generated HTML or subsequent tool outputs.

Context-Inappropriate Capability

Medium

Confidence: 83%
Finding: This prompt performs broad repository analysis, documentation checks, architecture review, and visualization tasks that exceed the stated self-improvement/Mulch learning-capture purpose. While not inherently malicious, this scope expansion violates the principle of least privilege and can cause the skill to inspect far more code and metadata than users would expect from the parent skill, increasing exposure of sensitive repository contents.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The prompt authorizes optional use of an external CLI to generate AI images, which can trigger networked behavior and expand the skill's operational scope beyond simple diagram generation. This increases attack surface through external service interaction, possible data exfiltration of prompt content, and non-deterministic side effects that are not clearly bounded by the skill's stated purpose in the provided metadata.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: This file's behavior is materially inconsistent with the declared 'self-improvement' skill purpose: it loads a different skill, generates standalone HTML, writes files under the agent directory, and opens a browser. That mismatch is dangerous because users or orchestrators may grant trust or permissions based on the manifest description, while the embedded prompt performs unrelated actions with filesystem and UI side effects.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The prompt directs the agent to do more than generate a review: it persists output to a fixed directory under the user's home and launches a browser. Those are side effects not necessary for producing HTML, and they can surprise users, expose sensitive plan/code details in a predictable location, and trigger unintended local actions.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: Invoking `surf gemini --generate-image` introduces an unnecessary external capability for a plan-review task. This can send derived information about the codebase or plan to an outside tool/service, increase the attack surface, and create nondeterministic behavior unrelated to the core review function.

Context-Inappropriate Capability

Low

Confidence: 96%
Finding: Automatically opening a browser is an unjustified capability escalation for a prompt whose stated purpose is to generate a review artifact. Even if low severity on its own, it can cause unexpected execution flow and expose local paths or data to browser history or extensions, and it violates the principle of least surprise for a documentation task.

Context-Inappropriate Capability

Medium

Confidence: 86%
Finding: The prompt authorizes use of an external image-generation CLI (`surf gemini --generate-image`) for a recap task that can be completed from local project artifacts alone. This expands the skill's data exposure and execution surface unnecessarily, and depending on how the tool is configured, project context or prompts could be sent to an external service without clear user consent.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: Instructing the agent to mine same-session conversation history broadens collection beyond repository artifacts into potentially sensitive user inputs, credentials, or unrelated task context. For a project recap, this creates unnecessary overreach and can cause confidential chat content to be surfaced in generated output or fed into downstream tools.

Vague Triggers

Medium

Confidence: 88%
Finding: The auto-detection logic is triggered by generic conversational patterns such as corrections, retries, and failures without clear scope boundaries. Overbroad triggers can capture normal user dialogue and operational context as 'learnings,' causing unintended persistence of sensitive, misleading, or low-quality information in the shared memory store.

Vague Triggers

Medium

Confidence: 86%
Finding: Using everyday phrases as examples for activation without contextual guards encourages implementations that treat common conversation as a signal for persistence. In practice, this can lead to accidental logging of user corrections, speculation, or private details into git-tracked project memory where they may later be surfaced or shared.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill states that users are notified via Telegram when a learning is recorded, but it does not warn that data may be transmitted to an external service or describe what content leaves the local environment. Any external notification path can leak sensitive project names, errors, prompts, paths, or learnings if payloads are not strictly minimized and disclosed.
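One way to bound what leaves the local environment is to build the notification from an explicit allowlist of fields. The following is a minimal sketch, not the skill's actual notification code: the field names (`event`, `domain`, `description`) and the record layout are hypothetical, and no network call is made.

```python
# Hypothetical allowlist: only these fields may appear in an outbound notification.
ALLOWED_FIELDS = ("event", "domain")


def minimal_notification(record: dict) -> str:
    """Build a notification string from allowlisted fields only.

    Free-text fields (descriptions, errors, paths, prompts) are dropped
    rather than redacted, so nothing unexpected can slip through.
    """
    parts = [f"{key}={record[key]}" for key in ALLOWED_FIELDS if key in record]
    return "learning recorded: " + ", ".join(parts)


# Hypothetical record; the description stands in for sensitive free text.
record = {
    "event": "record",
    "domain": "cli",
    "description": "token rejected while running /home/user/project/deploy.sh",
}

print(minimal_notification(record))
```

An allowlist is chosen over pattern-based redaction because it fails closed: a new or unexpected field is omitted by default instead of being transmitted.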

Vague Triggers

Medium

Confidence: 89%
Finding: The dedicated detection-trigger section again defines broad conditions like corrections, missing capability, or better approach as automatic record candidates, which are common during normal agent use. In a git-tracked append-only store, such broad activation increases the risk of retaining sensitive workflow details, user intent, or incorrect intermediate conclusions beyond their intended scope.

Missing User Warnings

Medium

Confidence: 74%
Finding: The script-oriented template encourages inclusion of executable helpers but does not require authors to document side effects, required privileges, network access, file mutations, or handling of sensitive data. In an agent skill ecosystem, that omission can cause users or agents to invoke scripts with incomplete understanding, increasing the risk of destructive actions, credential exposure, or unsafe automation.

Vague Triggers

Medium

Confidence: 93%
Finding: Using an empty matcher on `UserPromptSubmit` makes the hook fire for every prompt, creating unnecessary persistence and exposing all session input to the hook path regardless of relevance. In a self-improvement skill, this broad trigger is more dangerous because it normalizes always-on interception of user activity and increases the blast radius if the script is changed or compromised.
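Before enabling any hook, a reviewer can scan the settings file for entries that fire unconditionally. The sketch below is illustrative only and assumes a settings layout in which each event name maps to a list of entries carrying a `matcher` and a list of `hooks`; the scan report does not reproduce the actual file, and the script paths are placeholders. Treating `*` and `.*` as catch-alls is also an assumption.

```python
import json


def find_unconditional_hooks(settings: dict) -> list[tuple[str, str]]:
    """Return (event, command) pairs whose matcher is missing, empty, or a catch-all."""
    findings = []
    for event, entries in settings.get("hooks", {}).items():
        for entry in entries:
            matcher = (entry.get("matcher") or "").strip()
            # Assumption: an empty matcher, "*", or ".*" all mean "match everything".
            if matcher in ("", "*", ".*"):
                for hook in entry.get("hooks", []):
                    findings.append((event, hook.get("command", "<no command>")))
    return findings


# Hypothetical settings fragment; paths are placeholders, not the skill's real scripts.
example = json.loads("""
{
  "hooks": {
    "UserPromptSubmit": [
      {"matcher": "", "hooks": [{"type": "command", "command": "./scripts/example-hook.sh"}]}
    ],
    "PostToolUse": [
      {"matcher": "Bash", "hooks": [{"type": "command", "command": "./scripts/example-check.sh"}]}
    ]
  }
}
""")

for event, command in find_unconditional_hooks(example):
    print(f"{event}: runs {command!r} with no matcher restriction")
```

Running the same check against both the project-level and user-level settings files shows whether an always-on hook is scoped to one repository or applies to every project.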

Vague Triggers

High

Confidence: 96%
Finding: The guide recommends user-level global activation with an empty matcher, causing the hook to run across all projects and prompts rather than within a reviewed project boundary. This materially increases risk because a single installation creates cross-project session persistence and broad data exposure, including unrelated repositories and potentially sensitive prompts.

Vague Triggers

Medium

Confidence: 92%
Finding: The 'minimal setup' still uses an empty matcher, so the hook remains effectively unconditional despite being framed as lower overhead. Presenting this as minimal may mislead users into thinking risk is reduced when the main security issue, always-on triggering, remains unchanged.

Vague Triggers

Medium

Confidence: 93%
Finding: The Codex example repeats the empty matcher pattern, extending broad prompt interception to another provider and encouraging insecure defaults across ecosystems. Because this is documentation intended for copy-paste setup, the insecure matcher is likely to be propagated widely without users understanding the scope of activation.

Vague Triggers

Medium

Confidence: 86%
Finding: The automatic activation rule for 'about to dump a complex table in the terminal (4+ rows or 3+ columns)' is underspecified and likely to match many normal agent responses. That can lead to surprising behavior such as generating HTML files and launching a browser without clear user intent, which is risky in security-sensitive or automation-heavy environments.
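To see how broad the quoted rule is, it helps to read it literally as a predicate. This is a sketch of that literal reading, assuming "4+ rows or 3+ columns" means a plain logical OR over the two counts; the skill's real detection logic is not shown in the report and may differ.

```python
def table_rule_matches(rows: int, columns: int) -> bool:
    """Literal reading of the rule: 4 or more rows, OR 3 or more columns."""
    return rows >= 4 or columns >= 3


# Small, everyday tables that an agent might print in a terminal.
samples = {
    "2x3 status table": (2, 3),
    "4x1 file list": (4, 1),
    "3x2 key/value table": (3, 2),
}

for label, (rows, columns) in samples.items():
    verdict = "triggers" if table_rule_matches(rows, columns) else "does not trigger"
    print(f"{label}: {verdict}")
```

Under this reading, a two-row table with three columns or a four-item single-column list is enough to activate HTML generation, which supports the finding that the rule matches many ordinary responses.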

Vague Triggers

Medium

Confidence: 88%
Finding: The skill's invocation criteria are very broad, covering many common tasks like diagrams, comparisons, recaps, and even proactively replacing some terminal tables with HTML. This can cause the skill to activate in situations where the user did not clearly ask for file generation or browser-based output, increasing the chance of unintended side effects and mode switching.

VirusTotal

64/64 vendors flagged this skill as clean.