Secretary Memory

Security checks across malware telemetry and agentic risk

Overview

This appears to be a real memory-management skill, but it stores and reuses sensitive conversation data and can generate or modify future skills with weak approval boundaries.

Install only if you intentionally want a long-term local memory system that stores conversation-derived data. Configure the memory directory carefully, avoid storing secrets or sensitive third-party data, disable or review external LLM summarization, do not enable daemon or cron modes casually, and manually inspect any generated skill or trigger before using it.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (46)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 93%
Finding: The skill documents capabilities equivalent to file read/write, shell execution, environment access, and likely network use, but declares no permissions or trust boundaries. That mismatch prevents informed consent and safe policy enforcement, making it easier for a memory-management skill to perform broader system actions than a user would reasonably expect.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: The documented behavior goes beyond memory recall into real-time filesystem monitoring, automatic file moves, symlink creation, and OS-level process inspection. Those actions materially expand the attack surface and can alter local files or observe system state in ways not justified by the top-level description, which is dangerous because users may invoke the skill under a narrower trust assumption.

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: A skill presented as a memory-management tool also includes capability to generate entirely new skills, which is a significant scope expansion. Self-extension features are risky because they can create new executable artifacts and persistence mechanisms outside the user's original intent or review path.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: Automatic skill generation is not necessary for a secretary-style memory system and introduces code/content generation with persistence. In this context, that makes the skill more dangerous because repeated normal conversation patterns could trigger creation of new automation logic that the user never intended to install.

Intent-Code Divergence

Medium

Confidence: 82%
Finding: The module header says it performs append-only preference extraction, but the implementation also mines contact data and forwards full session text into a graph integration. This mismatch undermines transparency and informed consent, making operators more likely to deploy the tool without realizing it performs broader personal-data collection and profiling.

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The contact extraction patterns intentionally pull third-party names, emails, and roles from arbitrary session text, not just the user's own preferences. In a memory skill, conversation logs often include sensitive third-party information, so automatically persisting these fields creates privacy and surveillance risk beyond the stated purpose of preference personalization.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: This sub-skill expands from memory management into autonomous creation and registration of new skills, effectively enabling capability growth beyond the parent skill's declared scope. In an agent environment, self-extension is dangerous because it can introduce unreviewed logic, broaden tool access, and create persistence or trigger pathways that operators did not explicitly authorize.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The documented self-improvement and new-skill generation features create a recursive modification pathway where the system can alter or expand its own behavior over time. That is risky in a memory skill because it turns a data-management component into a capability-evolving agent, increasing the chance of unsafe automation, policy drift, or covert persistence.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: This script expands from skill generation into scanning recent session memory for 'worth remembering' personal or behavioral content, then resurfaces that content as reminders. In a memory-management skill, that behavior materially increases privacy risk because it processes and reuses user conversation data without clear consent, minimization, or sensitivity filtering.

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The self-improvement component does more than analyze usage: it directly modifies generated skill files and metadata based on stored feedback. Any system that rewrites executable or instruction-bearing assets from loosely validated feedback creates an integrity risk, especially because it can change future behavior without review.
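One way to keep rewritten skill files from taking effect without review is to record a fingerprint of each version a human has approved and compare it before use. The sketch below assumes that approach; `ApprovalLedger` and `fingerprint` are hypothetical names, and nothing in the report indicates the skill does this.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(content).hexdigest()


class ApprovalLedger:
    """Hypothetical in-memory record of which file versions a human has reviewed."""

    def __init__(self):
        self._approved = {}

    def approve(self, path, content):
        # Called only after a person has read this exact version.
        self._approved[path] = fingerprint(content)

    def is_approved(self, path, content):
        # Any later rewrite changes the digest, so the file falls out of approval.
        return self._approved.get(path) == fingerprint(content)
```

A real deployment would need to persist the ledger somewhere the self-improvement component cannot write.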

Context-Inappropriate Capability

High

Confidence: 95%
Finding: This is a generic code and file generator that creates new skill directories, markdown instructions, and executable Python scripts from user-supplied parameters. In the context of a memory skill, automatic generation of executable artifacts is especially risky because it can turn conversation-derived or externally influenced input into new runnable components that may later be trusted and executed.
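A generator that builds directories from user-supplied parameters would normally validate the name and require explicit approval before writing anything. The following is a minimal sketch under those assumptions; `plan_skill_directory`, the naming rule, and the `approved` flag are hypothetical and are not taken from the skill's code.

```python
import re
from pathlib import Path

# Hypothetical naming rule: lowercase letters, digits, and hyphens, at most 40 characters.
SKILL_NAME_RE = re.compile(r"[a-z][a-z0-9-]{0,39}")


def plan_skill_directory(skills_root, skill_name, approved=False):
    """Return the directory a new skill would be written to, or raise.

    Nothing is created here; a caller would write files only after this check passes.
    """
    if not SKILL_NAME_RE.fullmatch(skill_name):
        raise ValueError(f"rejected skill name: {skill_name!r}")
    if not approved:
        raise PermissionError("skill generation requires explicit user approval")
    root = Path(skills_root).resolve()
    target = (root / skill_name).resolve()
    # After resolving symlinks, the target must sit directly under the root.
    if target.parent != root:
        raise ValueError("target escapes the skills root")
    return target
```

Rejecting names such as `../evil` before any path is built keeps conversation-derived input from steering where generated files land.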

Intent-Code Divergence

High

Confidence: 99%
Finding: `ProfileMinerIntegration.process_session()` calls `self.model.extract_entities(...)` and other methods, but `self.model` is left as `None` in `__init__` and never initialized. In practice this causes a crash on the session-processing path, which can disable profiling updates or any workflow depending on them, creating a reliable denial-of-service condition rather than direct code execution.
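The failure mode can be reproduced in a few lines. This is a minimal reconstruction rather than the skill's actual code: only the names `ProfileMinerIntegration`, `process_session`, `self.model`, and `extract_entities` come from the finding, while the guarded subclass, `StubModel`, and `crashes` are hypothetical.

```python
class ProfileMinerIntegration:
    """Minimal reconstruction of the pattern in the finding (not the real class)."""

    def __init__(self, model=None):
        # The finding reports that self.model stays None and is never initialized.
        self.model = model

    def process_session(self, session_text):
        # With self.model left as None, this line raises AttributeError.
        return self.model.extract_entities(session_text)


class GuardedProfileMinerIntegration(ProfileMinerIntegration):
    """Hypothetical fix: skip profiling when no model is configured."""

    def process_session(self, session_text):
        if self.model is None:
            return []  # degrade gracefully instead of crashing
        return self.model.extract_entities(session_text)


class StubModel:
    """Hypothetical stand-in for whatever model the integration expects."""

    def extract_entities(self, text):
        # Toy rule for illustration: treat capitalized words as entities.
        return [word for word in text.split() if word.istitle()]


def crashes(integration, text):
    """Report whether process_session fails with AttributeError."""
    try:
        integration.process_session(text)
    except AttributeError:
        return True
    return False
```

Whether returning an empty result or failing loudly is the better behavior depends on how callers use the profiling output, which the report does not describe.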

Vague Triggers

Medium

Confidence: 86%
Finding: The trigger conditions are broad enough to overlap with ordinary conversation about memory, search, or architecture, increasing the chance of accidental activation. Because the skill performs persistent storage, recall, and potentially skill generation, false triggering can cause unintended data retention or side effects from benign user prompts.
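The difference between a broad trigger and an explicit one can be illustrated as follows. The keyword list borrows the topics named in the finding; the skill's real trigger phrases are not shown in this report, and the `/secretary` command is a hypothetical example.

```python
# Topics named in the finding; the skill's actual trigger list is not shown here.
BROAD_KEYWORDS = ("memory", "search", "architecture")


def broad_trigger(message):
    """Fire whenever any keyword appears anywhere in the message."""
    lowered = message.lower()
    return any(keyword in lowered for keyword in BROAD_KEYWORDS)


def explicit_trigger(message, command="/secretary"):
    """Fire only when the message starts with a dedicated command token."""
    parts = message.strip().split(maxsplit=1)
    return bool(parts) and parts[0].lower() == command
```

An ordinary question such as "How much memory does this laptop have?" activates the broad matcher but not the explicit one.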

Missing User Warnings

High

Confidence: 96%
Finding: The skill automatically extracts preferences, builds a relationship graph, and performs cross-session recall without presenting clear privacy notices, consent boundaries, retention limits, or handling rules for sensitive data. In a memory system, this context makes the issue more serious because the entire feature set is built around persistent accumulation and reuse of user-derived information.

Missing User Warnings

Medium

Confidence: 90%
Finding: The auto skill-generation section lacks risk warnings around creating new skills and running related tasks on a schedule. That omission is dangerous because generated skills can expand capabilities and persistence over time, especially if triggered by heuristics rather than explicit, reviewed user requests.

Missing User Warnings

Medium

Confidence: 95%
Finding: The spec explicitly defines automatic extraction and persistent storage of user preferences, habits, and contacts, including an "add only, never modify" ('只增不改') retention model, but does not describe user consent, notice, minimization, editing, or deletion controls. In a memory skill, this creates a real privacy and compliance risk because sensitive personal data can be accumulated indefinitely and later surfaced across sessions or contexts.

Missing User Warnings

Medium

Confidence: 94%
Finding: The spec states that every session is automatically appended to daily logs, with summaries and key decisions retained and later archived, but provides no user warning or consent flow for ongoing retention. In a cross-session memory system, silent logging increases the chance that sensitive, incidental, or regulated information is stored without the user realizing it, expanding privacy exposure over time.
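A retention limit for daily logs could be as simple as selecting files older than a cutoff for deletion or review. The sketch below assumes logs are named `YYYY-MM-DD.md`, which the report does not confirm; `expired_logs` and `keep_days` are hypothetical.

```python
from datetime import date, timedelta


def expired_logs(filenames, today, keep_days=30):
    """Return the daily-log filenames dated before the retention cutoff.

    Assumes names like '2024-01-31.md'; anything else is ignored, not deleted.
    """
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in filenames:
        if not name.endswith(".md"):
            continue
        try:
            logged_on = date.fromisoformat(name[:-3])
        except ValueError:
            continue  # not a dated log file
        if logged_on < cutoff:
            expired.append(name)
    return expired
```

Returning a list instead of deleting directly leaves room for a confirmation step before anything is removed.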

Missing User Warnings

Medium

Confidence: 95%
Finding: The loader is explicitly designed to automatically recall stored memories at session start and inject their contents into the prompt. That creates a real confidentiality risk because previously stored notes, preferences, relationship data, or other sensitive content can be surfaced to the model or downstream tooling without an explicit user-facing disclosure or per-session consent, especially in a memory-management skill whose purpose is cross-session recall.

Missing User Warnings

Medium

Confidence: 90%
Finding: The code reads a memory directory from an environment variable and searches local stored content automatically, which can include sensitive data outside the immediate conversation context. In this skill, that is more dangerous because the overall design emphasizes broad, cross-session memory recall and automated loading, increasing the chance that private local data is processed and inserted into prompts without clear transparency or scope restrictions.
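A scope restriction on the configured directory might look like the sketch below, which refuses an unset variable and any path outside an allowed root. The variable name `SECRETARY_MEMORY_DIR` and the function `resolve_memory_dir` are hypothetical; the report does not say which environment variable the skill reads.

```python
import os
from pathlib import Path


def resolve_memory_dir(env_var="SECRETARY_MEMORY_DIR", allowed_root=None):
    """Return the configured memory directory, or raise if it is unset or out of scope.

    env_var is a hypothetical name used for illustration only.
    """
    raw = os.environ.get(env_var)
    if not raw:
        raise ValueError(f"{env_var} is not set; refusing to guess a memory directory")

    root = Path(allowed_root or Path.home()).resolve()
    candidate = Path(raw).expanduser().resolve()

    # Path.relative_to raises ValueError when candidate is not inside root.
    try:
        candidate.relative_to(root)
    except ValueError:
        raise ValueError(f"{candidate} is outside the allowed root {root}") from None
    if candidate == root:
        raise ValueError("memory directory must be a subdirectory, not the root itself")
    return candidate
```

Resolving both paths before comparing them means `..` segments and symlinks cannot be used to point the search at unrelated local data.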

Missing User Warnings

Medium

Confidence: 95%
Finding: This script recursively searches memory partitions and prints matched file paths and content snippets directly to stdout, including profile, agenda, projects, daily logs, and archived decisions. In a memory-management skill, those stores are likely to contain sensitive personal or organizational data, so automatic recall without access controls, redaction, consent gating, or even a privacy warning materially increases the chance of unintended disclosure to the model, logs, or downstream consumers.

Missing User Warnings

Medium

Confidence: 97%
Finding: The LLM summarization path builds a prompt from indexed memory results, including file paths and content excerpts, and sends that prompt to a configured remote HTTP API. In a memory system, those indexed notes can contain sensitive personal, organizational, or historical data, so transmitting them externally without explicit user consent, clear disclosure, redaction, or allowlisting creates a real confidentiality risk.
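The missing controls named here (consent and redaction) can be sketched as a gate in front of prompt construction. This is an illustration only: the `(path, excerpt)` result shape, the two regular expressions, and the function names are assumptions, and the patterns are far from a complete sensitivity filter.

```python
import re

# Illustrative patterns only; a real filter would need much broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SECRET_RE = re.compile(r"(?i)\b(api[_-]?key|token|password)\b\s*[:=]\s*\S+")


def redact(text):
    """Mask email addresses and simple key=value style secrets."""
    text = EMAIL_RE.sub("[email]", text)
    return SECRET_RE.sub(lambda m: f"{m.group(1)}=[redacted]", text)


def build_summary_prompt(results, query, user_consented):
    """Build the text that would be sent to a remote summarizer.

    results is assumed to be a list of (path, excerpt) pairs.
    """
    if not user_consented:
        raise PermissionError("external summarization requires explicit consent")
    lines = [f"Query: {redact(query)}"]
    for index, (_path, excerpt) in enumerate(results, start=1):
        # File paths are dropped on purpose; only a numeric label leaves the machine.
        lines.append(f"[{index}] {redact(excerpt)}")
    return "\n".join(lines)
```

Keeping the gate inside the prompt builder, not in the HTTP client, means no unredacted prompt exists to be logged or sent by mistake.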

Missing User Warnings

Medium

Confidence: 93%
Finding: This code writes mined preferences, habits, and contacts from session text into persistent files without any explicit notice, consent flow, retention control, or data minimization. Because the source text is conversational and unstructured, users may reveal personal or sensitive information incidentally, which then becomes long-term stored profile data.

Missing User Warnings

Medium

Confidence: 89%
Finding: The script passes full session text to a user-model integration for graph processing, expanding processing beyond simple local extraction. Even if the integration is local, this is additional profiling of conversational content and may capture sensitive relationships or entities without the user's awareness.

Missing User Warnings

Medium

Confidence: 92%
Finding: The code automatically sends retrieved memory search results to an LLM via `self.llm.summarize(results, query)` without any explicit user consent, disclosure, or sensitivity filtering. In a memory system, search results may contain private cross-session notes, preferences, relationship data, or other sensitive content, so silent transmission to an external model/API can create a confidentiality leak.

Missing User Warnings

Medium

Confidence: 93%
Finding: The script persists user-provided session text and extracted summaries to daily markdown files without any explicit consent flow, warning, retention notice, or sensitivity filtering. In a memory-management skill, this creates a real privacy and data-governance risk because users may provide secrets, personal data, or confidential work content that is then stored on disk by default.

VirusTotal

64/64 vendors flagged this skill as clean.
