Memory Optimization

Security checks across malware telemetry and agentic risk

Overview

This memory skill is not clearly malicious, but it needs review because it can read past agent sessions, send memory content to external model APIs, and persist or mutate memory data with under-scoped controls.

Install only if you are comfortable with a memory tool that may inspect prior agent sessions, infer durable preferences, write persistent KG/cache files, and send selected memory or transcript content to configured model APIs. Review or remove the gstack and subagent guidance, prefer dry-run modes first, restrict API keys and KG paths, and avoid running extraction or consolidation on sessions containing secrets unless you add redaction and explicit consent controls.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (44)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 91%
Finding: The skill advertises substantial capabilities including file read/write, shell execution, environment-variable use, and networked API access, but does not declare permissions or boundaries. That creates a trust and review gap: users may invoke it expecting simple local memory management while it can access credentials, modify files, and send data externally.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: The documented purpose is memory optimization, but the manifest also describes broader telemetry, benchmarking, session-history scanning, preference inference, embedding-based processing, archival management, and network-backed extraction. This mismatch can cause operators to approve or deploy the skill under a narrower risk assumption than its actual behavior, increasing the chance of unauthorized data collection or destructive actions.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The file broadens a memory-management skill into general external-tool orchestration by directing use of an unrelated gstack web-browsing ecosystem. This increases the skill's authority and attack surface beyond its stated purpose, making it easier to trigger unnecessary networked actions or pull in unreviewed capabilities under the guise of memory optimization.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The instructions tell the agent/user to clone and set up an external GitHub repository even though that behavior is not necessary for the advertised memory-optimization function. Pulling and installing third-party code creates a supply-chain and overreach risk, especially when embedded in routine repository guidance where it may be followed automatically or without scrutiny.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: Skill-usage tracking and session-history scanning go beyond core memory organization and can capture behavioral metadata about users and agents. In a memory skill context this is more sensitive because users may not expect operational telemetry to be collected alongside notes and decisions.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: Documenting external LLM API configuration for KG extraction means local memory and session content may be transmitted to a third-party service. In a memory-management skill, that materially changes the data exposure model because stored memories often contain sensitive project history, decisions, and user preferences.

Context-Inappropriate Capability

Medium

Confidence: 78%
Finding: The schema explicitly supports storing personal contact data such as email and phone numbers, which is not clearly necessary for a memory-optimization skill. In agent memory systems, unnecessary PII fields increase the chance of over-collection, long-term retention, accidental exposure in summaries, or misuse by downstream components that treat the ontology as approved storage.

Description-Behavior Mismatch

High

Confidence: 96%
Finding: This file defines a broad autonomous organizational structure with cognitive, technical, social, and economic departments, which is materially unrelated to the stated purpose of memory optimization. In a skill that should manage summaries, memory files, and retention workflows, introducing generalized subagents expands authority and behavior scope, creating unnecessary capability creep and increasing the chance of misuse, prompt injection propagation, or unauthorized actions under the guise of memory management.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: The configuration grants or contemplates capabilities for development, testing, deployment, social interaction, finance, market analysis, and compliance inside a memory-management skill without a clear functional need. This mismatch is dangerous because attackers or downstream agents could leverage the extra capability surface to perform operations far beyond memory handling, especially where tool permissions, coordination rules, and emergency escalation mechanisms are already defined.

Description-Behavior Mismatch

High

Confidence: 97%
Finding: This file defines a full hierarchical multi-agent command-and-control structure that does not align with the stated purpose of a memory-optimization skill. In a skill package, capability drift like this is dangerous because it can silently expand authority, routing, and task orchestration behavior beyond what a user or reviewer expects, increasing the chance of unauthorized delegation, data spread, and misuse of the host agent.

Context-Inappropriate Capability

High

Confidence: 96%
Finding: The file grants broad powers such as direct tasking of any agent, cross-department coordination, performance evaluation, criticism flows, and replacement logic, none of which are necessary for memory management. If consumed by an agent platform, these instructions could normalize excessive control surfaces and permit unintended spawning, delegation, or governance behaviors that expand operational reach and complicate security review.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: `unlink_entity_from_concept` reports success without actually deleting the backing `instance_of` relation from the relation store. This creates inconsistent graph state where authorization, search, or inference logic that relies on relations will continue to treat the entity as linked even after callers believe it was removed, which can lead to stale associations, policy bypass, or data integrity failures in a memory/knowledge system.
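The following is a minimal sketch of an unlink operation that keeps the index and the relation store consistent. The `GraphStore` class, its `concept_index` and `relations` fields, and the triple layout are hypothetical stand-ins, since the report does not show the skill's real data structures; only the function name `unlink_entity_from_concept` and the `instance_of` relation type come from the finding.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GraphStore:
    """Hypothetical in-memory graph; not the skill's actual storage layer."""

    # entity_id -> set of concept_ids (a lookup index)
    concept_index: dict[str, set[str]] = field(default_factory=dict)
    # (source, relation_type, target) triples
    relations: list[tuple[str, str, str]] = field(default_factory=list)

    def link_entity_to_concept(self, entity_id: str, concept_id: str) -> None:
        self.concept_index.setdefault(entity_id, set()).add(concept_id)
        triple = (entity_id, "instance_of", concept_id)
        if triple not in self.relations:
            self.relations.append(triple)

    def unlink_entity_from_concept(self, entity_id: str, concept_id: str) -> bool:
        """Remove the link from BOTH the index and the relation store.

        Returns True only if something was actually removed.
        """
        triple = (entity_id, "instance_of", concept_id)
        had_relation = triple in self.relations
        self.relations = [r for r in self.relations if r != triple]

        concepts = self.concept_index.get(entity_id, set())
        had_index_entry = concept_id in concepts
        concepts.discard(concept_id)

        return had_relation or had_index_entry
```

Returning `False` when nothing was removed lets callers distinguish a real unlink from a no-op, which is the signal the finding says is currently misleading.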

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The code imports and uses an LLM client, indicating dependence on an external model service that is not disclosed in the skill metadata. In a memory-management skill, this matters because users may reasonably expect local maintenance of stored memories, not transmission to a third party, creating an undisclosed data-exposure risk.

Description-Behavior Mismatch

Medium

Confidence: 98%
Finding: This code builds a prompt from entity titles, content, and tags and sends it to `self.llm_client.call(messages)`, which can exfiltrate potentially sensitive stored memory to an external service. Because the feature processes accumulated episodic memory, the transmitted data may include secrets, personal data, internal decisions, or other context far beyond what users expect from 'memory optimization'.
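One possible mitigation is to redact obvious secrets and gate prompt construction behind explicit consent before anything reaches the model client. The sketch below is illustrative only: `redact`, `build_prompt`, the `allow_external` flag, the entity dictionary keys, and the two regular expressions are assumptions, and pattern-based redaction will miss many kinds of sensitive content.

```python
import re

# Hypothetical, deliberately simple patterns; real redaction needs a broader set.
SECRET_PATTERNS = [
    # key=value or key: value pairs with secret-looking key names
    re.compile(r"(?i)\b(api[_-]?key|token|password|secret)\b\s*[:=]\s*\S+"),
    # email addresses
    re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
]


def redact(text: str) -> str:
    """Replace every match of the patterns above with a fixed placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def build_prompt(entities: list, allow_external: bool) -> str:
    """Build a redacted prompt, refusing unless the caller has opted in."""
    if not allow_external:
        raise PermissionError("external LLM calls require explicit consent")
    lines = [f"{redact(e['title'])}: {redact(e['content'])}" for e in entities]
    return "\n".join(lines)
```

A caller would pass the returned string to the model client only after `build_prompt` succeeds, so a missing consent flag fails closed rather than silently transmitting memory content.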

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The script sends parsed agent session content to an external LLM service for extraction. Those sessions can contain sensitive user prompts, secrets, internal reasoning artifacts, or proprietary data, and this exfiltration behavior is materially broader than simple local memory maintenance. In a memory tool, this is especially risky because the data source is historical conversations, which are often the most privacy-sensitive corpus available.

Context-Inappropriate Capability

Low

Confidence: 82%
Finding: The loader imports every key from the workspace .env into process environment without allowlisting or validation. This broadens the script's access to unrelated credentials and configuration, increasing the blast radius if the process, dependencies, or logging paths are compromised.
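An allowlisted loader could look roughly like the sketch below. It parses `.env` text into a dictionary and never touches `os.environ`, leaving it to the caller to decide what to export. The function name `parse_env_allowlisted` and the key names in `ALLOWED_KEYS` are hypothetical; the report does not list the keys the skill actually needs.

```python
# Hypothetical allowlist; substitute the keys the skill genuinely requires.
ALLOWED_KEYS = frozenset({"KG_DIR", "LLM_API_KEY"})


def parse_env_allowlisted(text: str, allowed=ALLOWED_KEYS) -> dict:
    """Parse KEY=VALUE lines and keep only allowlisted keys.

    Blank lines, comment lines, and lines without '=' are skipped.
    The result is returned to the caller; os.environ is left untouched.
    """
    result = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key in allowed:
            # Strip surrounding whitespace and simple quote characters.
            result[key] = value.strip().strip('"').strip("'")
    return result
```

Because unrelated credentials are dropped at parse time, a compromised dependency or a verbose log line has less to leak.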

Intent-Code Divergence

Medium

Confidence: 90%
Finding: The confidence mapping is inverted at the lowest end: very low numeric confidence is labeled 'verified' instead of a weak-confidence state. That can cause unreliable or speculative findings to be stored and later trusted as highly certain knowledge, poisoning downstream decisions or automation.
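A monotonic mapping avoids this class of bug: higher scores should never receive a weaker label than lower scores. In the sketch below, only the label 'verified' comes from the finding; the function name `confidence_label`, the other labels, and the thresholds are assumptions chosen for illustration.

```python
def confidence_label(score: float) -> str:
    """Map a numeric confidence in [0, 1] to a label, strongest first.

    Thresholds are checked from high to low, so a very low score can
    only ever fall through to the weakest label.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {score}")
    if score >= 0.9:
        return "verified"
    if score >= 0.6:
        return "likely"
    if score >= 0.3:
        return "tentative"
    return "speculative"
```

A small property check, such as asserting that label strength never decreases as the score rises, would catch a regression of this kind in tests.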

Intent-Code Divergence

Medium

Confidence: 90%
Finding: The same inverted mapping exists for lessons learned, causing low-confidence extracted lessons to be marked 'verified'. In a memory/knowledge system, incorrect certainty labels can entrench false guidance and make future agent behavior less reliable or actively harmful.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The module reads an arbitrary workspace .env file and injects every parsed key/value into os.environ at import time, which creates global side effects before the caller opts in. In a skill that may run inside larger agent processes, this can alter authentication, networking, subprocess, or library behavior unexpectedly and lets anyone who can modify the workspace .env influence runtime configuration beyond the ontology feature set.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The code comments claim KG_DIR is constrained to the workspace root, but ALLOW_ANY_KG_DIR disables that protection entirely. If an attacker can influence environment variables or the loaded .env file, they can redirect ontology reads/writes outside the workspace to arbitrary filesystem locations, enabling unintended file access or corruption.
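A containment check with no bypass flag might be sketched as follows. The helper name `resolve_kg_dir` and its parameters are hypothetical; the sketch deliberately omits any equivalent of ALLOW_ANY_KG_DIR, on the assumption that an override, if needed at all, should be an explicit argument rather than an environment variable.

```python
from pathlib import Path


def resolve_kg_dir(candidate: str, workspace_root: str) -> Path:
    """Resolve candidate against the workspace root and reject escapes.

    Both paths are fully resolved first, so '..' segments and symlinks
    cannot be used to step outside the root. An absolute candidate
    replaces the root when joined and is rejected unless it already
    lies inside the workspace.
    """
    root = Path(workspace_root).resolve()
    target = (root / candidate).resolve()
    try:
        target.relative_to(root)
    except ValueError:
        raise ValueError(f"KG_DIR {target} escapes workspace root {root}") from None
    return target
```

For example, `resolve_kg_dir("memory/kg", root)` returns a path under `root`, while `resolve_kg_dir("../outside", root)` raises `ValueError`.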

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The script builds a persistent user preference profiling mechanism from session-derived entities and inferred similarity patterns, creating new Preference entities with confidence, provenance, and retention-related fields. In a memory-optimization skill, this materially increases privacy risk because it transforms conversational/task history into durable behavioral profiles without any visible consent gate, minimization rule, or scope restriction.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The script scans agent session history under ~/.openclaw/agents/main/sessions and derives skill-usage analytics from unrelated transcripts, which expands its data access beyond simple memory maintenance. In a memory-optimization context this creates unnecessary surveillance of historical interactions and can expose sensitive commands or behavioral metadata without clear user consent or scope limitation.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: This code reads JSONL session transcripts and inspects toolCall command arguments to infer which skills were used. Even if only counts are persisted, the access pattern touches unrelated session data and commands, which may contain sensitive operational details, making the feature an unjustified privacy-invasive capability for the stated skill purpose.

Description-Behavior Mismatch

Medium

Confidence: 96%
Finding: Level 3 compression is labeled and stored as session-specific memory, but it does not use the session content at all; it pulls from the global knowledge graph and attaches the result to the provided session_id. This breaks data isolation and can cause a session artifact to contain unrelated facts from other sessions or users, creating confidentiality and integrity issues in a memory system intended to reconstruct per-session context.
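One way to restore isolation is to filter facts by their originating session before compressing. The sketch below assumes each fact carries a `session_id` provenance field along with `text` and `confidence`; those field names, the `compress_level3` function, and the `max_facts` cap are hypothetical, since the report does not show the skill's fact schema.

```python
def compress_level3(facts: list, session_id: str, max_facts: int = 5) -> dict:
    """Build a Level 3 summary from facts recorded for ONE session only.

    Facts without a matching session_id are excluded, so the artifact
    cannot carry knowledge from other sessions or users.
    """
    scoped = [f for f in facts if f.get("session_id") == session_id]
    # Keep the highest-confidence facts; sorted() is stable for ties.
    scoped = sorted(scoped, key=lambda f: f.get("confidence", 0.0), reverse=True)
    return {
        "session_id": session_id,
        "level": 3,
        "facts": [f["text"] for f in scoped[:max_facts]],
    }
```

If the knowledge graph does not record per-fact provenance today, adding it would be a prerequisite for this kind of scoping.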

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: Because Level 3 stores global KG facts under an arbitrary session_id, the recover path can later return those facts as if they belong to that session. This can disclose unrelated internal knowledge to someone recovering a session and is particularly risky in an agent memory tool where users expect session-local context reconstruction, not cross-session leakage.

VirusTotal

66/66 vendors flagged this skill as clean.
