Security audit

Smart Memory

Security checks across malware telemetry and agentic risk

Overview

Smart Memory appears to be a genuine local memory skill, but it needs review for two reasons: it persistently stores conversation history, and it has broad local execution and install behavior, including loading an embedding model with remote code trusted.

Install only if you want a local service that can retain and later surface conversation history. Review where the SQLite database and hot-memory files are stored, avoid putting secrets in persisted chats, keep the server bound to localhost, and be aware that setup downloads packages and the default embedding model enables trusted remote code execution unless changed.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (44)

subprocess module call

Medium

Category: Dangerous Code Execution
Content:
    for path in ["./smart-memory", "../smart-memory", "./skills/smart-memory"]:
        venv_activate = Path(path) / ".venv/bin/activate"
        if venv_activate.exists():
            subprocess.Popen(
                f"cd {path} && . .venv/bin/activate && python -m uvicorn server:app --host 127.0.0.1 --port 8000 > /tmp/smart-memory-server.log 2>&1 &",
                shell=True,
            )
Confidence: 93% confidence
Finding: subprocess.Popen( f"cd {path} && . .venv/bin/activate && python -m uvicorn server:app --host 127.0.0.1 --port 8000 > /tmp/smart-memory-server.log 2>&1 &", s
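
One possible mitigation for the shell-based launch flagged above is to pass an argument list instead of a shell string, so no path is interpolated into a command line. The sketch below is illustrative only and is not the skill's code: the helper names `build_server_command` and `start_server` are hypothetical, and it assumes the virtual environment's interpreter lives at `.venv/bin/python` inside the checkout.

```python
import subprocess
from pathlib import Path


def build_server_command(checkout: Path) -> list[str]:
    """Return an argv list that runs uvicorn with the checkout's venv interpreter."""
    # Assumption: the interpreter sits at .venv/bin/python inside the checkout.
    python_bin = checkout / ".venv" / "bin" / "python"
    return [
        str(python_bin), "-m", "uvicorn", "server:app",
        "--host", "127.0.0.1", "--port", "8000",
    ]


def start_server(checkout: Path, log_path: Path) -> subprocess.Popen:
    """Start the server without a shell, with an explicit working directory."""
    checkout = checkout.resolve()  # absolute path, independent of the caller's cwd
    log_file = open(log_path, "ab")
    try:
        return subprocess.Popen(
            build_server_command(checkout),
            cwd=checkout,
            stdout=log_file,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # detach from the caller's session (POSIX only)
        )
    finally:
        log_file.close()  # the child keeps its own copy of the descriptor
```

Calling the venv interpreter directly replaces the `cd ... && . .venv/bin/activate` sequence, and `cwd=` replaces the `cd`. This removes the shell from the launch path but does not by itself decide whether auto-starting a server is appropriate.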

Context-Inappropriate Capability

Medium

Confidence: 97% confidence
Finding: The embedder defaults to trust_remote_code=True when loading a Hugging Face/SentenceTransformer model, which can permit execution of repository-provided Python code during model initialization. In a local-memory component, this unnecessarily expands the attack surface: a compromised model repo, poisoned dependency path, or unexpected model override could lead to arbitrary code execution in the agent environment.
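
A minimal sketch of an opt-in default follows, assuming the embedder is built on `sentence_transformers.SentenceTransformer`, which accepts a `trust_remote_code` argument in recent releases. The environment variable `SMART_MEMORY_TRUST_REMOTE_CODE` and both function names are hypothetical and do not come from the skill.

```python
import os


def trust_remote_code_enabled(env=None) -> bool:
    """Return True only when the operator has explicitly opted in."""
    env = os.environ if env is None else env
    # SMART_MEMORY_TRUST_REMOTE_CODE is a hypothetical variable name.
    value = env.get("SMART_MEMORY_TRUST_REMOTE_CODE", "")
    return value.strip().lower() in {"1", "true", "yes"}


def load_embedder(model_name: str, env=None):
    """Load an embedding model with remote code disabled unless opted in."""
    # Imported lazily so the flag logic can be used without the dependency.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(
        model_name,
        trust_remote_code=trust_remote_code_enabled(env),
    )
```

With this default, a model that ships custom repository code would be expected to fail to load until the flag is set, which turns the trust decision into an explicit operator choice.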

Context-Inappropriate Capability

Medium

Confidence: 90% confidence
Finding: The helper silently attempts to start a local memory server when the health check fails, which is a security-relevant side effect for a session primer. In agent environments, automatic process spawning can run unreviewed local code from relative paths and surprise users or operators who expected a read-only initialization step.

Context-Inappropriate Capability

Medium

Confidence: 89% confidence
Finding: The script automatically spawns a shell and launches a detached Python server from a relative directory search path, which creates hidden side effects and trusts local filesystem layout. In a hostile or multi-user environment, this can execute unintended code from a matching directory or start a background service without explicit operator awareness.

Context-Inappropriate Capability

Medium

Confidence: 89% confidence
Finding: The script does more than read memory context: it automatically locates a local checkout, changes into that directory, and launches a long-lived FastAPI server in the background with nohup. Even though it binds to 127.0.0.1, this expands the trust boundary and creates code-execution and persistence behavior in a session-start hook, which is risky if an attacker can place or tamper with one of the searched smart-memory directories.

Description-Behavior Mismatch

Medium

Confidence: 80% confidence
Finding: The service is described as a local transcript-first memory API, but it exposes eval execution endpoints that expand its operational scope beyond simple memory serving. If the underlying evaluation runners load external suites, test cases, or plugins, an attacker could trigger unintended computation, data access, or internal functionality through an API surface that may not be expected or protected.

Description-Behavior Mismatch

Medium

Confidence: 88% confidence
Finding: The postinstall script performs automatic network package installation, including PyTorch from an external index and additional Python dependencies from a requirements file. Even if intended to support the skill's Python backend, this expands installation-time behavior beyond simple local memory functionality and creates supply-chain and unexpected code-execution risk because package install hooks and downloaded artifacts execute with the user's privileges.

Context-Inappropriate Capability

Medium

Confidence: 84% confidence
Finding: A tool presented as a session-memory querying utility also auto-starts a background server process if the health check fails. In an agent-skill context, that expands behavior from passive querying to process creation and persistence, which can be abused or cause unintended execution in environments where users do not expect tools to spawn daemons.

Description-Behavior Mismatch

Medium

Confidence: 91% confidence
Finding: The script accepts the server URL from --server or MEMORY_SERVER_URL and then sends agent identity, message content, projects, and questions to that endpoint over HTTP. This allows silent exfiltration of contextual data to any host if the parameter or environment is influenced by an attacker or misconfiguration, which is especially sensitive for a memory/priming tool.
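
One way to narrow this exposure is to reject any server URL that is not a loopback address before sending data. The sketch below is an assumption-laden illustration, not the script's behavior: `is_loopback_url` and `resolve_server_url` are hypothetical helpers, and the default URL is taken from the host and port shown in the first finding.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url: str) -> bool:
    """Return True for http(s) URLs whose host is localhost or a loopback IP literal."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
        if parts.scheme not in ("http", "https") or not host:
            return False
        if host == "localhost":
            return True
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Malformed URLs and non-literal hostnames are rejected, not resolved.
        return False


def resolve_server_url(cli_value, env, default="http://127.0.0.1:8000"):
    """Pick the URL from --server, then MEMORY_SERVER_URL, then the default."""
    url = cli_value or env.get("MEMORY_SERVER_URL") or default
    if not is_loopback_url(url):
        raise ValueError(f"refusing non-loopback memory server URL: {url!r}")
    return url
```

Hostnames other than `localhost` are refused rather than resolved through DNS, so a name that merely starts with `127.0.0.1` does not pass the check.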

Intent-Code Divergence

Medium

Confidence: 95% confidence
Finding: The method accepts a session_id and reports the rebuild scope as session-scoped, but its implementation clears derived state globally and then replays all transcript messages via list_messages() with no session filtering. This mismatch can cause operators or higher-level components to believe only one session was rebuilt when in reality the entire memory state was destroyed and reconstructed, creating integrity, audit, and availability risks in a persistent memory system.
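
To make the mismatch concrete, here is a sketch of what a rebuild that honors `session_id` could look like. The classes `TranscriptStore` and `DerivedMemory` are hypothetical in-memory stand-ins, and the upper-casing step is a placeholder for whatever derivation the real system performs. Only the names `rebuild_from_transcripts`, `list_messages`, and `session_id` come from the finding.

```python
from collections import defaultdict


class TranscriptStore:
    """Hypothetical in-memory stand-in for the transcript log."""

    def __init__(self):
        self._messages = []

    def append(self, session_id, text):
        self._messages.append({"session_id": session_id, "text": text})

    def list_messages(self, session_id=None):
        """Return all messages, or only those of one session when given."""
        if session_id is None:
            return list(self._messages)
        return [m for m in self._messages if m["session_id"] == session_id]


class DerivedMemory:
    """Hypothetical derived state, keyed by session."""

    def __init__(self, store):
        self.store = store
        self.by_session = defaultdict(list)

    def rebuild_from_transcripts(self, session_id=None):
        """Clear and replay only the requested scope, and report that scope."""
        if session_id is None:
            self.by_session.clear()
            scope = "global"
        else:
            self.by_session.pop(session_id, None)
            scope = "session"
        replayed = 0
        for message in self.store.list_messages(session_id):
            # Placeholder derivation: the real system would extract memories here.
            self.by_session[message["session_id"]].append(message["text"].upper())
            replayed += 1
        return {"scope": scope, "replayed": replayed}
```

The key property is that the reported scope matches what was actually cleared and replayed, so state belonging to other sessions is left untouched by a session-scoped call.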

Intent-Code Divergence

Medium

Confidence: 96% confidence
Finding: The rederive_memory_for_message API suggests targeted rederivation for a single message, but it calls rebuild_from_transcripts(session_id=session_id), which still performs a full clear and full replay of all transcripts. In this skill context, where transcript-backed memory is persistent and shared across sessions, a seemingly narrow operation can trigger unnecessary global state destruction and expensive rebuilds, increasing the chance of denial of service, inconsistent state, and misleading audit records.

Missing User Warnings

Medium

Confidence: 93% confidence
Finding: The skill explicitly instructs the agent to retrieve prior context and inspect transcript-backed memory, including endpoints for memories, evidence, history, and transcripts, but provides no warning, consent boundary, or authorization guidance for handling stored personal or historical data. In a memory-focused skill, this creates a real privacy and data-minimization risk because an agent may over-collect or disclose prior user data simply by following the documented workflow.

Missing User Warnings

Medium

Confidence: 91% confidence
Finding: The guide instructs hosts to send user and assistant messages, transcripts, and memory-derived content over HTTP endpoints without an explicit warning that conversational data is being transmitted to a separate local service. Even when the service is local, this creates a privacy and trust risk because integrators may silently forward sensitive user content, credentials, or proprietary data without disclosure, consent, or minimization.

Missing User Warnings

Medium

Confidence: 75% confidence
Finding: The README emphasizes immutable transcript logging, evidence retention, and rebuild/wipe semantics without clearly warning operators that conversation content may be durably stored and later replayed. In a memory system handling agent transcripts, this can lead to unintentional retention of sensitive data or destructive administrative actions being invoked without understanding data-loss and privacy consequences.

Missing User Warnings

Medium

Confidence: 90% confidence
Finding: The skill prominently describes persistent transcript-first memory, typed long-term memory, and canonical local storage, but does not clearly warn users that conversations may be retained indefinitely and exposed through inspection features. This creates a privacy and compliance risk because users may disclose sensitive personal, credential, or business information without informed consent about retention and retrieval.

Missing User Warnings

Medium

Confidence: 90% confidence
Finding: The code derives new 'belief' records directly from raw episodic/semantic memory whenever preference-like terms appear, and stores them as persistent synthetic memories with non-trivial confidence and reinforcement metadata. In a transcript-first persistent memory system, this can silently infer and retain sensitive user preferences or traits without consent, visibility, provenance safeguards, or validation, which increases privacy risk and can bias future agent behavior based on incorrect inferences.

Missing User Warnings

Medium

Confidence: 94% confidence
Finding: The example explicitly sends both user and assistant messages to a persistence service but provides no notice, consent flow, or data-handling warning. In an agent integration guide, this can lead developers to implement silent transcript retention and transmission of potentially sensitive conversations, creating privacy and compliance risk even if the service is local by default.

Missing User Warnings

Medium

Confidence: 88% confidence
Finding: The documentation lists memory and transcript inspection endpoints that can expose stored conversations, preferences, and evidence chains, but it does not warn about sensitivity or access-control expectations. This increases the chance that operators expose highly revealing debugging endpoints without authentication, role restrictions, or privacy safeguards.

Missing User Warnings

Medium

Confidence: 92% confidence
Finding: Remote code trust is enabled silently, with no warning to operators that model loading may execute untrusted code. That lack of disclosure increases the chance the skill is deployed in sensitive environments under the false assumption that loading an embedding model is data-only, making exploitation more likely if the upstream model supply chain is compromised.

Missing User Warnings

Medium

Confidence: 83% confidence
Finding: The function sends session-start context, including agent identity, user message, and hot-memory state, to the memory server without any consent prompt or disclosure to the end user. Even though the default endpoint is localhost, this still transmits potentially sensitive context to another service and the CLI allows changing `server_url`, increasing the chance of accidental disclosure to a non-local endpoint.

Missing User Warnings

Medium

Confidence: 85% confidence
Finding: Launching the server through `bash -c` as a detached subprocess hides process creation from the user and reduces visibility into what is running in the background. Even though the command string is not directly user-controlled here, silent background execution is risky in an agent skill because it can persist processes and complicate auditing or shutdown.

Missing User Warnings

Medium

Confidence: 91% confidence
Finding: The store persists agent hot-memory, reinforcement metadata, retrieval counts, and memory references to a local JSON file with no visible consent flow, disclosure, retention control, or protection at the point of write. In a transcript-first memory skill, this can silently retain sensitive conversational data and behavioral metadata on disk, increasing privacy and data exposure risk if the host is shared, compromised, or backed up to less secure locations.
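
Protection at the point of write could, for example, mean creating the file with owner-only permissions. The following sketch assumes a POSIX-style permission model, and `write_private_json` is a hypothetical helper rather than part of the skill. It does not address consent, disclosure, or retention.

```python
import json
import os
from pathlib import Path


def write_private_json(path: Path, payload: dict) -> None:
    """Write JSON via a temporary file that only the owner can read or write."""
    tmp_path = path.with_name(path.name + ".tmp")
    # Create the file with mode 0600 so it is never briefly world-readable.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    # Re-apply the mode in case a stale temp file existed with wider permissions.
    os.chmod(tmp_path, 0o600)
    # Swap the finished file into place in one step.
    os.replace(tmp_path, path)
```

File permissions limit exposure to other local users only. Backups and a compromised account, both mentioned in the finding, would need separate controls such as encryption or exclusion rules.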

Missing User Warnings

Medium

Confidence: 89% confidence
Finding: This pipeline persists raw transcript messages and associated metadata to local storage, then processes them for entity extraction, scoring, revision, and long-term memory creation without any visible consent gate, notice mechanism, or minimization control in this code path. Because the skill is explicitly a persistent memory system, it is likely to collect sensitive conversation content by design, making undisclosed retention and secondary processing a real privacy/security risk rather than a purely theoretical issue.

Missing User Warnings

Medium

Confidence: 90% confidence
Finding: This code persists transcript-derived content, extracted entities, relations, emotional metadata, and evidence summaries into long-term memory without any visible consent, minimization, or sensitivity filtering controls in this component. In a transcript-first memory skill, that increases the risk of retaining secrets, personal data, or other sensitive conversation content beyond user expectations, enabling privacy harm if the memory store is later queried, exposed, or reused.
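
A sensitivity filter of the kind this finding says is missing could start as a redaction pass applied before text is persisted. The sketch below is deliberately small and illustrative: `redact_secrets` is a hypothetical function, and the two patterns are examples only. They would not catch most real secrets or personal data.

```python
import re

# Illustrative patterns only; a real filter would need a much broader set.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"(?i)\b(password|api[_-]?key|token)\s*[:=]\s*\S+"),
]


def redact_secrets(text: str) -> str:
    """Replace anything matching a known secret pattern with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```

For instance, `redact_secrets("my password: hunter2 ok")` returns `"my [REDACTED] ok"`. Pattern-based redaction reduces accidental retention but is not a substitute for consent or minimization controls.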

Missing User Warnings

Medium

Confidence: 93% confidence
Finding: The installer unconditionally deletes the target installation directory with `rm -rf "$TARGET_DIR"` before reinstalling, with no confirmation, backup, or validation that the directory only contains safe-to-remove generated content. In a persistent workspace context, this can destroy prior memory data, local modifications, or unrelated files if the computed path is unexpected.
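
A guarded alternative to the unconditional delete might validate the path and require a marker file before removing anything. The sketch is written in Python for consistency with the other examples, although the installer itself uses shell. The function `safe_remove_install_dir` and the marker name `.smart-memory-install` are hypothetical.

```python
import shutil
from pathlib import Path

# Hypothetical marker file that an installer would write into directories it owns.
MARKER_NAME = ".smart-memory-install"


def safe_remove_install_dir(target: Path, allowed_root: Path) -> bool:
    """Delete target only if it is inside allowed_root and carries the marker.

    Returns True if a directory was removed, False if nothing existed.
    """
    target = target.resolve()
    allowed_root = allowed_root.resolve()
    # Refuse the root itself and anything outside it (for example after "..").
    if target == allowed_root or allowed_root not in target.parents:
        raise ValueError(f"{target} is not inside {allowed_root}")
    if not target.exists():
        return False
    if not (target / MARKER_NAME).is_file():
        raise ValueError(f"{target} lacks {MARKER_NAME}; refusing to delete")
    shutil.rmtree(target)
    return True
```

The marker check means a mistakenly computed path that points at an unrelated directory is refused rather than erased. Backing up prior memory data before reinstalling would be a separate step.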

VirusTotal

All 67 of 67 vendors reported this skill as clean.


Static analysis

Detected: suspicious.dangerous_exec, suspicious.env_credential_access

Shell command execution detected (child_process).

Critical

Code: suspicious.dangerous_exec
Location: examples/session-start/nodejs-agent.js:49

Shell command execution detected (child_process).

Critical

Code: suspicious.dangerous_exec
Location: smart-memory/index.js:158

Shell command execution detected (child_process).

Critical

Code: suspicious.dangerous_exec
Location: smart-memory/postinstall.js:14

Environment variable access combined with network send.

Critical

Code: suspicious.env_credential_access
Location: smart-memory/index.js:11