Agent Memory System v12

Security checks across malware telemetry and agentic risk

Overview

This is a legitimate-looking agent memory system, but it handles highly sensitive long-term memory, profiling, sync, and external AI calls with several under-enforced safety controls.

Install only if you are comfortable with a high-trust memory layer. Use local-only embeddings/LLMs where possible, keep provider API keys out of shared shells and logs, disable profiling unless users explicitly consent, avoid federation/sync until peers and tenants are tightly scoped, and verify deletion/export/backup behavior before storing sensitive or regulated data.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (184)

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The file expands a memory-system skill into broad multimodal processing through third-party LLM APIs, which materially changes the skill's effective data-handling scope. That mismatch can mislead users about what data may leave the local environment and increases the chance that sensitive media or memory content is sent to external services without clear expectation.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: Automatic detection and fallback across multiple cloud model providers broadens outbound data flow beyond what a user may expect from a memory feature. If any provider credentials are present, the system may implicitly route prompts or media to external services, increasing the risk of unintended disclosure and weakening user control over data residency.
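
One way to narrow this routing is to require an explicit allowlist before any cloud provider is used, so that the mere presence of a credential is not enough. The sketch below is illustrative only: `select_provider`, the `allow_external_providers` config key, and the provider names are hypothetical and do not come from the scanned skill.

```python
def select_provider(config, available_keys):
    """Pick a model provider, defaulting to local processing.

    config: dict that may hold a hypothetical "allow_external_providers" list.
    available_keys: set of provider names for which a credential was found.
    """
    allowed = config.get("allow_external_providers", [])
    for name in allowed:
        # A provider is used only if it is both allowlisted and configured.
        if name in available_keys:
            return name
    # No opt-in means nothing leaves the machine.
    return "local"
```

With this shape, a stray API key in the environment does not change where data is sent; only an operator's explicit configuration does.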

Intent-Code Divergence

Medium

Confidence: 89%
Finding: The security claims state private memories are protected, but the `recall` method does not enforce the `include_private` control itself and instead fully trusts `store.query_agent_memories` to apply authorization correctly. In a shared-memory system handling sensitive agent data, any mistake or bypass in the store layer could expose private memories across agents, creating a confidentiality boundary failure.
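
A defense-in-depth fix would have `recall` re-apply the privacy rule to whatever the store returns, rather than relying on the store alone. The following is a minimal sketch under assumptions: the `Memory` record, its fields, and `filter_recalled` are hypothetical, since the report does not show the skill's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    owner_id: str
    text: str
    private: bool = False


def filter_recalled(memories, caller_id, include_private=False):
    """Re-check visibility on rows already returned by the store layer."""
    visible = []
    for m in memories:
        if not m.private:
            visible.append(m)
        # Private rows go only to their owner, and only when requested.
        elif include_private and m.owner_id == caller_id:
            visible.append(m)
    return visible
```

Even if the store query has a bug, another agent's private rows are dropped before they reach the caller.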

Description-Behavior Mismatch

Medium

Confidence: 84%
Finding: The agent profile endpoint returns aggregated statistics for an arbitrary agent_id without any visible authorization check tying the requested profile to the caller. In this file, access control appears to rely on middleware, but the handler itself does not enforce per-agent ownership, which can enable authenticated users to enumerate or inspect other agents' metadata and behavioral summaries.
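
A handler-level ownership check could look roughly like the sketch below. The names `Forbidden`, `authorize_profile_access`, and `admin_ids` are hypothetical; the report does not describe how the skill identifies callers, so the caller ID is assumed to come from already-verified authentication.

```python
class Forbidden(Exception):
    """Raised when a caller requests a profile it does not own."""


def authorize_profile_access(caller_agent_id, requested_agent_id, admin_ids=frozenset()):
    """Return the requested ID if the caller may read it, else raise."""
    if caller_agent_id == requested_agent_id or caller_agent_id in admin_ids:
        return requested_agent_id
    raise Forbidden(f"{caller_agent_id!r} may not read profile {requested_agent_id!r}")
```

Calling this at the top of the handler keeps the check next to the data access instead of leaving it entirely to middleware.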

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: Automatically registering a new agent on a GET profile lookup allows state-changing identity creation through a read-like endpoint. An attacker can create arbitrary agent records via unaudited enumeration, polluting identity data and potentially preparing for later authorization or data-isolation abuses if other components trust registered agents.
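
Separating the read path from registration avoids this side effect. The sketch below assumes an in-memory `registry` dict and an `audit_log` list purely for illustration; both names, and the profile fields, are hypothetical.

```python
def get_profile(registry, agent_id):
    """Read-only lookup: returns None for unknown agents, never creates one."""
    return registry.get(agent_id)


def register_agent(registry, agent_id, audit_log):
    """Explicit, audited registration, meant for a state-changing endpoint."""
    if agent_id in registry:
        raise ValueError(f"agent {agent_id!r} already registered")
    registry[agent_id] = {"agent_id": agent_id, "memory_count": 0}
    audit_log.append(("agent_registered", agent_id))
    return registry[agent_id]
```

An unknown ID on the GET path would then map to a not-found response rather than a new record.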

Context-Inappropriate Capability

High

Confidence: 90%
Finding: Personality profiling from chat transcripts creates a high-risk sensitive-data processing pipeline that can infer intimate attributes from user communications. Even with authenticated access, this materially increases privacy harm, secondary-use risk, and regulatory exposure beyond what users would reasonably expect from a memory service.

Context-Inappropriate Capability

Medium

Confidence: 77%
Finding: Federation and distributed sync expand the trust boundary from a local memory API to cross-node or cross-agent coordination, increasing the chance of unintended data propagation and abuse. In a memory-system context, these features are more dangerous because they can multiply the blast radius of any authorization or isolation mistake.

Intent-Code Divergence

Medium

Confidence: 98%
Finding: The readiness probe references an undefined object, which can cause failures or incorrect readiness reporting. While this is primarily an availability and operational integrity issue rather than a confidentiality breach, broken readiness checks can lead to unhealthy instances receiving traffic or repeated restart loops.
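
A readiness probe can be written so that a broken dependency reference yields an explicit "not ready" result instead of an unhandled error. This is a generic sketch, not the skill's code: `readiness` and the check names are hypothetical.

```python
def readiness(checks):
    """Run named zero-argument checks and report overall readiness.

    Any exception, including a NameError from an undefined object,
    marks that check as failed rather than crashing the probe.
    """
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return {"ready": all(results.values()), "checks": results}
```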

Context-Inappropriate Capability

Medium

Confidence: 78%
Finding: The restore path will download a backup archive from remote storage and then unpack and restore it into the local database if a local copy is absent. This creates an external data-import channel into the memory database without authenticity verification, so a compromised bucket, misconfiguration, or malicious object could replace local state with attacker-controlled data.
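
One common mitigation is to authenticate the archive before unpacking it, for example with an HMAC tag computed at backup time. The sketch below uses only the Python standard library; `sign_backup`, `verify_backup`, and the idea that a secret key is available outside the bucket are assumptions, not details from the skill.

```python
import hashlib
import hmac


def sign_backup(archive_bytes: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag to store alongside the backup."""
    return hmac.new(key, archive_bytes, hashlib.sha256).hexdigest()


def verify_backup(archive_bytes: bytes, key: bytes, expected_tag: str) -> bool:
    """Constant-time comparison of the recomputed tag with the stored one."""
    return hmac.compare_digest(sign_backup(archive_bytes, key), expected_tag)
```

A restore routine would call `verify_backup` on the downloaded bytes and refuse to unpack on a mismatch. The key must not live in the same bucket as the archive, or a bucket compromise defeats the check.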

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The code sends raw stored memory content to an arbitrary `llm_fn` callback for causal analysis. Because this skill is a memory system and memories can contain sensitive personal or confidential data, forwarding full text to an external model/provider without strict consent, minimization, or locality guarantees creates a real data exfiltration/privacy risk.
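
A rough sketch of consent gating plus minimization before calling `llm_fn` is shown below. Only `llm_fn` comes from the report; `minimize`, `analyze`, `allow_external`, and the two regular expressions are hypothetical and deliberately simplistic. Real redaction would need much broader patterns.

```python
import re

# Illustrative patterns only; they catch simple emails and prefixed tokens.
_EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
_SECRET = re.compile(r"\b(?:sk|pk|ghp)_[A-Za-z0-9]{8,}\b")


def minimize(text, max_chars=500):
    """Mask obvious identifiers and truncate before text leaves the process."""
    text = _EMAIL.sub("[email]", text)
    text = _SECRET.sub("[secret]", text)
    return text[:max_chars]


def analyze(memories, llm_fn, allow_external=False):
    """Forward minimized memory text to llm_fn only after explicit opt-in."""
    if not allow_external:
        raise PermissionError("external analysis requires explicit opt-in")
    return llm_fn([minimize(m) for m in memories])
```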

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: Batch contradiction verification similarly transmits stored memory text to an LLM for semantic judgment. This expands model-mediated access to historical memory contents, potentially including large volumes of sensitive data, without demonstrated necessity or clear boundary controls.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The init flow prints raw environment variable values directly to stdout while showing configuration status. Environment variables commonly hold API keys, tokens, endpoints, or personally identifying configuration, so this can leak secrets into terminals, logs, screenshots, shell history capture tools, or CI output.
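
A configuration status display usually only needs to show whether a variable is set, not what it contains. The helper below is a minimal sketch; `config_status` is a hypothetical name and the variable names passed to it would be whatever the skill actually reads.

```python
import os


def config_status(names, environ=None):
    """Map each variable name to "set" or "unset" without exposing values."""
    env = os.environ if environ is None else environ
    return {name: "set" if env.get(name) else "unset" for name in names}
```

Printing this mapping keeps secrets out of terminals, logs, and CI output while still telling the operator what is configured.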

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: This repeats the same sensitive behavior during first-run setup, disclosing current environment variable contents to the console without redaction. Repetition increases exposure likelihood because initialization is a common workflow and often run in shared terminals, recorded demos, or automated environments.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The CLI helper automatically starts a background model server process based only on environment/config state, which creates side effects from a utility function that callers may not expect. In a security-sensitive or constrained environment, unexpected process spawning can bypass operator intent, increase attack surface, and lead to unauthorized local service exposure or resource consumption.

Context-Inappropriate Capability

High

Confidence: 91%
Finding: This file includes direct tenant suspension and reactivation primitives with no visible authorization, audit enforcement, or caller validation. In a memory-focused skill, embedding account-control actions increases blast radius: if these methods are exposed through an agent tool surface or reachable by untrusted workflows, an attacker or prompt-induced misuse could disable service for arbitrary tenants, causing denial of service and administrative abuse.
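
The missing controls could be sketched as a role check plus an audit record on every attempt. Everything here is hypothetical: the `tenant:admin` role name, the `actor` dict shape, and the in-memory `tenants` and `audit_log` structures stand in for whatever the skill really uses.

```python
def suspend_tenant(tenants, tenant_id, actor, audit_log):
    """Suspend a tenant only for authorized actors, auditing both outcomes."""
    if "tenant:admin" not in actor.get("roles", ()):
        audit_log.append(("suspend_denied", actor["id"], tenant_id))
        raise PermissionError(f"{actor['id']!r} may not suspend tenants")
    tenants[tenant_id] = "suspended"
    audit_log.append(("suspended", actor["id"], tenant_id))
```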

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: This module performs user profiling and simulated-response generation from stored memories, which materially expands functionality from memory storage/retrieval into behavioral inference. In a memory-system context, this is dangerous because it enables sensitive personality and preference profiling that users may not reasonably expect, increasing privacy, consent, and misuse risk even without classic code execution issues.

Intent-Code Divergence

High

Confidence: 99%
Finding: The docstring explicitly states personality analysis should only run with explicit user consent via AGENT_MEMORY_PERSONALITY_ANALYSIS_ENABLED, but no such enforcement exists in profiling paths like build_unified_profile, build_user_profile, or simulate_response. This creates a direct privacy-control bypass: operators may believe consent gating exists while the code silently performs profiling anyway.
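
The documented gate could be enforced with a decorator applied to each profiling entry point. In the sketch below, the environment variable name and `build_user_profile` come from the finding; the decorator, the accepted truthy values, and the placeholder function body are assumptions.

```python
import functools
import os

FLAG = "AGENT_MEMORY_PERSONALITY_ANALYSIS_ENABLED"


def profiling_enabled(environ=None):
    """Treat the flag as on only for a small set of truthy strings (assumed)."""
    env = os.environ if environ is None else environ
    return env.get(FLAG, "").strip().lower() in {"1", "true", "yes"}


def requires_profiling_consent(func):
    """Refuse to run the wrapped profiling function unless the flag is set."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if not profiling_enabled():
            raise PermissionError(f"{FLAG} is not enabled")
        return func(*args, **kwargs)
    return wrapper


@requires_profiling_consent
def build_user_profile(user_id):
    # Placeholder body; the real function is not shown in the report.
    return {"user_id": user_id}
```

The same decorator would apply to `build_unified_profile` and `simulate_response`, so the gate fails closed when the variable is absent.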

Intent-Code Divergence

High

Confidence: 99%
Finding: The purge_by_source routine claims to remove all artifacts derived from a source memory, but it queries a source_ids column on tables that do not define that column. This causes purge failures to be silently swallowed, leaving derived entities, relations, and encyclopedia content behind and undermining deletion/compliance guarantees for sensitive data.
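
A purge routine can verify the schema up front and fail loudly instead of swallowing the error. The sketch below assumes a SQLite store and a single-valued column; the report does not state the database engine or how `source_ids` is encoded, so the `column` parameter, the table names, and the equality match are all illustrative.

```python
import sqlite3


def table_has_column(conn, table, column):
    """Check the schema via PRAGMA table_info (row[1] is the column name)."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def purge_by_source(conn, source_id, tables, column="source_id"):
    """Delete derived rows for a source; raise if any table cannot be purged.

    `tables` must come from a fixed allowlist, never from user input,
    because table names are interpolated into the SQL text.
    """
    missing = [t for t in tables if not table_has_column(conn, t, column)]
    if missing:
        raise RuntimeError(f"cannot purge, no {column} column on: {missing}")
    deleted = 0
    with conn:  # commit on success, roll back on error
        for t in tables:
            cur = conn.execute(f"DELETE FROM {t} WHERE {column} = ?", (source_id,))
            deleted += cur.rowcount
    return deleted
```

Raising before any delete means a schema mismatch surfaces as a visible failure rather than a partial, silently incomplete purge.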

Description-Behavior Mismatch

Medium

Confidence: 86%
Finding: The module automatically performs external network access to Hugging Face or a mirror to resolve and download models, even though it is framed as a memory component. In a memory system, stored content may be sensitive, and unexpected network behavior increases supply-chain, privacy, and deployment-boundary risk, especially in air-gapped or regulated environments.

Description-Behavior Mismatch

High

Confidence: 96%
Finding: This code can send memory content and queries to third-party embedding providers, which is a genuine data-exposure risk because memory stores often contain sensitive user data, secrets, or personal information. The skill context makes this more dangerous: an 'agent memory' component is likely to process long-lived, high-sensitivity content, so off-box transmission materially changes the trust boundary.

Description-Behavior Mismatch

Medium

Confidence: 84%
Finding: This engine aggregates personality, style, emotion, cognitive traits, narrative, and values into a unified self-profile, which goes beyond a narrow memory retrieval role. In a memory system context, expanding into behavioral profiling increases collection and inference of sensitive user attributes, broadening privacy risk and the blast radius of misuse or overreach.

Context-Inappropriate Capability

High

Confidence: 95%
Finding: The code infers personality and psychological traits from stored conversations using analyzers and fallback heuristics, including Big Five-like attributes and other behavioral signals. Inferring sensitive mental or behavioral characteristics from historical messages can expose highly sensitive profile data without necessity for core memory functions, creating substantial privacy and misuse risk if accessed, logged, or acted upon.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The policy advertises `require_confirmation` as a privacy control, but `_check_policy()` does not enforce any confirmation workflow: when confirmation is required, the branch is a bare `pass`. In a cross-agent memory-sharing system, this can cause knowledge to be disclosed to peer agents without any actual approval step, undermining operator expectations and privacy controls.
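
An enforcing version of the check would fail closed when no approval is obtained. In this sketch only `require_confirmation` is taken from the finding; `check_policy`, the `confirm` callback, and the dict-based policy are hypothetical.

```python
def check_policy(policy, request, confirm):
    """Return True only when the share is permitted.

    confirm: callable that asks an operator (or other approver) about
    `request` and returns a truthy value on approval.
    """
    if policy.get("require_confirmation"):
        # Fail closed: a missing or falsy answer blocks the share.
        return bool(confirm(request))
    return True
```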

Description-Behavior Mismatch

Medium

Confidence: 83%
Finding: `resolve_conflict()` mutates local memory `quality_score` values based on federated conflict outcomes, which goes beyond passive federation/search and lets remote interactions influence local trust state. If peer data is inaccurate or adversarial, this can silently poison ranking and future recall behavior, degrading integrity of the memory system.

Intent-Code Divergence

Medium

Confidence: 99%
Finding: `batch_remember()` exposes a `skip_filter` parameter but ignores it and always calls `remember(..., skip_filter=True, skip_dedup=True)`. This creates a silent policy bypass for bulk ingestion, allowing unfiltered and duplicate content to be stored even when callers expect normal safeguards, which is especially risky in a memory system where stored content can later influence agent behavior.
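
The fix is to pass the caller's flags through instead of hard-coding them. The sketch below reuses the names `remember`, `batch_remember`, `skip_filter`, and `skip_dedup` from the finding, but the list-based store and the keyword filter are stand-ins; the skill's real filtering and deduplication logic is not shown in the report.

```python
def remember(store, text, skip_filter=False, skip_dedup=False):
    """Store text unless the (stand-in) filter or dedup check rejects it."""
    if not skip_filter and "password" in text.lower():
        return False
    if not skip_dedup and text in store:
        return False
    store.append(text)
    return True


def batch_remember(store, texts, skip_filter=False, skip_dedup=False):
    """Bulk ingest that honors the caller's flags instead of forcing True."""
    return [
        remember(store, t, skip_filter=skip_filter, skip_dedup=skip_dedup)
        for t in texts
    ]
```

With the defaults left at `False`, bulk ingestion gets the same safeguards as single writes unless a caller explicitly opts out.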

VirusTotal

61 of 61 vendors reported this skill as clean.
