CXM: Neural Memory for Agents

Security checks across malware telemetry and agentic risk

Overview

This skill mostly matches a local code-search purpose, but it also reads sensitive local context, persists indexes/prompts, has broad install dependencies, and can modify files with weak defaults.

Install only in an isolated environment. Prefer the narrower pyproject dependencies over the supplied requirements file unless the latter has been audited. Do not run the ask, ctx, watch, or patching modes unless you accept local history and session collection and persistent storage. Before allowing file modifications, configure .cxm.yaml with narrow allowed_write_paths and ask_first mode.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (63)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
console.print(f"📦 [cyan]Cloning repository:[/cyan] {repo_url}...")
try:
    # depth 1 for performance, list for security
    subprocess.run(["git", "clone", "--depth", "1", repo_url, str(cache_dir)], check=True, capture_output=True, text=True)
    console.print(f"✓ Repository successfully loaded into the cache.")
except Exception as e:
    console.print(f"[bold red]Error while cloning:[/bold red] {e}")
Confidence
80% confidence
Finding
subprocess.run(["git", "clone", "--depth", "1", repo_url, str(cache_dir)], check=True, capture_output=True, text=True)
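
Passing the arguments as a list avoids shell injection, but the call still hands an unvalidated repo_url to git. The following is a minimal sketch of one way to narrow that input, assuming an HTTPS-only policy and a host allowlist. ALLOWED_HOSTS, validate_repo_url, and build_clone_command are hypothetical names for illustration and are not part of the scanned skill.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; the scanned skill does not define one.
ALLOWED_HOSTS = {"github.com"}


def validate_repo_url(repo_url: str) -> str:
    """Return repo_url unchanged if it is a plain HTTPS URL on an allowed host."""
    if repo_url.startswith("-"):
        # A leading dash could be interpreted by git as an option.
        raise ValueError("repo_url must not look like a command-line option")
    parsed = urlparse(repo_url)
    if parsed.scheme != "https":
        raise ValueError("only https:// repository URLs are accepted")
    if parsed.username or parsed.password:
        raise ValueError("repository URLs must not embed credentials")
    if parsed.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"host not in allowlist: {parsed.hostname!r}")
    return repo_url


def build_clone_command(repo_url: str, cache_dir: str) -> list:
    """Build the argument list for a shallow clone of a validated URL."""
    # "--" ends option parsing, so git treats what follows as positional arguments.
    return ["git", "clone", "--depth", "1", "--",
            validate_repo_url(repo_url), str(cache_dir)]
```

The returned list could then be passed to subprocess.run in place of the inline list shown in the finding.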

Lp3

Medium
Category
MCP Least Privilege
Confidence
97% confidence
Finding
The skill documentation describes capabilities that include recursive file reads, shell command execution, network model downloads, and potential file patching, yet no explicit permissions are declared. This creates a transparency and governance gap: an agent or user may invoke a skill with broader access than expected, increasing the chance of unintended data exposure, unsafe writes, or network egress.

Description-Behavior Mismatch

Medium
Confidence
91% confidence
Finding
The documentation states that the CLI can clone or update GitHub repositories via `--github`, which expands the skill from local codebase understanding into remote content acquisition. That broader capability increases the attack surface and can cause users or agents to pull and index untrusted code contrary to the skill's stated scope.

Description-Behavior Mismatch

Medium
Confidence
97% confidence
Finding
The docs say `cxm ask` automatically gathers Git status, recent edits, and session history from Gemini CLI and Claude Code CLI. Automatic collection of external AI session data is privacy-sensitive and can expose secrets, prompts, code, or operational context beyond what users expect from a codebase-memory skill.

Description-Behavior Mismatch

Low
Confidence
88% confidence
Finding
Automatically copying generated prompts to the clipboard can leak sensitive repository or session-derived content into other applications, clipboard managers, or remote desktop tooling. While lower severity than direct exfiltration, it is an undocumented data handling behavior outside the expected scope.

Context-Inappropriate Capability

Medium
Confidence
95% confidence
Finding
`cxm ctx` is documented as exposing system load, Git state, recent file activity, and external AI session snippets. This is a substantial privacy and information disclosure risk because it reveals operational state and potentially sensitive content unrelated to the narrow memory/search purpose advertised by the skill.

Context-Inappropriate Capability

Low
Confidence
90% confidence
Finding
The CLI automatically copies the enhanced prompt to the system clipboard, which is outside the stated core purpose of codebase understanding and semantic search. Clipboard contents are globally accessible to other local applications and may expose sensitive code, prompts, secrets, or internal context without explicit user consent.

Context-Inappropriate Capability

Medium
Confidence
78% confidence
Finding
The watch command starts a background watcher/daemon that continuously monitors and updates project context, expanding the tool from on-demand analysis into persistent surveillance-like behavior. In a skill meant for code understanding and retrieval, a long-running background process increases privacy, persistence, and unintended data collection risks, especially if users are not clearly informed.

Description-Behavior Mismatch

Medium
Confidence
88% confidence
Finding
The skill writes arbitrary project context to a persistent local JSON file, creating data retention beyond a transient analysis/search function. In an agent setting, this can unintentionally store sensitive repository content, prompts, secrets, or user-provided data on disk where it may later be exposed or reused outside the user's expectations.

Description-Behavior Mismatch

Medium
Confidence
97% confidence
Finding
The enhancer automatically gathers broad system context via `gather_all()`, and the docstring explicitly mentions git/file/shell context. That exceeds the skill's stated purpose of codebase understanding and semantic retrieval, creating unnecessary access to potentially sensitive local environment data that may then influence prompts or be exposed downstream.

Context-Inappropriate Capability

Medium
Confidence
96% confidence
Finding
Collection of shell context is particularly risky because shell state can contain command history, paths, environment details, tokens, and operational metadata unrelated to semantic code understanding. In this skill context, shell collection is not justified by the advertised architecture/search function, so it materially increases privacy and secret-exposure risk.

Description-Behavior Mismatch

High
Confidence
93% confidence
Finding
The module assembles a powerful code-generation prompt format with system instructions, patch-application syntax, and embedded context variables, which does not align with the declared skill purpose of codebase architecture analysis and semantic memory ingestion. In a skill ecosystem, this mismatch is dangerous because it can covertly repurpose the skill into a prompt-compilation/code-generation primitive that may exfiltrate context or induce unsafe downstream actions under the guise of a benign analysis tool.

Description-Behavior Mismatch

High
Confidence
95% confidence
Finding
This module enables parsing LLM output and writing arbitrary file contents to disk, which materially exceeds the stated skill purpose of architecture understanding, semantic search, and context ingestion. In an agent setting, this creates a dangerous execution path where untrusted model output can directly modify the workspace, enabling code injection, persistence, or sabotage.
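
One common mitigation for this kind of write path is to show a diff and require explicit confirmation before anything touches disk. The sketch below illustrates that idea under stated assumptions: preview_patch, apply_patch, and the confirm callback are hypothetical and do not describe the skill's actual patcher.

```python
import difflib
from pathlib import Path


def preview_patch(path, new_content: str) -> str:
    """Return a unified diff between the file on disk and the proposed content."""
    p = Path(path)
    old = p.read_text(encoding="utf-8") if p.exists() else ""
    return "".join(difflib.unified_diff(
        old.splitlines(keepends=True),
        new_content.splitlines(keepends=True),
        fromfile=f"a/{p.name}",
        tofile=f"b/{p.name}",
    ))


def apply_patch(path, new_content: str, confirm) -> bool:
    """Write new_content only if confirm(diff) returns True.

    confirm is a caller-supplied callable (for example, an interactive
    yes/no prompt) that receives the diff text.
    """
    diff = preview_patch(path, new_content)
    if not diff:
        return False  # proposed content is identical; nothing to write
    if not confirm(diff):
        return False  # user declined; leave the file untouched
    Path(path).write_text(new_content, encoding="utf-8")
    return True
```

With this shape, model output never reaches the filesystem without a human or policy layer having seen the exact change.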

Intent-Code Divergence

Medium
Confidence
97% confidence
Finding
The guardrail model is fail-open: when allowed_write_paths is unset or empty, is_path_allowed_for_write returns True and permits writing anywhere. That contradicts the claim of safe guardrails and can let an attacker or compromised LLM output overwrite arbitrary files in the current environment, especially dangerous in tools that ingest adversarial text and model-generated patches.
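
For contrast, a fail-closed check treats an unset or empty allowlist as "nothing is writable". The sketch below reuses the function name from the finding, but its signature and the project_root parameter are assumptions made for illustration, not the skill's actual code.

```python
from pathlib import Path


def is_path_allowed_for_write(target, allowed_write_paths, project_root) -> bool:
    """Fail-closed variant: an unset or empty allowlist permits nothing."""
    if not allowed_write_paths:
        return False
    root = Path(project_root).resolve()
    # Resolving collapses ".." segments, so traversal out of an allowed
    # directory is detected; an absolute target replaces root entirely.
    resolved = (root / target).resolve()
    for allowed in allowed_write_paths:
        base = (root / allowed).resolve()
        if resolved == base or base in resolved.parents:
            return True
    return False
```

The only behavioral difference from the fail-open logic described above is the first branch: a missing configuration now denies the write instead of allowing it.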

Intent-Code Divergence

Medium
Confidence
98% confidence
Finding
The class advertises automatic secret masking, but search() and get_document() call _get_content(), which rereads full file contents from disk and returns them unmasked. In a memory/indexing skill intended to ingest broad codebases and documentation, this creates a direct secret-exposure path that defeats the stated protection model and can leak credentials from indexed repositories.
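
A way to close this kind of gap is to apply masking at the single point where content is read back, so every caller receives redacted text. The following sketch assumes two illustrative regex patterns and a hypothetical wrapper get_content_masked; real secret detection would need a much broader rule set, and none of these names come from the skill.

```python
import re

# Illustrative patterns only; not an exhaustive secret detector.
ASSIGNMENT = re.compile(r"(?i)(api[_-]?key|secret|token|password)(\s*[:=]\s*)\S+")
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def mask_secrets(text: str) -> str:
    """Replace values that look like credentials with a placeholder."""
    text = ASSIGNMENT.sub(r"\1\2[REDACTED]", text)
    return AWS_KEY_ID.sub("[REDACTED]", text)


def get_content_masked(read_raw, doc_id) -> str:
    """Wrap a raw reader so search and document retrieval share one masked path.

    read_raw is a caller-supplied callable that returns the unmasked file
    content for doc_id (hypothetical stand-in for the skill's own reader).
    """
    return mask_secrets(read_raw(doc_id))
```

Routing both search() and get_document() through one masked reader removes the possibility that a second read path bypasses the masking applied at index time.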

Description-Behavior Mismatch

Medium
Confidence
91% confidence
Finding
The watcher component accepts a remote GitHub URL and clones arbitrary repositories before monitoring them, which expands its authority from local filesystem watching to network retrieval and external content ingestion. In the context of a memory/indexing skill, this increases attack surface by allowing untrusted remote content to be pulled into the workspace and indexed without clear trust boundaries, validation, or restriction.

Context-Inappropriate Capability

Medium
Confidence
88% confidence
Finding
A filesystem watcher is expected to observe local changes, but this implementation also performs remote repository cloning, giving the component an unjustified network-capable side effect. That mismatch makes the skill more dangerous because an apparently local architecture-analysis tool can fetch attacker-controlled code or documents and feed them into downstream indexing and context systems.

Description-Behavior Mismatch

Medium
Confidence
93% confidence
Finding
The GUI directly invokes patcher.parse_and_apply(mock_output) from a button click, causing a file write action from an interface whose declared purpose is architecture/search/memory assistance. This expands the skill into code-modification behavior without clear trust boundaries, preview validation, or strong user confirmation, increasing the risk of unintended or deceptive code changes if the patch source later becomes dynamic.

Description-Behavior Mismatch

Medium
Confidence
84% confidence
Finding
This workflow performs planning and secure-prompt assembly for refactoring/code-change orchestration, which is materially broader than the stated semantic search and context-memory role of the skill. That mismatch is security-relevant because users and higher-level policy may trust the skill as read-oriented while it prepares write-oriented actions that can later modify the codebase.

Description-Behavior Mismatch

High
Confidence
98% confidence
Finding
The implementation materially exceeds the declared skill purpose. A skill advertised for codebase understanding and semantic/context memory actually orchestrates code generation, audit gating, and file patch application, creating a capability mismatch that can mislead users and downstream systems into granting broader trust than warranted. In this context, the deceptive scope is dangerous because the skill can modify project files under the guise of read-oriented analysis functionality.

Context-Inappropriate Capability

High
Confidence
96% confidence
Finding
This code applies shadow-scaffolding patches to project files even though the skill's stated role is context memory and architecture understanding. That mismatch increases the chance that callers invoke the skill in environments where file modification was not expected or authorized, enabling unauthorized repository changes. The risk is amplified because the patch path and contents come from generated text processed by the patcher.

Context-Inappropriate Capability

High
Confidence
99% confidence
Finding
The orchestration generates code from an LLM prompt and then applies resulting file patches if an internal audit passes, despite the skill being presented as a context-memory/code-understanding tool. This is dangerous because generated output is treated as executable repository modifications, which can introduce backdoors, logic flaws, or destructive changes with only a weak programmatic gate. The surrounding skill context makes it more dangerous, not less, because users are less likely to expect or scrutinize write operations from a supposedly analytical skill.

Context-Inappropriate Capability

Medium
Confidence
95% confidence
Finding
The code harvests recent prompts from external AI CLI session state and inserts them into the assembled prompt based only on weak keyword overlap. Those prior prompts may contain sensitive user data, credentials, proprietary code requests, or unrelated task context, causing unintended cross-session/context disclosure to the model and to downstream consumers of the assembled prompt.

Description-Behavior Mismatch

Medium
Confidence
90% confidence
Finding
The exported skill schema expands the tool's advertised scope beyond the metadata description by exposing a high-level autonomous RAG/context harvesting capability and describing the package as providing broad RAG support. In an agent ecosystem, this kind of capability drift is security-relevant because planners and policy layers often rely on manifest-described scope to decide when a skill may be invoked; broader exported behavior can cause over-collection of repository data, unintended autonomy, or bypass of least-privilege assumptions.

Context-Inappropriate Capability

High
Confidence
99% confidence
Finding
The code reads ~/.bash_history and exposes recent commands, username, and shell details even though the skill is described as codebase architecture and semantic search tooling. Shell history often contains secrets, internal hostnames, database commands, tokens, and unrelated personal activity, so collecting it creates unnecessary sensitive-data exposure.
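
If shell context is collected at all, an opt-in default limits the exposure described here. The sketch below is a hypothetical collector, not the skill's implementation: collect_shell_history and its enabled, history_path, and max_lines parameters are assumed names, and collection is disabled unless the caller explicitly turns it on.

```python
from pathlib import Path


def collect_shell_history(enabled: bool = False, history_path=None, max_lines: int = 20):
    """Return recent shell commands only when explicitly enabled; otherwise []."""
    if not enabled or max_lines <= 0:
        return []  # default: collect nothing
    path = Path(history_path) if history_path else Path.home() / ".bash_history"
    if not path.is_file():
        return []
    lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
    return lines[-max_lines:]
```

Even when enabled, the returned lines may still contain secrets, so a caller would likely want to pass them through a redaction step before including them in any prompt.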

VirusTotal

VirusTotal findings are pending for this skill version.
