memory_baidu_embedding_db

Security checks across malware telemetry and agentic risk

Overview

This is a genuine semantic memory skill, but it claims local-only privacy while sending memory text and search queries to a Baidu embedding service. It also includes broad local maintenance and disable instructions.

Review before installing. Use it only if you are comfortable with memory content and search queries being sent to Baidu for embeddings. Inspect or pin the external baidu-vector-db helper first, and avoid storing secrets or sensitive personal data. Do not run the maintenance or disable commands unless you understand the changes they make to /root/clawd and the Clawdbot extension.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (40)

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The document states that all data is stored locally and that there is no risk of external data leakage, but the same guide explicitly depends on Baidu embedding APIs and network connectivity. This creates a misleading security claim that could cause operators to send sensitive content to a third-party service under false assumptions about data locality and exposure.

Intent-Code Divergence

High

Confidence: 99%
Finding: The README states that memories never leave the system, yet the documented design relies on Baidu's remote embedding API, which necessarily transmits memory/query text off-host. This creates a materially false security/privacy claim that can cause operators to store sensitive conversation data under the mistaken belief that it remains local.

Intent-Code Divergence

High

Confidence: 99%
Finding: The README claims 'Zero Data Leakage' and that all processing happens locally, but elsewhere requires Baidu API credentials and API calls for embeddings. This contradiction misrepresents the trust boundary and may lead users to process regulated or confidential data assuming no third-party exposure exists.

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The document recommends verifying configuration by printing `BAIDU_API_STRING` and `BAIDU_SECRET_KEY` directly to the terminal. Exposing secrets on screen can leak them through screen sharing, terminal logging, scrollback capture, shoulder surfing, or audit tooling, and it also contradicts the document's own guidance not to output sensitive information.
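One way to verify the configuration without exposing secrets is to report only whether each variable is set. The following is a minimal sketch, not the skill's own code; the function name `credential_status` is hypothetical, and the variable names are the ones cited in the finding above.

```python
import os
from typing import Mapping, Sequence

# Variable names as cited in the finding above.
REQUIRED_VARS = ("BAIDU_API_STRING", "BAIDU_SECRET_KEY")


def credential_status(
    env: Mapping[str, str] = os.environ,
    names: Sequence[str] = REQUIRED_VARS,
) -> dict[str, str]:
    """Report 'set' or 'missing' for each name; the values are never returned."""
    return {
        name: "set" if env.get(name, "").strip() else "missing"
        for name in names
    }


if __name__ == "__main__":
    # Prints only the variable names and their status, never the secrets.
    for name, status in credential_status().items():
        print(f"{name}: {status}")
```

Because the output contains no secret material, it is safe to paste into a support transcript or leave in terminal scrollback.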

Intent-Code Divergence

Medium

Confidence: 98%
Finding: When credentials are missing, the script prints an export example that interpolates the live BAIDU_SECRET_KEY value into terminal output. If the variable is set incorrectly or logs are captured by CI, shell history, or support tooling, the secret can be exposed to unintended parties. In this skill context, handling real API credentials makes accidental disclosure more dangerous than a purely local config mistake.

Intent-Code Divergence

Medium

Confidence: 93%
Finding: This script presents itself as a health check, but it also mutates system state by changing file permissions and creating directories later in execution. That mismatch is dangerous because operators may run a supposedly read-only diagnostic in privileged contexts and unintentionally alter filesystem state, which can mask problems, break expected controls, or create opportunities for misuse.
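A diagnostic that matches its "health check" label would only observe state. The sketch below is illustrative and assumes nothing about the skill's actual script; `inspect_path` and `PathReport` are hypothetical names. It reports whether a path exists and what its permission bits are, without creating directories or calling `chmod`.

```python
import os
import stat
from typing import Optional, TypedDict


class PathReport(TypedDict):
    exists: bool
    is_dir: bool
    mode: Optional[str]


def inspect_path(path: str) -> PathReport:
    """Describe a path without creating it or changing its permissions."""
    try:
        info = os.stat(path)
    except FileNotFoundError:
        # A missing path is reported, not silently created.
        return {"exists": False, "is_dir": False, "mode": None}
    return {
        "exists": True,
        "is_dir": stat.S_ISDIR(info.st_mode),
        "mode": oct(stat.S_IMODE(info.st_mode)),
    }
```

Any repair step (creating a directory, tightening permissions) can then live in a separate, clearly named command that the operator runs deliberately.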

Intent-Code Divergence

Medium

Confidence: 90%
Finding: The script concludes the system is 'safe to use' even though its verification process has side effects: it inserts persistent test data into the memory database. This can mislead operators into thinking the check is read-only and harmless, while it actually mutates application state and may pollute memory/search results.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The document makes an absolute safety claim that all data is stored locally and that there is no external data leakage risk, but earlier sections explicitly require Baidu embedding services and API connectivity. This mismatch can mislead users into sending sensitive content to a third-party service under false privacy assumptions, creating real confidentiality and compliance risk.

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The configuration states `no_external_upload: true`, but it also enables a Baidu embedding component that necessarily sends content to an external service to generate embeddings. This creates a misleading security posture and can cause sensitive memory contents to be transmitted off-host despite the policy claiming otherwise.
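A contradiction like this can be caught mechanically before deployment. The sketch below assumes a configuration already loaded into a dictionary. Only `no_external_upload` comes from the finding; the `embedding`, `enabled`, and `provider` keys and the function name are hypothetical stand-ins for however the skill actually names its embedding component.

```python
from typing import Any, Mapping

# Hypothetical list of provider names that imply a network call.
REMOTE_EMBEDDING_PROVIDERS = {"baidu"}


def find_policy_conflicts(config: Mapping[str, Any]) -> list[str]:
    """Return readable conflicts between privacy flags and embedding settings."""
    conflicts: list[str] = []
    embedding = config.get("embedding", {})  # hypothetical key
    provider = str(embedding.get("provider", "")).lower()  # hypothetical key
    is_remote = bool(embedding.get("enabled", False)) and (
        provider in REMOTE_EMBEDDING_PROVIDERS
    )
    if config.get("no_external_upload") and is_remote:
        conflicts.append(
            f"no_external_upload is true, but the '{provider}' embedding "
            "provider sends text off-host"
        )
    return conflicts
```

An empty list means the two settings are at least consistent with each other; it does not prove that no data leaves the host.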

Intent-Code Divergence

Medium

Confidence: 87%
Finding: The script’s description understates its behavior: beyond 'loading' a memory system, it persists data to a dated file under /root/clawd/memory and executes an external bootstrap script. In an agent-skill context, hidden persistence and implicit execution are security-relevant because they create side effects and trust boundaries not disclosed by the stated purpose, increasing the risk of covert data retention or unintended code execution.

Intent-Code Divergence

Medium

Confidence: 93%
Finding: The script prints memory search results and explicitly echoes stored memory content to stdout, then claims there is 'zero data leakage risk.' That overstates security guarantees and can mislead operators into treating logs or console output as harmless even when they may contain sensitive user memories or metadata.

Missing User Warnings

Medium

Confidence: 92%
Finding: The API reference requires Baidu credentials and describes semantic memory storage, which strongly implies user-provided content is transmitted to an external embedding service. The documentation does not warn developers that memory content may leave the local system, creating a real privacy and compliance risk if sensitive user data is stored without notice, consent, or minimization.

Missing User Warnings

Medium

Confidence: 96%
Finding: The examples normalize storing personal preference data about a user without any accompanying caution about sensitivity, consent, or downstream transmission to third parties for embedding. In a memory skill, examples heavily influence implementation behavior, so this omission can lead developers to collect and persist personal data in ways that users do not expect.

Missing User Warnings

Medium

Confidence: 97%
Finding: The complete example stores professional background, communication preferences, and lifestyle habits—personal profile data that could meaningfully identify or characterize a user—without any privacy disclaimer or consent guidance. Because this is presented as a canonical usage pattern for a memory system backed by external embeddings, it increases the likelihood of widespread unsafe handling of personal data.

Missing User Warnings

Medium

Confidence: 87%
Finding: The guide describes memory add/search/update/delete operations in a system that also uses an external embedding service, but it does not warn that stored or queried content may contain sensitive information or that deletion may not fully remove derived artifacts such as cached vectors, notes history, or remote provider logs. This omission can lead users to mishandle confidential data and overestimate the effect of deletion operations.

Missing User Warnings

Medium

Confidence: 87%
Finding: The quickstart explicitly demonstrates storing and semantically searching user-related memories, including preferences, conversation content, and identifiers, but provides no privacy notice, retention guidance, consent requirements, or data-minimization advice. In a memory skill, this omission is security-relevant because it encourages collection of potentially sensitive personal data and normalization of persistence/search over user data without safeguards.
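As one example of data minimization, text can be scrubbed before it is stored or sent for embedding. This is a rough sketch under simple assumptions: the `minimize` function and both patterns are hypothetical heuristics, not part of the skill, and they will miss many kinds of sensitive data (names, addresses, short secrets).

```python
import re

# Heuristic email matcher; deliberately simple, not RFC-complete.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Heuristic: treat long unbroken alphanumeric runs as possible keys or tokens.
TOKEN = re.compile(r"\b[A-Za-z0-9_]{32,}\b")


def minimize(text: str) -> str:
    """Replace email addresses and token-like strings with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return TOKEN.sub("[TOKEN]", text)
```

A filter of this kind reduces accidental exposure but does not replace consent, retention limits, or a clear notice that content is processed remotely.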

Missing User Warnings

Medium

Confidence: 96%
Finding: Because the skill sends memory/query text to Baidu's embedding API, omitting a clear warning about remote processing is a privacy and transparency failure. Users may unknowingly submit personal, proprietary, or regulated data to an external provider based on the README's contrary assurances.

Missing User Warnings

High

Confidence: 98%
Finding: The Security section explicitly says memories never leave the system despite the design's dependence on a remote embedding service. Presenting an external-processing system as fully local increases the likelihood of unsafe deployment decisions and inappropriate handling of confidential data.

Missing User Warnings

Medium

Confidence: 91%
Finding: The document includes a privileged shell command that renames a system extension path under /root without any warning, confirmation step, rollback guidance, or validation checks. In an agent-skill context, operational instructions like this are dangerous because they can lead to unauthorized or accidental service disruption, especially if executed automatically or copied blindly by an operator.
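The missing safeguards (validation, confirmation, rollback guidance) can be illustrated with a small wrapper around the rename. This is a sketch only; `guarded_rename` is a hypothetical helper, and the real extension path is left to the operator to supply.

```python
import os


def guarded_rename(src: str, dst: str, confirm: bool = False) -> str:
    """Rename src to dst only after validation; return a status message."""
    if not os.path.exists(src):
        raise FileNotFoundError(f"nothing to rename: {src}")
    if os.path.exists(dst):
        # Refuse to clobber an existing path, such as an earlier backup.
        raise FileExistsError(f"refusing to overwrite: {dst}")
    if not confirm:
        # Default to a dry run so nothing changes without an explicit opt-in.
        return f"dry run: would rename {src} -> {dst}"
    os.rename(src, dst)
    return f"renamed; to roll back, rename {dst} -> {src}"
```

Calling it without `confirm=True` shows what would happen, which gives an operator or an agent a chance to stop before the extension is disabled.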

Missing User Warnings

Medium

Confidence: 89%
Finding: The markdown instructs users to export API credentials directly in the shell without warning about secret exposure risks such as shell history leakage, process environment disclosure, accidental logging, or reuse in insecure contexts. In a security-sensitive skill, normalized handling of raw secrets increases the chance of credential compromise and downstream abuse of the external API account.

Missing User Warnings

Medium

Confidence: 97%
Finding: The guide instructs users to display live API credentials for verification without noting the exposure risk. Even if this is intended as troubleshooting guidance, it can unnecessarily disclose credentials in terminals, recordings, remote sessions, support transcripts, or operational logging contexts.

Missing User Warnings

Medium

Confidence: 97%
Finding: The documentation states that memories never leave the system, but the skill also requires Baidu API calls to generate embeddings. In practice, memory content or user queries may be transmitted to Baidu, so this claim is misleading and can cause users to unknowingly expose sensitive conversation data to a third party.

Missing User Warnings

Medium

Confidence: 87%
Finding: The tutorial explicitly instructs developers to store user facts and conversation-derived preferences, including personal details and habits, but does not warn about consent, retention limits, access control, or privacy obligations. In a memory system for assistants/chatbots, this omission can lead to unnecessary collection and long-term storage of sensitive personal data, increasing privacy and compliance risk.

Missing User Warnings

Medium

Confidence: 95%
Finding: The tutorial tells users to append API credentials to shell startup files, which creates persistent plaintext credential storage in a commonly read and backed-up location. While not malware, this increases exposure through local compromise, accidental sharing of dotfiles, terminal history/workstation backups, or multi-user systems.
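A lower-exposure alternative is a dedicated credentials file readable only by its owner. The sketch below assumes a POSIX system (`os.fchmod` is not available on every platform); `write_private_file` is a hypothetical helper, and the file location is the operator's choice. The contents remain plaintext, so this narrows exposure but does not remove it.

```python
import os


def write_private_file(path: str, contents: str) -> None:
    """Create or overwrite path so that only the owner can read or write it."""
    # Mode 0o600 applies when the file is created by this call.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        # Tighten permissions as well in case the file already existed.
        os.fchmod(handle.fileno(), 0o600)
        handle.write(contents)
```

Keeping the secrets out of shell startup files also means they are less likely to travel with shared or backed-up dotfiles.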

Missing User Warnings

Medium

Confidence: 87%
Finding: The guide promotes a memory system backed by Baidu Embedding API calls but does not clearly disclose that stored or queried memory content may be sent to an external third-party service for embedding generation. In a memory/agent context, that content can include sensitive user data, so lack of an explicit data-transmission warning can lead to unintentional privacy exposure and policy/compliance violations.

VirusTotal

48/48 vendors flagged this skill as clean.
