Memory Semantic Search
v1.0.0 · Semantic search over workspace markdown files using an embedding API + SQLite vector store. Use when: (1) searching workspace notes/memory by meaning rather tha...
Security Scan
Capability signals
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.
OpenClaw
Suspicious (medium confidence)
Purpose & Capability
The name/description (semantic search over workspace Markdown) matches the included code and instructions: index.py scans .md files, chunks them, calls an OpenAI-compatible embeddings endpoint, and stores vectors in SQLite for search. However, the registry metadata lists no required environment variables while the SKILL.md and scripts clearly expect EMBEDDING_API_KEY (and optionally EMBEDDING_API_BASE / EMBEDDING_MODEL). That omission in the metadata is an inconsistency.
Instruction Scope
SKILL.md and the scripts limit actions to scanning .md files in a provided workspace, chunking, calling an embeddings API, storing embeddings in a local SQLite DB, and performing local cosine-similarity search. There are no instructions to read unrelated system files or other credentials. NOTE: the runtime does transmit Markdown content to the configured embedding API endpoint, which is expected for this purpose but important to be aware of.
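For reference, the local cosine-similarity step described above needs nothing beyond the standard library; a minimal sketch (illustrative, not the skill's actual code):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors, pure stdlib."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as having no similarity
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction -> 1.0
```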
Install Mechanism
This skill ships as instructions plus Python scripts, with no install spec; nothing is downloaded at install time, which minimizes install-time risk. The code uses only the Python standard library and runs locally.
Credentials
The scripts require EMBEDDING_API_KEY (and optionally EMBEDDING_API_BASE and EMBEDDING_MODEL). The registry metadata claims no required env vars — that is inconsistent. Also, by design the skill sends full Markdown chunks to the embedding API (default EMBEDDING_API_BASE is https://api.openai.com/v1). Sending sensitive notes to an external provider can expose data (some embedding providers log/retain inputs). The requested credential (API key) is proportional to the feature, but the lack of declared required env vars in the registry and the default external endpoint raise privacy/visibility concerns.
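To make the data-flow concern concrete, this is roughly the shape of an OpenAI-compatible embeddings request. The helper name and exact payload are illustrative (not the skill's code), but they show that the full chunk text travels in the request body:

```python
import json
import urllib.request

def build_embedding_request(chunks, api_base, api_key,
                            model="text-embedding-3-small"):
    """Build (but do not send) an OpenAI-compatible /embeddings request.

    Hypothetical helper for illustration: the raw Markdown chunk text is
    serialized directly into the request body sent to the provider.
    """
    payload = json.dumps({"model": model, "input": chunks}).encode("utf-8")
    return urllib.request.Request(
        f"{api_base}/embeddings",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_embedding_request(["# My notes\nquarterly plans"],
                              "https://api.openai.com/v1", "sk-xxx")
print(req.full_url)
```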
Persistence & Privilege
The skill does not request elevated privileges and sets always=false. By default it writes a SQLite DB file (memory_search.sqlite) to the skill's parent directory unless a custom --db path is provided; this is normal, but be aware of where indexed content is stored. It does not modify other skills or system-wide configs.
What to consider before installing
This skill appears to implement a legitimate local Markdown semantic-search tool, but take these precautions before installing/using it:
- Expect to provide an embeddings API key (EMBEDDING_API_KEY); the registry metadata omitted this — verify environment requirements before trusting the package.
- By default it will send the full text of your Markdown chunks to the configured embedding endpoint (default: https://api.openai.com/v1). Only use a provider you trust, or configure a self-hosted/enterprise-compatible embedding endpoint if your notes are sensitive.
- Avoid indexing secrets or credentials. Use a workspace path that excludes sensitive files, or add exclusions before running index.py.
- Consider setting --db to a controlled path (not a global skill directory) and protect that SQLite file appropriately.
- Review/verify EMBEDDING_API_BASE if you need embeddings to stay in your environment (e.g., Ollama, internal proxy). If you need privacy guarantees, confirm the embedding provider’s retention policy.
- The main technical inconsistency is the missing declared env vars in the registry; if this skill will run in an automated agent environment, confirm the platform will surface the required API key prompt before the skill runs.
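As a concrete example of the "avoid indexing secrets" advice, a hypothetical pre-index screen might look like the following. The patterns are illustrative and not exhaustive, and this code is not part of the skill:

```python
import re

# Hypothetical pre-index screen: refuse to index text that looks like it
# contains credentials. Patterns below are examples, not a complete list.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{16,}"),                 # OpenAI-style keys
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access key IDs
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private keys
]

def looks_sensitive(text):
    """Return True if any secret-like pattern appears in the text."""
    return any(p.search(text) for p in SECRET_PATTERNS)

print(looks_sensitive("api key: sk-abcdefghijklmnop1234"))  # True
print(looks_sensitive("Meeting notes for Tuesday"))         # False
```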
If you want me to, I can: (1) point out the exact lines that transmit data to the network, (2) suggest a small patch to redact or exclude sensitive files before indexing, or (3) show how to change the default DB path and embedding base in the code.
Like a lobster shell, security has layers: review code before you run it.
Memory Semantic Search
Standalone semantic search over workspace .md files. Uses an OpenAI-compatible embedding API and SQLite for vector storage. No external dependencies beyond Python 3 stdlib + the embedding API.
Setup
Set these environment variables (or pass as CLI args):
```shell
export EMBEDDING_API_KEY="sk-xxx"
export EMBEDDING_API_BASE="https://api.openai.com/v1"  # any OpenAI-compatible endpoint
export EMBEDDING_MODEL="text-embedding-3-small"        # optional, this is the default
```
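A plausible sketch of how such scripts resolve configuration, with CLI arguments taking precedence over the environment variables above (assumed behavior, not verified against the skill's code):

```python
import os

def resolve_config(cli_api_key=None, cli_api_base=None, cli_model=None):
    """Resolve settings: CLI args win over env vars, env vars over defaults.

    Sketch of the documented --api-key/--api-base/--model overrides;
    not the skill's actual code.
    """
    api_key = cli_api_key or os.environ.get("EMBEDDING_API_KEY")
    if not api_key:
        raise SystemExit("EMBEDDING_API_KEY is required")
    return {
        "api_key": api_key,
        "api_base": cli_api_base
            or os.environ.get("EMBEDDING_API_BASE", "https://api.openai.com/v1"),
        "model": cli_model
            or os.environ.get("EMBEDDING_MODEL", "text-embedding-3-small"),
    }

os.environ["EMBEDDING_API_KEY"] = "sk-xxx"   # demo only
os.environ.pop("EMBEDDING_API_BASE", None)   # ensure defaults apply in this demo
os.environ.pop("EMBEDDING_MODEL", None)
print(resolve_config(cli_model="custom-model"))
```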
Usage
Index workspace
```shell
python3 scripts/index.py /path/to/workspace
```
Options:
- `--force`: full reindex (clears existing data)
- `--db PATH`: custom SQLite path (default: `memory_search.sqlite` in the skill dir)
- `--api-base`, `--api-key`, `--model`: override the env vars
Incremental: only new/changed chunks are embedded. Deleted files are cleaned up automatically.
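One plausible scheme for this incremental behavior is content hashing: give each (path, chunk) pair a stable ID and re-embed only IDs not already in the DB. A sketch (assumed, not necessarily how index.py does it):

```python
import hashlib

def chunk_id(path, chunk_text):
    """Stable ID for a chunk: if it already exists in the store, the chunk
    is unchanged and need not be re-embedded. Illustrative scheme only."""
    h = hashlib.sha256()
    h.update(path.encode("utf-8"))
    h.update(b"\x00")  # separator so path/text boundaries can't collide
    h.update(chunk_text.encode("utf-8"))
    return h.hexdigest()

already_indexed = {chunk_id("notes.md", "old text")}
new_chunks = [("notes.md", "old text"), ("notes.md", "brand new text")]
to_embed = [(p, t) for p, t in new_chunks
            if chunk_id(p, t) not in already_indexed]
print(len(to_embed))  # only the changed chunk needs embedding
```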
Search
```shell
python3 scripts/search.py "your query here"
```
Options:
- `--top-k N`: number of results (default: 5)
- `--min-score F`: minimum cosine similarity threshold (default: 0.3)
- `--json`: output as JSON
- `--db`, `--api-base`, `--api-key`, `--model`: same as index
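The documented `--min-score` and `--top-k` behavior amounts to a filter-then-rank step; a small sketch of that logic (assumed, not the script's actual code):

```python
def rank_results(scored, top_k=5, min_score=0.3):
    """Drop results below the similarity threshold, then keep the best
    top_k by cosine similarity. Sketch of the documented CLI behavior."""
    kept = [r for r in scored if r[1] >= min_score]
    kept.sort(key=lambda r: r[1], reverse=True)
    return kept[:top_k]

scored = [("a.md", 0.91), ("b.md", 0.12), ("c.md", 0.55), ("d.md", 0.40)]
print(rank_results(scored, top_k=2))  # [('a.md', 0.91), ('c.md', 0.55)]
```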
Typical agent workflow
1. Run `index.py` on the workspace (once, or after file changes)
2. Run `search.py "query"` to find relevant snippets
3. Use the `read` tool to load full context from the returned file paths and line numbers
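The workflow above can be scripted; the sketch below builds the two commands and parses the `--json` output. The relative script paths and the JSON output shape are assumptions, not verified against the skill:

```python
import json
import subprocess

def build_commands(workspace, query, top_k=5):
    """Commands for the agent loop: index once, then search with --json
    so the results are machine-readable (paths assumed to be relative to
    the skill directory)."""
    index_cmd = ["python3", "scripts/index.py", workspace]
    search_cmd = ["python3", "scripts/search.py", query,
                  "--json", "--top-k", str(top_k)]
    return index_cmd, search_cmd

def run_search(workspace, query, top_k=5):
    index_cmd, search_cmd = build_commands(workspace, query, top_k)
    subprocess.run(index_cmd, check=True)             # 1. index the workspace
    out = subprocess.run(search_cmd, check=True,      # 2. search it
                         capture_output=True, text=True)
    return json.loads(out.stdout)                     # 3. parse hits for paths

print(build_commands("/tmp/ws", "deploy checklist", top_k=3))
```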