Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Semantic Memory

v1.0.0

Chinese long-term memory system for OpenClaw Agent. Hybrid three-track retrieval (jieba TF-IDF + vector search + keyword boosting) with Chinese semantics first; supports multi-agent memory collaboration. Trigger words: 向量数据库 (vector database), 记忆检索 (memory retrieval), 长期记忆 (long-term memory), 语义搜索 (semantic search), vector search, memory retrieval


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for jackxc2026/semantic-memory.

Prompt preview: Install & Setup
Install the skill "Semantic Memory" (jackxc2026/semantic-memory) from ClawHub.
Skill page: https://clawhub.ai/jackxc2026/semantic-memory
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install semantic-memory

ClawHub CLI


npx clawhub@latest install semantic-memory
Security Scan
VirusTotal: Suspicious
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The name and description (Chinese-focused hybrid TF-IDF + vector memory) match the included code: vector_search.py, import_memory.py, and a ChromaDB start script implement exactly that behavior. The dependencies (chromadb, jieba) and the local cache/TF-IDF index are consistent with the described purpose.
Instruction Scope
Runtime instructions tell users to run a ChromaDB HTTP server (defaults to --host 0.0.0.0) and to import files from a filesystem directory. import_memory.py can recursively read and upload any .md files from a specified path (user-supplied or default), which could cause accidental ingestion of sensitive local data if misused. The start script and README default to binding the DB to 0.0.0.0 (network-exposed) without providing an example of securing it; that broad network exposure is a security concern.
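One practical mitigation for the ingestion risk is to pre-filter paths before handing a directory to import_memory.py. The sketch below is hypothetical (safe_md_paths is not part of the skill); it simply refuses roots that fall outside an allowed base directory:

```python
from pathlib import Path

def safe_md_paths(root: str, allowed_base: str) -> list[Path]:
    """Return .md files under root, refusing roots outside allowed_base.

    Hypothetical pre-check; the skill itself performs no such filtering.
    """
    base = Path(allowed_base).resolve()
    root_path = Path(root).resolve()
    # Reject roots that escape the allowed base (e.g. ~, /etc, symlinks)
    if base != root_path and base not in root_path.parents:
        raise ValueError(f"{root_path} is outside {base}")
    return sorted(p for p in root_path.rglob("*.md") if p.is_file())
```

Collecting the paths yourself and passing only vetted files avoids pointing the importer at a sensitive directory by accident.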
Install Mechanism
No install spec (instruction-only) and included scripts are plain Python/bash files. Nothing is downloaded from unknown URLs or executed from remote sources during install. This limits supply-chain risk, though running the provided commands will write log/cache files to disk.
Credentials
The skill does not request secrets or credentials and uses a small set of environment variables (CHROMA_HOST/CHROMA_PORT/CHROMA_PATH/TFIDF_CACHE) which are reasonable for configuring a local DB. However, defaults (host=0.0.0.0, port=8000, path=./vector_db, cache dir ./tfidf_cache) are permissive and can expose data if not adjusted. No unexplained credentials or external endpoints are requested.
Persistence & Privilege
always is false and the skill does not attempt to modify other skills or global agent config. It will create local cache files and logs (TF-IDF pickle cache, chroma_server.log). A notable persistence risk: TF-IDF cache is serialized with Python pickle and later unpickled; if an attacker can overwrite the cache file, unpickling could execute arbitrary code.
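The pickle risk can be reduced by signing the cache and verifying the signature before unpickling. A minimal sketch, assuming the secret key is stored out of band (dump_signed/load_signed are illustrative names, not part of the skill):

```python
import hashlib
import hmac
import pickle

# Hypothetical: in practice, load this key from a protected location
# rather than hard-coding it.
CACHE_KEY = b"replace-with-a-secret-key"

def dump_signed(obj, path: str) -> None:
    """Serialize obj with pickle and prepend an HMAC-SHA256 tag."""
    payload = pickle.dumps(obj)
    tag = hmac.new(CACHE_KEY, payload, hashlib.sha256).digest()
    with open(path, "wb") as f:
        f.write(tag + payload)

def load_signed(path: str):
    """Unpickle only if the stored HMAC matches; raise otherwise."""
    with open(path, "rb") as f:
        blob = f.read()
    tag, payload = blob[:32], blob[32:]
    expected = hmac.new(CACHE_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("cache HMAC mismatch; refusing to unpickle")
    return pickle.loads(payload)
```

A tampered cache file then fails the HMAC check before pickle ever sees the bytes, instead of executing attacker-controlled code on load.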
What to consider before installing
This skill implements the described memory/search functionality, but take these precautions before installing or running it:

  1. Do not run the ChromaDB server bound to 0.0.0.0 on a machine you don't trust or that is network-accessible; prefer localhost (127.0.0.1) or put it behind authentication or a proxy.
  2. Be careful with import_memory.py: if you pass a directory path, it will recursively read and upload .md files; avoid pointing it at system, home, or other sensitive directories.
  3. The TF-IDF cache uses Python pickle; treat the cache directory as sensitive and ensure it is not writable by untrusted users or processes (an attacker-modified pickle could lead to code execution on load).
  4. Review and, if needed, harden the defaults in start_chroma.sh and the README (host, port, API auth) before use.
  5. Only run this skill and its scripts if you trust the source; if unsure, ask the author to change the defaults to bind to localhost and to use a safer cache format (e.g., JSON) or to validate pickle integrity.
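The first precaution can be applied directly to the server start command; this loopback-only variant keeps the database unreachable from the network (the path and port follow the README defaults):

```shell
# Bind ChromaDB to loopback only, instead of the network-exposed 0.0.0.0
chroma run --path ./vector_db --host 127.0.0.1 --port 8000 &
```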


Latest version: vk97aa8p4phv3rghc3v7jf2gxrd84pr25
93 downloads · 0 stars · 1 version · updated 2 weeks ago
v1.0.0
MIT-0

Semantic Memory: Chinese Semantic Memory System

Chinese long-term memory infrastructure for OpenClaw Agent | v1.0.0

Skill Overview

A Chinese long-term memory retrieval system built for OpenClaw Agent. Three core innovations:

  1. Chinese semantics first: jieba TF-IDF replaces vector-only retrieval, greatly improving Chinese understanding
  2. Hybrid three-track: weighted TF-IDF × vector × keyword, with a measured 100% hit rate
  3. Automatic agent routing: intent is recognized and routed to the matching memory collection

Core Files

File                        Purpose
scripts/vector_search.py    ⭐ Core retrieval script
scripts/import_memory.py    ⭐ Memory import script
scripts/start_chroma.sh     ChromaDB server start script
README.md                   Full project documentation

Quick Start

1. Install dependencies

pip install chromadb jieba

2. Start ChromaDB

chroma run --path ./vector_db --host 0.0.0.0 --port 8000 &

3. Import memories

python3 scripts/import_memory.py

4. Search

python3 scripts/vector_search.py "your query"

Workflow

User query
    │
    ▼
Automatic agent routing (keywords matched to a collection)
    │
    ▼
Precomputed TF-IDF index (jieba tokenization)
    ├─→ Chinese semantic similarity (primary)
    │
ChromaDB vector retrieval
    ├─→ semantic expansion (supplementary)
    │
Keyword-hit boosting (source title matching)
    │
    ▼
Combined score = 0.45 × vector + 0.55 × TF-IDF + boost
    │
    ▼
Output: top 6 results
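The TF-IDF track in this pipeline can be illustrated with a self-contained sketch. Documents here are pre-tokenized lists (in the skill, jieba.lcut would produce these tokens), and the scoring is a generic TF-IDF cosine, which is an assumption rather than the skill's exact formula:

```python
import math
from collections import Counter

def tfidf_vectors(token_docs: list[list[str]]) -> list[dict[str, float]]:
    """Build sparse TF-IDF weight dicts for pre-tokenized documents."""
    n = len(token_docs)
    # Document frequency: in how many docs does each token appear?
    df = Counter(t for doc in token_docs for t in set(doc))
    vecs = []
    for doc in token_docs:
        tf = Counter(doc)
        # Smoothed IDF keeps weights finite for tokens in every document
        vecs.append({t: (c / len(doc)) * math.log((1 + n) / (1 + df[t]))
                     for t, c in tf.items()})
    return vecs

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse TF-IDF vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0
```

Documents sharing tokens with the query score above zero; disjoint documents score exactly zero, which is why the keyword track and vector track are blended in on top.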

API Usage

import sys
sys.path.insert(0, 'scripts')  # make the skill's scripts importable
from vector_search import search

results = search("跌倒检测老人", topk=6)  # query: "fall detection for the elderly"
for r in results:
    print(r['source'], r['combined'], r['doc'][:100])

Configuration

Agent routing rules

Edit the following in scripts/vector_search.py:

AGENT_KEYWORDS = {
    'your_agent': ['keyword1', 'keyword2'],
}
AGENT_COLLECTION = {'your_agent': 'projects'}
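Given AGENT_KEYWORDS and AGENT_COLLECTION, routing can be as simple as scanning the query for each agent's keywords. The agent names, keywords, route_query helper, and the 'default' fallback below are hypothetical examples, not the skill's actual code:

```python
# Hypothetical routing tables, shaped like the ones in vector_search.py
AGENT_KEYWORDS = {
    "health_agent": ["跌倒", "检测", "老人"],
    "project_agent": ["项目", "进度"],
}
AGENT_COLLECTION = {
    "health_agent": "health",
    "project_agent": "projects",
}

def route_query(query: str, default: str = "default") -> str:
    """Return the collection of the agent with the most keyword hits."""
    best_agent, best_hits = None, 0
    for agent, keywords in AGENT_KEYWORDS.items():
        hits = sum(1 for kw in keywords if kw in query)
        if hits > best_hits:
            best_agent, best_hits = agent, hits
    return AGENT_COLLECTION[best_agent] if best_agent else default
```

Substring matching works for Chinese without tokenization because keywords need no word-boundary detection; a fallback collection catches queries that hit no agent.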

Weight tuning

combined = 0.45 * vec_sim + 0.55 * tfidf_norm + boost
# Raise 0.55 → weight exact Chinese keyword matches more heavily
# Raise 0.45 → weight semantic expansion more heavily
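The fused score can be sketched end-to-end. Min-max normalization of the raw TF-IDF score into tfidf_norm is an assumption; this README excerpt does not define how tfidf_norm is computed:

```python
def combine_scores(vec_sim: float, tfidf_score: float,
                   tfidf_min: float, tfidf_max: float,
                   boost: float = 0.0) -> float:
    """Weighted fusion: 0.45 x vector sim + 0.55 x normalized TF-IDF + boost.

    Assumes min-max normalization over the current result batch; the
    skill's actual normalization may differ.
    """
    span = tfidf_max - tfidf_min
    tfidf_norm = (tfidf_score - tfidf_min) / span if span else 0.0
    return 0.45 * vec_sim + 0.55 * tfidf_norm + boost
```

Normalizing TF-IDF before fusion matters because raw TF-IDF scores and cosine similarities live on different scales; without it, one track would dominate regardless of the 0.45/0.55 weights.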

Performance Benchmarks

Metric                    Value
Chinese query hit rate    100% (10/10)
Average latency           0.8 s per query
Chinese support           ✅ jieba tokenization
Multi-agent support       ✅ automatic routing
No Docker/GPU needed      ✅ pure pip

Tech Stack

  • ChromaDB 1.0 (vector database)
  • jieba 0.42 (Chinese tokenizer)
  • Python 3.10+

Known Limitations

  • The embedding model is English (all-MiniLM-L6-v2); Chinese semantics rely mainly on TF-IDF to compensate
  • Sharing ChromaDB files across machines requires configuring API authentication
  • The cache is keyed on file paths; Windows compatibility is untested

License

MIT. Attribution is all that's required; use and derivative development are welcome.
