MNN Local Knowledge Base

v1.0.0

Local vector knowledge base with GraphRAG retrieval (vector + BM25 + knowledge graph). Use this skill when the user mentions: "查知识库", "加入知识库", "记住这个", "save...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for er6y/py-mnn-kb.

Prompt preview: Install & Setup
Install the skill "MNN Local Knowledge Base" (er6y/py-mnn-kb) from ClawHub.
Skill page: https://clawhub.ai/er6y/py-mnn-kb
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install py-mnn-kb

ClawHub CLI


npx clawhub@latest install py-mnn-kb
Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description (local MNN KB, GraphRAG) match the delivered artifacts: a Python implementation, CLI, and instructions for building/querying a local KB. Required components (MNN embedding backend, text parsers, optional LLM client) are appropriate for the stated functionality.
Instruction Scope
SKILL.md/README instruct the agent to index local files, insert notes, and return retrieved context. The README and code also permit using an LLM for generation (configurable via config.json / --no-llm), which means retrieved private context may be sent to a remote OpenAI-compatible endpoint if you enable LLM answering: a normal feature for RAG, but an important privacy consideration. There is a small inconsistency: SKILL.md states that kb_query makes no LLM call inside the tool, yet other docs and the config include an llm_api section and examples that perform LLM generation. Confirm the desired behavior (use --no-llm if you want pure local retrieval).
Install Mechanism
No formal install spec in registry (instruction-only), but code auto-downloads an embedding model (~400 MB) from ModelScope (modelscope.cn) on first run via urllib.request. Downloading from ModelScope is expected for an embedding backend; this does write model files to disk. The skill asks you to pip install requirements.txt (standard).
Credentials
The skill does not require platform environment variables, but it expects a local config.json with an llm_api.api_key and base_url if you want LLM generation. Storing the API key in config.json (gitignored by the project) is consistent but means a secret is kept on disk. The presence of openai (or OpenAI-compatible) client is justified by the optional LLM step; however, enabling it will send KB context and user queries to whatever endpoint you configure. If you do not want outbound data leakage, use --no-llm or provide a local LLM endpoint you trust.
Persistence & Privilege
always:false and the skill does not request force-inclusion or system-wide config changes. It writes its own artifacts (assets/, knowledge_bases/, downloaded model) in its directory and updates the model's llm_config.json for embedding tokenization alignment — expected for this use case. It does not modify other skills or system-wide agent settings.
Assessment
This skill appears to do what it says: build and query a local MNN-based KB and (optionally) call an LLM for answers. Before installing, consider:

  1. Model download: the first run auto-downloads a ~400 MB embedding model from ModelScope (modelscope.cn) into the repo's assets directory; run this on a machine and network where you are comfortable downloading large binaries.
  2. Secrets: the tool expects an API key in config.json for LLM generation; that file is written to disk (it is gitignored by the project). Do not put highly sensitive keys there unless you control the environment.
  3. Data exfiltration: if you enable the LLM generation path (the default examples do), retrieved KB context and queries are sent to the configured OpenAI-compatible endpoint. Use --no-llm or point to a trusted/local LLM if KB content must stay local.
  4. Code review: the model download and LLM invocation are visible in scripts/py_mnn_kb.py; no obfuscated endpoints or hidden backdoors were detected.
  5. Isolation: run in a virtualenv or container, and inspect config.json and the repo before giving the skill access to private documents.


Tags: knowledge, latest, local, mnn, offline, retrieval
143 downloads · 0 stars · 1 version · Updated 1 month ago
v1.0.0
MIT-0

py_mnn_kb — MNN Knowledge Base Skill

Local GraphRAG knowledge base backed by SQLite + MNN embeddings. Fully compatible with Android OfflineAI RAG database format.


Setup

1. Install dependencies

pip install -r requirements.txt

2. Configure

cp config.example.json config.json
# Edit config.json: set llm_api.api_key, optionally change default_name

Key fields in config.json:

| Field | Default | Description |
|---|---|---|
| knowledge_base.default_name | default | KB used when --kb is omitted |
| knowledge_base.storage_dir | assets/knowledge_bases | Where DB files are stored |
| llm_api.api_key | (required for query+LLM) | OpenAI-compatible API key |
| graph_ner.custom_dict_path | assets/example_terms.json | Domain terminology for NER |
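Based on the fields listed above, a minimal config.json might look like the sketch below. The nesting is assumed from the dotted field names; the project's own config.example.json is authoritative, and the api_key and base_url values here are placeholders.

```json
{
  "knowledge_base": {
    "default_name": "my_kb",
    "storage_dir": "assets/knowledge_bases"
  },
  "llm_api": {
    "api_key": "sk-REPLACE-ME",
    "base_url": "https://api.example.com/v1"
  },
  "graph_ner": {
    "custom_dict_path": "assets/example_terms.json"
  }
}
```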

3. First run (auto-downloads embedding model)

python scripts/py_mnn_kb.py status

On first use, Qwen3-Embedding-0.6B-MNN-int4 (~400 MB) is auto-downloaded into assets/.
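If you want to check whether the model is already present before first use (for example, to avoid a surprise 400 MB download on a metered connection), a small sketch follows. The directory name is an assumption derived from the model name above; verify the actual layout under assets/ after a first run.

```python
from pathlib import Path

# Assumed location: the README says the model is auto-downloaded into assets/.
MODEL_DIR = Path("assets") / "Qwen3-Embedding-0.6B-MNN-int4"

def model_present(base: Path = MODEL_DIR) -> bool:
    """Return True if the embedding model directory exists and is non-empty."""
    return base.is_dir() and any(base.iterdir())

if not model_present():
    print("Model not found; first run will download ~400 MB from ModelScope.")
```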


Tools

kb_build — Build / append knowledge base from files

Indexes a directory of documents. Runs in background; returns immediately. Check progress with kb_status.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| dir_path | string | yes | Directory path to index (recursive) |
| kb_name | string | no | KB name (default: value of default_name in config.json) |

Returns: { status, command, kb_name, pid, files, message }

Supported formats: .txt .md .pdf .docx .pptx .xlsx .csv .html .json .jsonl
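As an illustration of the supported-format list, here is a sketch of how one might pre-filter a directory before calling kb_build. The extension set is copied from the line above; the skill's own parsers decide what is actually indexed.

```python
from pathlib import Path

# Extension list copied from the README; the skill's parsers are authoritative.
SUPPORTED = {".txt", ".md", ".pdf", ".docx", ".pptx",
             ".xlsx", ".csv", ".html", ".json", ".jsonl"}

def indexable_files(root: str) -> list[Path]:
    """Recursively collect files kb_build is documented to handle."""
    return sorted(p for p in Path(root).rglob("*")
                  if p.is_file() and p.suffix.lower() in SUPPORTED)
```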

CLI:

python scripts/py_mnn_kb.py build ./my_docs/ --kb my_kb
python scripts/py_mnn_kb.py build ./my_docs/          # uses default KB name

Trigger phrases: "加入知识库" (add to the knowledge base), "索引这个目录" (index this directory), "build KB", "index these files"


kb_note — Insert a text note directly into the knowledge base

Embeds and stores a free-form text snippet. Synchronous. Refused while build is running.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| text | string | yes | Text content to store |
| kb_name | string | no | KB name (default: default_name) |
| title | string | no | Optional title, stored as source label |

Returns: { status, kb_name, chunks_added, elapsed_sec }

CLI:

python scripts/py_mnn_kb.py note "Q1 roadmap: focus on modules A and B" --kb my_kb
python scripts/py_mnn_kb.py note "$(cat meeting.txt)" --kb my_kb --title "Weekly meeting"

Trigger phrases: "记住这个" (remember this), "记录一下" (note this down), "加个笔记" (add a note), "save this", "remember this"


kb_query — Retrieve relevant chunks (RAG retrieval)

Runs vector + BM25 + GraphRAG fusion and returns the top-N context string. The agent appends this context to its prompt — no LLM call is made inside this tool. Synchronous. Refused while build is running.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| prompt | string | yes | Query question or keywords |
| kb_name | string | no | KB name (default: default_name) |

Returns: Multi-document context string, e.g.:

Document1 [ID:42 source:manual.pdf]:
Deployment has three steps...

Document2 [ID:55 source:notes.md]:
...
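Assuming the header format shown above is stable, the context string can be split back into structured chunks with a sketch like the following. The regex mirrors the `DocumentN [ID:x source:y]:` pattern from the example; adjust it if the real output differs.

```python
import re

# Header pattern assumed from the README's example output.
HEADER = re.compile(r"^Document(\d+) \[ID:(\d+) source:(.+?)\]:$", re.MULTILINE)

def parse_context(context: str) -> list[dict]:
    """Split a kb_query context string into {doc, id, source, text} chunks."""
    chunks = []
    matches = list(HEADER.finditer(context))
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(context)
        chunks.append({
            "doc": int(m.group(1)),
            "id": int(m.group(2)),
            "source": m.group(3),
            "text": context[m.end():end].strip(),
        })
    return chunks
```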

CLI:

python scripts/py_mnn_kb.py query "core NAND screening workflow" --kb my_kb --no-llm
python scripts/py_mnn_kb.py --output json query "product roadmap" --kb my_kb

Agent usage pattern:

context = kb_query("the user's question", kb_name="my_kb")
# Then: f"Based on the following context:\n{context}\n\nQuestion: {user_question}"

Trigger phrases: "查知识库" (search the knowledge base), "查一下" (look it up), "知识库里有没有" (does the KB have...), "search KB"


kb_status — Check build progress or last build result

No KB initialization needed. Always returns instantly.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| kb_name | string | no | (informational only, does not affect result) |

Returns:

  • While building: { status: "building", progress: 0-100, message }
  • After success: { status: "ok", message, stats: { chunks_added, elapsed_sec, ... } }
  • After failure: { status: "error", error }
  • Not yet run: { status: "idle", message }
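Given the return shapes above, an agent-side dispatch might look like this minimal sketch. Field names are taken from the list above; anything missing falls back to a placeholder.

```python
def describe_status(st: dict) -> str:
    """Turn a kb_status result into a one-line summary for the user."""
    status = st.get("status")
    if status == "building":
        return f"Build in progress: {st.get('progress', 0)}%"
    if status == "ok":
        stats = st.get("stats", {})
        return (f"Build finished: {stats.get('chunks_added', '?')} chunks "
                f"in {stats.get('elapsed_sec', '?')}s")
    if status == "error":
        return f"Build failed: {st.get('error')}"
    return st.get("message", "No build has run yet.")
```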

CLI:

python scripts/py_mnn_kb.py status

Trigger phrases: "构建进度" (build progress), "build status", "知识库建好了吗" (is the knowledge base ready?)


Workflow Examples

A · User uploads files → auto-index

User: "Add these documents to the knowledge base"
Agent → save files to temp dir
      → kb_build(dir_path=tmp_dir, kb_name="my_kb")   # returns immediately
      → "Background build started; check progress with kb_status"

B · User dictates a note → insert

User: "Remember: STAR2000 low-temperature write performance improved by 8%"
Agent → kb_note(text="STAR2000 low-temperature write performance improved by 8%", kb_name="my_kb", title="Technical finding")
      → "Saved to knowledge base my_kb"

C · User asks a question → KB-assisted answer

User: "What is the core NAND screening workflow?"
Agent → context = kb_query("core NAND screening workflow", kb_name="my_kb")
      → append context to LLM prompt → generate answer

D · Check if build finished before querying

Agent → st = kb_status()
      → if st["status"] == "building": tell user to wait
      → else: proceed with kb_query(...)
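Pattern D above can be sketched as a simple wait loop. The kb_status and kb_query parameters here stand in for the real tool calls (injected callables, a hypothetical arrangement for illustration), which keeps the sketch tool-agnostic and testable.

```python
import time

def wait_then_query(kb_status, kb_query, prompt, poll_sec=2.0, timeout_sec=600):
    """Poll kb_status until the build finishes, then run the query.

    kb_status / kb_query are injected callables standing in for the tools.
    """
    deadline = time.monotonic() + timeout_sec
    while time.monotonic() < deadline:
        st = kb_status()
        if st["status"] != "building":
            return kb_query(prompt)
        time.sleep(poll_sec)
    raise TimeoutError("knowledge base build did not finish in time")
```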

Notes

  • kb_build is incremental append — re-running on the same directory adds only new content
  • kb_note and kb_query are blocked (return status: building) while a build is running
  • --output json on any CLI command returns machine-parseable JSON on stdout
  • KB name default is used when --kb is omitted; configure default_name in config.json

Comments

Loading comments...