Private Knowledge Base
Store, search, and summarize concepts across your PDFs and papers with fast semantic search and cross-document Q&A.
MIT-0 · Free to use, modify, and redistribute. No attribution required.
by wirec@WIREC-yzx
Security Scan
OpenClaw
Benign
high confidence
Purpose & Capability
Name/description match the included scripts and schema: ingestion, text extraction, simple search, and summarization workflows are implemented by the shell scripts and index schema. No unrelated credentials, binaries, or services are requested.
Instruction Scope
Scripts only read user-supplied PDF files and write extracted text, an embeddings folder, and index JSON under KB_ROOT (default ~/kb). Two items are worth noting: (1) the metadata stores the full source path in the index JSON, which may reveal filesystem layout or sensitive path names, and (2) summarize.sh prints a suggested command using 'ollama run qwen3.5'; this is an external model invocation that the README suggests but the scripts do not enforce. Otherwise the instructions are scoped to the stated purpose.
Install Mechanism
No install spec — instruction-only with local shell scripts. Scripts rely on common local tools (pdftotext, python3, pypdf) but do not download or execute remote code. This is low-risk relative to other install types.
Credentials
No required environment variables or credentials are declared. An optional KB_ROOT env var is used to choose storage location, which is proportionate. No other secrets or unrelated env vars are requested.
Persistence & Privilege
The skill sets always:false and is user-invocable by default. It does not request permanent system-wide presence, does not modify other skills' config, and only writes files under the configured KB_ROOT.
Assessment
This skill appears to do what it says and works only on local files, but consider the following before installing or running it:
- KB metadata stores the absolute source path in index JSON — if you later upload the KB or share it, that may reveal filesystem layout or sensitive directory names. Consider setting KB_ROOT to a dedicated directory and reviewing index JSON files before sharing.
- The scripts will read any file you pass to them; only give them PDFs you trust. They write extracted text under KB_ROOT/docs and metadata under KB_ROOT/index.
- summarize.sh suggests using 'ollama run qwen3.5' (a local model runtime). That step is optional and not enforced by the scripts; if you run it, verify your ollama setup and understand whether that model is local or configured to call an external service.
- The scripts rely on pdftotext or python (pypdf). Installing those packages may be required; install them from well-known sources.
- If you plan to back up or share the KB, review contents for sensitive information (full paths, PII in extracted text) first.
Overall this is internally consistent and low-risk for local use, but be cautious about storing or sharing the generated index and text files.
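One way to act on the advice about absolute paths is to scan the KB for JSON files that contain a given path prefix before sharing it. The sketch below is an illustration only: `find_path_leaks` is a hypothetical helper, not part of the skill, and it assumes nothing about the index schema beyond the files ending in `.json`.

```shell
# Hypothetical helper, not shipped with the skill.
# List JSON files under a KB directory that contain the given literal
# path prefix (for example your home directory).
find_path_leaks() {
  local root="$1" prefix="$2"
  # -F treats the prefix as a literal string, -l prints only file names.
  find "$root" -type f -name '*.json' -exec grep -lF -- "$prefix" {} +
}
```

A typical call would be `find_path_leaks "${KB_ROOT:-$HOME/kb}" "$HOME"`; any file it prints deserves a manual look before the KB is uploaded or shared.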
Current version: v1.0.0
SKILL.md
Private Knowledge Base
Personal document storage and retrieval system for PDFs, papers, and research documents.
Quick Start
Ingest Documents
# Add PDF to knowledge base
./scripts/ingest.sh ~/path/to/document.pdf
# Process entire folder
./scripts/ingest-folder.sh ~/papers/
Query Knowledge Base
# Search for concept across all documents
./scripts/search.sh "transformer architecture"
# Get summary of concept from relevant docs
./scripts/summarize.sh "attention mechanism"
Core Workflows
1. Document Ingestion
When user provides new PDFs or papers:
- Create document entry in kb/index.json
- Extract text and metadata
- Generate embeddings for semantic search
- Store in kb/docs/ with normalized name
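The name normalization and text extraction steps above can be sketched roughly as follows. This is not the contents of ingest.sh: `normalize_name` and `extract_text` are hypothetical helpers, the normalization rule (lowercase, hyphen-separated) is an assumption, and the index and embedding steps are left out because their formats are not described here.

```shell
# Hypothetical helpers; the real ingest.sh may work differently.

# Turn "My Paper (v2).pdf" into "my-paper-v2": drop the directory and the
# extension, lowercase, and collapse each run of non-alphanumeric
# characters into a single hyphen.
normalize_name() {
  local base
  base="$(basename "$1")"
  base="${base%.*}"
  printf '%s' "$base" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

# Extract text from a PDF into $KB_ROOT/docs/<normalized-name>.txt.
# Requires pdftotext (from poppler-utils) on PATH.
extract_text() {
  local pdf="$1" root="${KB_ROOT:-$HOME/kb}" out
  out="$root/docs/$(normalize_name "$pdf").txt"
  mkdir -p "$root/docs"
  pdftotext -layout "$pdf" "$out"
}
```

Keeping the normalization in its own function makes it easy to check how a given file name will be stored before anything is written.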
2. Cross-Document Q&A
When user asks "which document mentions X?" or "summarize X from my docs":
- Search embeddings for relevant passages
- Retrieve source documents
- Synthesize answer across documents
- Cite sources with document names and page numbers
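As a rough illustration of the retrieval step, the sketch below does a plain keyword search over the extracted text files. It is a stand-in, not the skill's search.sh: it uses literal matching with grep rather than embeddings, it reports line numbers rather than page numbers, and `keyword_search` and `docs_mentioning` are hypothetical names.

```shell
# Hypothetical helpers; a keyword fallback, not semantic search.

# Print "file:line:text" for every case-insensitive literal match of the
# query in the extracted text under $KB_ROOT/docs.
keyword_search() {
  local query="$1" root="${KB_ROOT:-$HOME/kb}"
  grep -rniF -- "$query" "$root/docs"
}

# List only the documents that mention the query, one path per line.
docs_mentioning() {
  local query="$1" root="${KB_ROOT:-$HOME/kb}"
  grep -rliF -- "$query" "$root/docs" | sort
}
```

`docs_mentioning "transformer"` answers the "which document mentions X?" question, while `keyword_search` gives the passages that a summarization step could then work from.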
3. Concept Linking
Build associations between documents:
- Shared concepts
- Citation relationships
- Topic clusters
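How the skill links documents is not spelled out here, so the following is only one possible reading of "shared concepts": list the words that two extracted text files have in common. Both `terms_of` and `shared_terms` are hypothetical helpers, and dropping words shorter than four letters is a crude stop-word filter chosen for the sketch.

```shell
# Hypothetical helpers; one simple interpretation of "shared concepts".

# Print the lowercase words of a text file, one per line, deduplicated.
# Words shorter than four letters are dropped as a crude stop-word filter.
terms_of() {
  tr -cs '[:alpha:]' '\n' < "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | awk 'length($0) >= 4' \
    | LC_ALL=C sort -u
}

# Print the terms that appear in both files, in sorted order.
shared_terms() {
  local a b
  a="$(mktemp)"
  b="$(mktemp)"
  terms_of "$1" > "$a"
  terms_of "$2" > "$b"
  # comm -12 keeps only lines present in both sorted inputs.
  LC_ALL=C comm -12 "$a" "$b"
  rm -f "$a" "$b"
}
```

Word overlap is a weak signal compared with embeddings or citation parsing, but it is cheap to compute and easy to inspect by hand.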
File Structure
private-knowledge-base/
├── SKILL.md
├── scripts/
│   ├── ingest.sh          # Single document ingestion
│   ├── ingest-folder.sh   # Batch ingestion
│   ├── search.sh          # Semantic search
│   └── summarize.sh       # Cross-document summary
├── references/
│   └── schema.md          # KB index schema
└── kb/                    # Created at runtime
    ├── index.json
    ├── embeddings/
    └── docs/
Usage Examples
User: "Which of the documents I saved earlier mentions transformer?"
→ Run ./scripts/search.sh "transformer"
User: "Summarize what my documents say about attention"
→ Run ./scripts/summarize.sh "attention"
User: "Add this PDF to the knowledge base"
→ Run ./scripts/ingest.sh <pdf-path>
Configuration
Set knowledge base location:
export KB_ROOT=~/.openclaw/workspace/kb
Default: ~/kb if not set.
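The default can be expressed with ordinary shell parameter expansion. The snippet below is a sketch of how a script might resolve the location and create the runtime directories; `kb_root` and `init_kb` are hypothetical names, and the actual scripts may handle this differently.

```shell
# Hypothetical helpers illustrating the KB_ROOT default.

# Resolve the knowledge base location: KB_ROOT when set, otherwise ~/kb.
kb_root() {
  printf '%s\n' "${KB_ROOT:-$HOME/kb}"
}

# Create the docs/ and embeddings/ directories under the resolved root.
# index.json is left to the ingestion scripts, since its schema lives in
# references/schema.md.
init_kb() {
  local root
  root="$(kb_root)"
  mkdir -p "$root/docs" "$root/embeddings"
}
```

Pointing KB_ROOT at a dedicated directory keeps the extracted text and index files in one place that is easy to review or exclude from backups.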
