Private Knowledge Base

Store, search, and summarize concepts across your PDFs and papers with fast semantic search and cross-document Q&A.

MIT-0 · Free to use, modify, and redistribute. No attribution required.


by wirec@WIREC-yzx

Security Scan

VirusTotal: Suspicious

OpenClaw: Benign (high confidence)

✓ Purpose & Capability

Name/description match the included scripts and schema: ingestion, text extraction, simple search, and summarization workflows are implemented by the shell scripts and index schema. No unrelated credentials, binaries, or services are requested.

ℹ Instruction Scope

Scripts only read user-supplied PDF files and write extracted text, an embeddings folder, and index JSON under KB_ROOT (default ~/kb). Two items are noteworthy: (1) the metadata stores the full source path in the index JSON, which may reveal filesystem layout or sensitive path names; and (2) summarize.sh prints a suggested command using 'ollama run qwen3.5'. That is an external model invocation the README suggests but the scripts do not enforce. Otherwise, instructions are scoped to the stated purpose.

✓ Install Mechanism

No install spec — instruction-only with local shell scripts. Scripts rely on common local tools (pdftotext, python3, pypdf) but do not download or execute remote code. This is low-risk relative to other install types.

✓ Credentials

No required environment variables or credentials are declared. An optional KB_ROOT env var is used to choose storage location, which is proportionate. No other secrets or unrelated env vars are requested.

✓ Persistence & Privilege

always:false and user-invocable default. The skill does not request permanent system-wide presence, does not modify other skills' config, and only writes files under the configured KB_ROOT.

Assessment

This skill appears to do what it says and works only on local files, but consider the following before installing or running it:

- KB metadata stores the absolute source path in index JSON. If you later upload or share the KB, that may reveal filesystem layout or sensitive directory names. Consider setting KB_ROOT to a dedicated directory and reviewing index JSON files before sharing.
- The scripts will read any file you pass to them; only give them PDFs you trust. They write extracted text under KB_ROOT/docs and metadata under KB_ROOT/index.
- summarize.sh suggests using 'ollama run qwen3.5' (a local model runtime). That step is optional and not enforced by the scripts; if you run it, verify your ollama setup and understand whether that model is local or configured to call an external service.
- The scripts rely on pdftotext or python (pypdf). Installing those packages may be required; install them from well-known sources.
- If you plan to back up or share the KB, review contents for sensitive information (full paths, PII in extracted text) first.

Overall this is internally consistent and low-risk for local use, but be cautious about storing or sharing the generated index and text files.
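One way to act on the path-disclosure caveat is to scan the index for absolute paths before sharing. The sketch below is a hypothetical helper, not part of the skill: `list_path_leaks` is an invented name, and it assumes the metadata lives in `.json` files under the directory you pass (the assessment mentions KB_ROOT/index) and that a leaked path shows up as a JSON string value starting with `/`.

```shell
# Hypothetical helper (not shipped with the skill): list JSON files whose
# string values look like absolute paths, e.g. "source": "/home/me/x.pdf".
# Assumption: index metadata is stored as *.json under the given directory.
list_path_leaks() {
  dir=${1:-${KB_ROOT:-$HOME/kb}/index}
  # Match a colon, optional whitespace, then a quoted value beginning with "/".
  find "$dir" -type f -name '*.json' \
    -exec grep -lE ':[[:space:]]*"/' {} +
}
```

Called with no argument, `list_path_leaks` checks `$KB_ROOT/index` (or `~/kb/index`); any file it prints deserves a look before the KB is uploaded or shared.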


Current version: v1.0.0

latest: vk977d6zymdgzfpjfcskd02h6ts83aggh


License terms: https://spdx.org/licenses/MIT-0.html

SKILL.md

Private Knowledge Base

Personal document storage and retrieval system for PDFs, papers, and research documents.

Quick Start

Ingest Documents

# Add PDF to knowledge base
./scripts/ingest.sh ~/path/to/document.pdf

# Process entire folder
./scripts/ingest-folder.sh ~/papers/

Query Knowledge Base

# Search for concept across all documents
./scripts/search.sh "transformer architecture"

# Get summary of concept from relevant docs
./scripts/summarize.sh "attention mechanism"

Core Workflows

1. Document Ingestion

When the user provides new PDFs or papers:

- Create a document entry in kb/index.json
- Extract text and metadata
- Generate embeddings for semantic search
- Store in kb/docs/ with a normalized name
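The steps above do not say what a "normalized name" looks like or how text is extracted, so here is a minimal sketch under stated assumptions. `normalize_name` and `extract_text` are hypothetical names, the naming scheme (lowercase ASCII joined by hyphens) is an illustration only, and the extraction fallback simply mirrors the tools the security review lists (pdftotext, then python3 with pypdf). The skill's real ingest.sh may differ.

```shell
# Hypothetical helpers illustrating two ingestion steps; the skill's real
# ingest.sh may differ.

# Turn a PDF path into a filesystem-friendly name (assumed scheme:
# lowercase ASCII letters and digits joined by single hyphens).
normalize_name() {
  base=$(basename "$1")
  base=${base%.[Pp][Dd][Ff]}            # drop a trailing .pdf / .PDF
  printf '%s\n' "$base" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

# Extract text from PDF $1 into file $2, preferring pdftotext and
# falling back to pypdf (pages joined with form feeds).
extract_text() {
  if command -v pdftotext >/dev/null 2>&1; then
    pdftotext "$1" "$2"
  else
    python3 -c '
import sys
from pypdf import PdfReader
pages = PdfReader(sys.argv[1]).pages
print("\f".join((p.extract_text() or "") for p in pages))
' "$1" > "$2"
  fi
}
```

For example, `normalize_name "~/papers/My Paper (v2).PDF"` yields `my-paper-v2`. A file name made up entirely of non-ASCII characters would normalize to an empty string under this scheme, so a real script would need a fallback such as a content hash.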

2. Cross-Document Q&A

When the user asks "which document mentions X?" or "summarize X from my docs":

- Search embeddings for relevant passages
- Retrieve source documents
- Synthesize an answer across documents
- Cite sources with document names and page numbers
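To make the "cite sources with document names and page numbers" step concrete, the sketch below uses plain case-insensitive keyword matching as a stand-in for the embedding search, which is not shown in this document. `search_pages` is a hypothetical name. It assumes extracted text sits in `.txt` files under the docs directory and that pages are separated by form feeds, which is what pdftotext emits by default.

```shell
# Hypothetical keyword search that reports "document:pPAGE" for every page
# containing the term. This is substring matching, not semantic search.
search_pages() {
  term=$1
  docs_dir=${2:-${KB_ROOT:-$HOME/kb}/docs}
  for f in "$docs_dir"/*.txt; do
    [ -e "$f" ] || continue               # skip the literal glob when empty
    # RS is a form feed, so each awk record is one page and NR is its number.
    KB_TERM=$term awk -v RS='\f' -v name="$(basename "$f" .txt)" '
      BEGIN { needle = tolower(ENVIRON["KB_TERM"]) }
      index(tolower($0), needle) { printf "%s:p%d\n", name, NR }
    ' "$f"
  done
}
```

The search term is passed through the environment rather than `-v` so that backslashes in it are not reinterpreted by awk.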

3. Concept Linking

Build associations between documents:

- Shared concepts
- Citation relationships
- Topic clusters
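How the skill builds these associations is not described here. As a rough illustration of the "shared concepts" idea only, the sketch below lists the longer words two extracted text files have in common. `doc_terms` and `shared_terms` are hypothetical names, and the six-letter cutoff is an arbitrary assumption used to skip short filler words; embeddings or a proper keyword extractor would do better.

```shell
# Hypothetical helpers: approximate "shared concepts" by word overlap.

# Print the unique lowercase words of at least 6 letters in file $1.
doc_terms() {
  tr -cs '[:alpha:]' '\n' < "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | awk 'length($0) >= 6' \
    | LC_ALL=C sort -u
}

# Print the terms that appear in both file $1 and file $2.
shared_terms() {
  terms_a=$(mktemp) || return 1
  terms_b=$(mktemp) || return 1
  doc_terms "$1" > "$terms_a"
  doc_terms "$2" > "$terms_b"
  LC_ALL=C comm -12 "$terms_a" "$terms_b"   # lines common to both sorted lists
  rm -f "$terms_a" "$terms_b"
}
```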

File Structure

private-knowledge-base/
├── SKILL.md
├── scripts/
│   ├── ingest.sh          # Single document ingestion
│   ├── ingest-folder.sh   # Batch ingestion
│   ├── search.sh          # Semantic search
│   └── summarize.sh       # Cross-document summary
├── references/
│   └── schema.md          # KB index schema
└── kb/                    # Created at runtime
    ├── index.json
    ├── embeddings/
    └── docs/
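The runtime part of this layout could be created along the following lines. This is a sketch, not the skill's code: `init_kb` is a hypothetical name, and because references/schema.md is not reproduced here, an empty JSON object stands in for the real index format.

```shell
# Hypothetical helper: create the runtime directories and a placeholder index.
init_kb() {
  root=${1:-${KB_ROOT:-$HOME/kb}}
  mkdir -p "$root/docs" "$root/embeddings" || return 1
  # Placeholder only: the real index format is defined in references/schema.md.
  # An existing index.json is left untouched.
  [ -f "$root/index.json" ] || printf '{}\n' > "$root/index.json"
}
```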

Usage Examples

User: "Which of the documents I saved earlier mentions transformer?" → Run ./scripts/search.sh "transformer"

User: "Summarize what my documents say about attention" → Run ./scripts/summarize.sh "attention"

User: "Add this PDF to the knowledge base" → Run ./scripts/ingest.sh <pdf-path>

Configuration

Set knowledge base location:

export KB_ROOT=~/.openclaw/workspace/kb

Default: ~/kb if not set.
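Inside a script, the documented fallback can be expressed with a standard shell parameter expansion. The function name `kb_root` is hypothetical, and the real scripts may resolve the path differently.

```shell
# Hypothetical helper: print KB_ROOT if it is set and non-empty, else ~/kb.
kb_root() {
  printf '%s\n' "${KB_ROOT:-$HOME/kb}"
}
```

Note that `:-` also falls back to `~/kb` when KB_ROOT is set to an empty string, which avoids accidentally writing to paths like `/docs`.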

Files

6 total
