Knowledge Base SOP

An SOP for ingesting, archiving, linting, and maintaining a personal wiki knowledge base. Use it when processing raw documents, bookmarks, or web content into structured Markdown wiki pages, when performing periodic quality audits (dead links, orphan pages, trust validation), or when managing a raw/wiki/outputs directory structure. It triggers on new raw content intake, bookmark pipeline runs, and knowledge base maintenance tasks.

Install

openclaw skills install knowledge-base-sop

Directory Structure

  • raw/ — Original, unprocessed materials (web scrapes, bookmark exports, PDFs). Never modify directly.
  • wiki/ — Cleaned, structured core knowledge base (Markdown).
  • outputs/ — Temporary reports, drafts, or exported files.
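
As a minimal sketch of this layout, the three directories can be created under a knowledge base root. The function name init_kb and the idea of a single root folder are assumptions made for illustration; the SOP itself only names the three directories.

```python
from pathlib import Path

# Directory names taken from the SOP; the root location is an assumption.
KB_DIRS = ("raw", "wiki", "outputs")


def init_kb(root):
    """Create raw/, wiki/, and outputs/ under root and return their paths."""
    root = Path(root)
    paths = {}
    for name in KB_DIRS:
        path = root / name
        # exist_ok=True makes the call safe to repeat on an existing layout.
        path.mkdir(parents=True, exist_ok=True)
        paths[name] = path
    return paths
```

Because nothing is deleted or overwritten, rerunning the function leaves existing files in raw/ untouched.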

1. Ingest & Compile

Trigger: New files appear in raw/.

  • Extract core points, entities (names/projects/tech stacks), and logical relationships.
  • Split long text into short paragraphs and format it as structured Markdown (H2/H3 headings, key terms in bold).
  • Noise rule: Discard content under 300 words or with no substantive technical/knowledge value. Log the discard.
  • Hallucination guard: Strictly summarize from source. Never fabricate data or conclusions not in the original.
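
The word-count half of the noise rule can be sketched as below. This is an illustration only: the function and logger names are hypothetical, words are counted as whitespace-separated tokens (an assumption), and the judgment about "substantive value" is left to a reviewer or a separate step.

```python
import logging

logger = logging.getLogger("kb.ingest")  # hypothetical logger name

MIN_WORDS = 300  # threshold stated in the noise rule


def word_count(text):
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def passes_noise_rule(name, text, min_words=MIN_WORDS):
    """Return False and log the discard when text has fewer than min_words words."""
    count = word_count(text)
    if count < min_words:
        logger.info("discarded %s: %d words (< %d)", name, count, min_words)
        return False
    return True
```

A file with exactly 300 words is kept, since the rule discards only content under 300 words.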

2. Archive

  • File naming: YYYY-MM-DD-core-topic-source.md
  • Obsidian wikilinks: Use [[Concept Name]] for cross-references in wiki pages.
  • Dedup: Before writing, search wiki/ for titles with >90% similarity. If a match is found, do not create a new file; append the new content to the bottom of the existing page as a timestamped "supplemental update".
  • See references/archive-rules.md for detailed dedup logic.
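
The naming and dedup rules above might look like the following sketch. The detailed dedup logic lives in references/archive-rules.md, which is not reproduced here, so the similarity measure is an assumption: Python's difflib.SequenceMatcher ratio stands in for ">90% similarity". The helper names (slugify, archive_filename, find_similar_title) are hypothetical.

```python
import re
from datetime import date
from difflib import SequenceMatcher


def slugify(text):
    """Lowercase text and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def archive_filename(day, topic, source):
    """Build a name in the YYYY-MM-DD-core-topic-source.md pattern."""
    return f"{day.isoformat()}-{slugify(topic)}-{slugify(source)}.md"


def find_similar_title(title, existing_titles, threshold=0.90):
    """Return the closest existing title above threshold, or None.

    SequenceMatcher.ratio() is an assumed stand-in for the similarity
    measure defined in references/archive-rules.md.
    """
    best, best_ratio = None, 0.0
    for candidate in existing_titles:
        ratio = SequenceMatcher(None, title.lower(), candidate.lower()).ratio()
        if ratio > threshold and ratio > best_ratio:
            best, best_ratio = candidate, ratio
    return best
```

When find_similar_title returns a title, the new content would be appended to that page as a supplemental update; when it returns None, a new file is written under the generated name.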

3. Bookmark Pipeline

Trigger: raw/bookmarks.html updated.

  1. Parse HTML, extract all <a> tags (URL, title, meta description).
  2. Fetch each URL with a headless browser to retrieve the body content (strip ads, nav bars).
  3. Summarize fetched content (≤300 chars), extract 5 keywords.
  4. Generate Markdown cards in wiki/Bookmarks/.
  5. Mark processed entries in the original HTML to prevent re-crawl.
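
Step 1 of the pipeline can be sketched with the standard library's html.parser. This covers only link extraction (URL and title); fetching, summarizing, card generation, and marking processed entries are out of scope here. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class BookmarkParser(HTMLParser):
    """Collect the href and link text of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.bookmarks = []
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...>
        # from browser exports arrives here as "a" / "href".
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.bookmarks.append({"url": self._href, "title": title})
            self._href = None


def parse_bookmarks(html_text):
    """Return a list of {'url', 'title'} dicts for each linked <a> tag."""
    parser = BookmarkParser()
    parser.feed(html_text)
    parser.close()
    return parser.bookmarks
```

Anchors without an href attribute are skipped, since there is nothing to crawl for them.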

4. Lint & Cleanup

  • Trust levels: All newly ingested wiki pages carry a [[待验证]] ("pending verification") tag. Remove it only after user confirmation.
  • Dead link detection: Weekly scan of all URLs in wiki/. On HTTP 404 or timeout, add {{dead_link}} marker at page top.
  • Orphan cleanup: Monthly check for pages not linked by any other page. Move to outputs/orphans/.
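
The dead link and orphan checks above could be approximated as follows. This is a sketch under stated assumptions: pages are passed in as a name-to-text mapping rather than read from wiki/, a HEAD request is used for the link check (some servers reject HEAD, which this sketch does not treat as dead), and all function names are hypothetical. Scheduling (weekly, monthly) and moving files to outputs/orphans/ are left out.

```python
import re
import socket
import urllib.error
import urllib.request

DEAD_MARKER = "{{dead_link}}"
# Matches [[Target]], [[Target|alias]], and [[Target#heading]]; captures Target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def is_dead(url, timeout=10.0):
    """Return True on HTTP 404 or timeout, per the dead link rule."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return False
    except urllib.error.HTTPError as err:
        return err.code == 404
    except urllib.error.URLError as err:
        # Connection-phase timeouts are wrapped in URLError.
        return isinstance(err.reason, socket.timeout)
    except socket.timeout:
        return True


def mark_dead_link(page_text):
    """Add the marker at the top of the page unless it is already there."""
    if page_text.startswith(DEAD_MARKER):
        return page_text
    return f"{DEAD_MARKER}\n{page_text}"


def find_orphans(pages):
    """Return names of pages that no other page links to.

    pages maps page name -> Markdown text; self-links do not count.
    """
    linked = set()
    for name, text in pages.items():
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target != name:
                linked.add(target)
    return sorted(name for name in pages if name not in linked)
```

Since moving orphans out of wiki/ changes the knowledge base, the list returned by find_orphans is better presented as a plan for confirmation than acted on directly.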

5. Interaction Constraints

  • Maintain a professional, precise communication style.
  • Before destructive operations (e.g., bulk deletion), present a plan and wait for confirmation.
  • Prioritize local tools. Minimize unnecessary follow-up questions.