LiteRAG

v0.2.2

Local retrieval skill for large documentation corpora using independent SQLite knowledge libraries with keyword plus vector hybrid search. Use when searching...

by Mozi Arasaka (@mozi1924)
Security Scan
Capability signals
Requires OAuth token
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The name and description (local hybrid SQLite retrieval and indexing) match the included scripts and declared runtime needs. The only required binary is python3, and the code implements indexing, search, inspect, status, meta, and benchmark workflows that align with the stated purpose.
Instruction Scope
SKILL.md and the scripts consistently instruct the agent to read workspace config (.literag/knowledge-libs.json), iterate configured source paths, and run local indexing/search scripts. This is expected for a local retrieval tool. Note: the indexer will read files under the configured library paths and will send texts to the configured embedding endpoint (embedding.baseUrl) during indexing/search, which is necessary for embedding-based retrieval.
Install Mechanism
There is no automatic install spec; SKILL.md recommends running pip install -r requirements.txt. requirements.txt contains sqlite-vec (a native-backed package). No remote arbitrary downloads or URL/extract installs are present in the bundle. Installing sqlite-vec may require native build support or a Python+SQLite build that allows loading SQLite extensions—this is an expected but higher-footprint dependency for vector search.
Credentials
The skill declares no required env vars and reads only OPENCLAW_WORKSPACE / WORKSPACE / LITERAG_PYTHON for workspace resolution and preferred Python. However, sensitive credentials (embedding.apiKey) are stored in the LiteRAG config (.literag/knowledge-libs.json) rather than environment variables; the skill will use that apiKey and the embedding.baseUrl to contact an embedding provider. This is proportionate for an indexer, but you should verify the config and endpoint are trusted before indexing.
Persistence & Privilege
The skill is not always-enabled (always: false) and is user-invocable only. It does not request persistent platform privileges. It stores DBs/config under the workspace (.literag/) which is expected for a local indexer.
Assessment
This skill appears to be what it says: a local SQLite-based hybrid search/indexer. Before installing or using it:

  1. Inspect <workspace>/.literag/knowledge-libs.json; it may contain an embedding.baseUrl and apiKey. Ensure the endpoint is trusted (the default is localhost), because document text will be sent there for embeddings.
  2. Review the configured library 'paths' to confirm only the intended files will be indexed (the tool will read and store chunks from those paths).
  3. Install requirements (sqlite-vec) in a controlled environment; it may require native extensions or a Python build with SQLite extension-loading support.
  4. Be aware that the SQLite DBs and metadata live under <workspace>/.literag/ and should be protected if they contain sensitive content.

If you want to avoid any network leakage, set embedding.baseUrl to a trusted local endpoint or disable vector embeddings in config.


Runtime requirements

📚 Clawdis
Bins: python3
Latest: vk976jnmj0czv3bvn290p85p2qh84ecm1
59 downloads · 0 stars · 2 versions
Updated 1 week ago
v0.2.2 · MIT-0

LiteRAG

Use this skill when the target corpus is too large or too noisy for main agent memory.

Install

Packaged dependency install:

python3 -m pip install -r {baseDir}/requirements.txt
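requirements.txt pulls in sqlite-vec, which loads as a native SQLite extension and therefore needs a Python build with extension loading enabled. A preflight sketch, assuming the sqlite-vec package's documented Python bindings (`sqlite_vec.load()` and the `vec_version()` SQL function; verify against your installed version):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Some Python builds ship without SQLite extension loading;
# sqlite-vec cannot run on those builds.
if not hasattr(conn, "enable_load_extension"):
    print("this Python build cannot load SQLite extensions")
else:
    conn.enable_load_extension(True)
    try:
        import sqlite_vec  # installed via `pip install sqlite-vec`
        sqlite_vec.load(conn)
        (version,) = conn.execute("SELECT vec_version()").fetchone()
        print("sqlite-vec loaded:", version)
    except (ImportError, sqlite3.OperationalError) as exc:
        print("sqlite-vec unavailable:", exc)
    finally:
        conn.enable_load_extension(False)
```

Run this with the same interpreter LITERAG_PYTHON would select, since capabilities differ between Python builds.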

Layout

  • Config + databases live under <workspace>/.literag/
  • Main config: <workspace>/.literag/knowledge-libs.json
  • Default workspace resolution order: OPENCLAW_WORKSPACE → WORKSPACE → walk upward from the current path until the OpenClaw workspace sentinel files are found
  • Core scripts live under skills/literag/scripts/
  • Skill bin entrypoint: skills/literag/bin/literag
  • Workspace convenience wrappers live at scripts/literag-query.py, scripts/literag-index.py, scripts/literag-status.py, scripts/literag-meta.py, and scripts/lq
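The schema of knowledge-libs.json is not reproduced in this document. Based solely on the fields referenced elsewhere here (library paths, excludes, embedding.baseUrl, embedding.apiKey), a hypothetical example might look like the following; every field name and value beyond those four is a guess:

```json
{
  "libraries": {
    "blender-manual": {
      "paths": ["/abs/path/to/blender-manual"],
      "excludes": ["**/_build/**"]
    }
  },
  "embedding": {
    "baseUrl": "http://localhost:11434/v1",
    "apiKey": "replace-me"
  }
}
```

Because the apiKey lives in this workspace file rather than the environment, keep .literag/ out of version control and confirm baseUrl points somewhere trusted before indexing.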

Rules

  • Keep personal/work memory in OpenClaw builtin memory
  • Keep large external corpora in LiteRAG, not memory_search
  • Treat each knowledge base as an independent library with its own SQLite
  • Search first, inspect second
  • Prefer grouped document hits over raw chunk spam
  • Prefer source-relative paths when citing files back to the user
  • Use local OpenAI-compatible embeddings by default unless explicitly changed in config
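The document names hybrid/fts/vector retrieval but not the fusion formula. Reciprocal rank fusion is one common way to combine a keyword ranking with a vector ranking; the sketch below is an illustration of the general technique, not necessarily what LiteRAG implements:

```python
def rrf(keyword_hits, vector_hits, k=60):
    """Fuse two ranked lists of document ids via reciprocal rank fusion."""
    scores = {}
    for ranked in (keyword_hits, vector_hits):
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Higher fused score first; documents appearing in both lists rise.
    return sorted(scores, key=scores.get, reverse=True)

print(rrf(["a", "b", "c"], ["b", "d", "a"]))
# → ['b', 'a', 'd', 'c']
```

The constant k damps the influence of any single list, which is why "b" (ranked high in both lists) beats "a" here.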

Read these files when needed

  • Always read <workspace>/.literag/knowledge-libs.json when targeting a library or changing config
  • Read references/usage.md when you need command examples, output schema, or the intended search → inspect workflow
  • Read references/configuration.md when adding libraries, source roots, excludes, chunking overrides, or ranking overrides
  • Read references/agent-prompts.md when another agent / ACP harness needs a ready-made LiteRAG prompt template
  • Read references/optimization-playbook.md when a specific library needs retrieval-quality tuning, ranking cleanup, or indexing-throughput tuning
  • Read scripts under skills/literag/scripts/ only when editing behavior or diagnosing bugs

Slash / user-invocable usage

When invoked as /literag ..., parse the remaining argument string as a subcommand.

Supported forms:

  • /literag search <library> <query>
  • /literag inspect <library> <path> [--start N --end N]
  • /literag index <library> [--limit-files N] [--embedding-batch-size N]
  • /literag index-all [--limit-files N] [--embedding-batch-size N]
  • /literag status <library>
  • /literag meta <library>
  • /literag benchmark <library> --query ...

If the user gives a natural-language request instead of a strict subcommand, translate it to the nearest supported operation instead of being pedantic.
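One minimal way to route the argument string to the supported subcommands; the subcommand names come from the list above, while `parse_invocation` and the fallback-to-search behavior are illustrative assumptions:

```python
import shlex

SUBCOMMANDS = {"search", "inspect", "index", "index-all", "status", "meta", "benchmark"}

def parse_invocation(argstring):
    """Split a /literag argument string into (subcommand, args)."""
    tokens = shlex.split(argstring)
    if not tokens or tokens[0] not in SUBCOMMANDS:
        # Natural-language request: route it to search rather than rejecting it.
        return ("search", tokens)
    return (tokens[0], tokens[1:])

print(parse_invocation("inspect blender manual/mesh.md --start 10 --end 40"))
# → ('inspect', ['blender', 'manual/mesh.md', '--start', '10', '--end', '40'])
```

shlex.split keeps quoted queries intact, so /literag search blender "bmesh operators" arrives as a single query token.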

Supported commands

  • index_library.py — index one library
  • index_all.py — index all configured libraries
  • search_library.py — grouped hybrid/fts/vector retrieval
  • inspect_result.py — expand a hit by file path + chunk range
  • status_library.py — show index health / compatibility / counts
  • meta_library.py — dump raw sqlite meta records
  • benchmark_library.py — benchmark hybrid/fts/vector latency + hit shape across fixed query sets
  • bin/literag — packaged CLI entrypoint for search / inspect / index / status / meta / benchmark
  • scripts/literag-query.py — query/search/inspect wrapper
  • scripts/literag-index.py — index wrapper for one library or all libraries
  • scripts/literag-status.py — status wrapper
  • scripts/literag-meta.py — meta wrapper
  • scripts/literag-benchmark.py — benchmark wrapper
  • scripts/lq — tiny shell alias for literag-query.py

Operating workflow

  1. Read <workspace>/.literag/knowledge-libs.json
  2. Resolve the target library
  3. Run search_library.py for grouped retrieval
  4. If needed, run inspect_result.py on the top hit or chosen range
  5. For quick operator use, prefer scripts/literag-query.py or scripts/lq
  6. Use scripts/literag-index.py when you need a short indexing entrypoint
  7. Use scripts/literag-status.py before debugging weird retrieval or after config changes
  8. Use scripts/literag-meta.py when you need the raw stored metadata
  9. Use scripts/literag-benchmark.py or skills/literag/scripts/benchmark_library.py when you need repeatable retrieval latency / hit-shape comparisons
  10. Keep LiteRAG separate from builtin memory unless the user explicitly wants a durable summary copied into workspace memory
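Step 1 depends on resolving the workspace root. The resolution order documented under Layout (env vars first, then an upward walk) can be sketched as follows; the actual sentinel file names are not documented here, so `.literag` is a stand-in assumption:

```python
import os
from pathlib import Path

def resolve_workspace(start=".", sentinels=(".literag",)):
    """Resolve the workspace: env vars first, then walk upward for a sentinel."""
    for var in ("OPENCLAW_WORKSPACE", "WORKSPACE"):
        value = os.environ.get(var)
        if value:
            return Path(value)
    path = Path(start).resolve()
    for candidate in (path, *path.parents):
        if any((candidate / name).exists() for name in sentinels):
            return candidate
    return None  # no workspace found; callers should fail loudly
```

Returning None instead of defaulting to the current directory avoids silently creating a stray .literag/ tree outside the intended workspace.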

Current intent

Use LiteRAG for:

  • Blender manual + Blender Python reference
  • Future blog/article/site knowledge bases
  • Any large external docs where hybrid retrieval is needed without polluting builtin memory
