Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

RAGLite

Local-first RAG cache: distill docs into structured Markdown, then index/query with Chroma + hybrid search (vector + keyword).

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 · 1.3k · 0 current installs · 0 all-time installs
by Viraj Sanghvi (@VirajSanghvi1)
Security Scan
VirusTotal: Suspicious
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The skill's stated purpose (local-first RAG cache using Chroma + ripgrep) matches the files and scripts. However, the runtime deliberately defaults to the external OpenClaw engine unless the user overrides it, which conflicts with a purely 'local-first' expectation. SKILL.md does mention the default, but the install and launcher scripts enforce it silently.
Instruction Scope
Runtime instructions and scripts create a venv and invoke 'raglite' from the installed package. The launcher script silently injects '--engine openclaw' when the user doesn't supply --engine, which can cause documents or queries to be sent to an OpenClaw gateway by default. SKILL.md references Chroma and ripgrep and instructs interacting with network endpoints (Chroma server, OpenClaw gateway) — these are within the tool's domain, but the automatic defaulting to an external engine is behavior users may not expect and could lead to unintended data transmission.
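The defaulting behavior described above can be pictured with a short sketch. This is not the skill's launcher code: the function name `with_default_engine` and the placeholder engine name `my-local-engine` are hypothetical, and only the `--engine` flag and the `openclaw` value come from the scan text. It also assumes the flag is appended at the end of the argument list.

```python
def with_default_engine(args, default="openclaw"):
    """Return a copy of args, adding '--engine <default>' only if no engine was given.

    Hypothetical reconstruction of the launcher's defaulting logic,
    not the actual script shipped with the skill.
    """
    has_engine = any(a == "--engine" or a.startswith("--engine=") for a in args)
    if has_engine:
        return list(args)
    return list(args) + ["--engine", default]


# No engine supplied: the external default is injected.
injected = with_default_engine(["run", "/path/to/docs"])

# Engine supplied explicitly ('my-local-engine' is a placeholder name): nothing is added.
explicit = with_default_engine(["run", "/path/to/docs", "--engine", "my-local-engine"])
```

The point of the sketch is that a caller who omits `--engine` never sees the flag that ends up on the command line, which is why passing it explicitly is the safer habit.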
Install Mechanism
The install script uses pip to install directly from a personal GitHub repo via 'git+https://github.com/VirajSanghvi1/raglite.git@main'. This is a common pattern but higher risk than installing from a pinned release or well-known package index: it pulls code from an upstream main branch (not a fixed tag), so upstream changes could alter behavior after install. No other unusual downloads or obfuscated installers were found.
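One way to reduce this risk is to install from a reviewed commit rather than from `main`. pip accepts a commit hash after the `@` in a `git+https://` requirement. The helper below is a minimal sketch that builds such a requirement string and refuses anything that is not a full commit hash; the function name `pinned_requirement` is hypothetical, and the hash must come from a commit you have actually reviewed.

```python
import re

# Repository URL as quoted in the scan; only the ref after '@' changes.
REPO_URL = "git+https://github.com/VirajSanghvi1/raglite.git"


def pinned_requirement(commit_sha: str) -> str:
    """Build a pip VCS requirement pinned to a single full commit hash."""
    if not re.fullmatch(r"[0-9a-f]{40}", commit_sha):
        # Branch names such as 'main' and abbreviated hashes are rejected on purpose.
        raise ValueError("expected a full 40-character lowercase hex commit SHA")
    return f"{REPO_URL}@{commit_sha}"
```

The resulting string can be passed to `pip install` in place of the `@main` URL. A branch name moves over time, while a full commit hash always refers to the same code.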
Credentials
The skill declares no required env vars, yet SKILL.md references OPENCLAW_GATEWAY_TOKEN (used if the gateway requires auth) and a Chroma URL. Because the launcher defaults to the OpenClaw engine, an external gateway and its token become relevant to normal runs even though they are not declared as required. That mismatch makes credential use/need non-obvious to users and increases the risk of accidental exposure of sensitive documents.
Persistence & Privilege
The skill is not always-enabled, does not request system-wide config paths or credentials, and does not modify other skills. It installs into a skill-local virtualenv, which is a contained install pattern.
Scan Findings in Context
[NO_PRESCAN_ISSUES] expected: Static pre-scan reported no injection signals; this is expected for an instruction-heavy skill with simple shell scripts. Absence of findings is not evidence of safety — dynamic network behavior and upstream package contents still matter.
What to consider before installing
Before installing:

  1. Be aware the skill will, by default, use the OpenClaw engine unless you explicitly pass --engine; that may send data to an external gateway. If you want purely local operation, always pass an explicit local engine and/or verify raglite's defaults.
  2. The installer pulls from a personal GitHub 'main' branch (un-pinned); review the upstream repo or pin a specific tag/commit to avoid unexpected updates.
  3. If you must keep data local, ensure OPENCLAW_GATEWAY_TOKEN is not set and run with a local Chroma instance; install and run in an isolated environment (container or VM) first.
  4. Consider inspecting the installed raglite package source after installation (or vendor it) to confirm there are no unexpected network endpoints.

If you are not comfortable reviewing the upstream repo or exposing data to an external gateway, treat this skill as potentially risky.
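Part of this checklist can be automated before each run. The sketch below checks two of the conditions named above: that OPENCLAW_GATEWAY_TOKEN is not set, and that the Chroma URL points at a loopback address. The function name `local_only_problems` is hypothetical. It does not verify which engine will be used, so an explicit --engine is still needed.

```python
import ipaddress
from urllib.parse import urlparse


def local_only_problems(env, chroma_url):
    """Return a list of reasons the given settings may not be local-only."""
    problems = []
    if env.get("OPENCLAW_GATEWAY_TOKEN"):
        problems.append("OPENCLAW_GATEWAY_TOKEN is set")

    host = urlparse(chroma_url).hostname or ""
    if host != "localhost":
        try:
            is_loopback = ipaddress.ip_address(host).is_loopback
        except ValueError:
            # Not an IP literal (a DNS name or an empty host): treat as non-local.
            is_loopback = False
        if not is_loopback:
            problems.append(f"Chroma host {host!r} is not a loopback address")
    return problems
```

In practice it would be called as `local_only_problems(os.environ, "http://127.0.0.1:8100")`, and the run aborted if the list is non-empty.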

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.0
latest: vk9716fzfz4snwysakwhh1bw9qn80jy0c

Runtime requirements

🔎 Clawdis
OS: macOS · Linux
Bins: python3, pip

SKILL.md

RAGLite — a local RAG cache (not a memory replacement)

RAGLite is a local-first RAG cache.

It does not replace model memory or chat context. It gives your agent a durable place to store and retrieve information the model wasn’t trained on — especially useful for local/private knowledge (school work, personal notes, medical records, internal runbooks).

Why it’s better than “paid RAG” / knowledge bases (for many use cases)

  • Local-first privacy: keep sensitive data on your machine/network.
  • Open-source building blocks: Chroma 🧠 + ripgrep ⚡ — no managed vector DB required.
  • Compression-before-embeddings: distill first → less fluff/duplication → cheaper prompts + more reliable retrieval.
  • Auditable artifacts: the distilled Markdown is human-readable and version-controllable.

If you later outgrow local, you can swap in a hosted DB — but you often don’t need to.

What it does

1) Condense ✍️

Turns docs into structured Markdown outputs (low fluff, more “what matters”).

2) Index 🧠

Embeds the distilled outputs into a Chroma collection (one DB, many collections).

3) Query 🔎

Hybrid retrieval:

  • vector similarity via Chroma
  • keyword matches via ripgrep (rg)
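SKILL.md does not say how the two result lists are combined. As an illustration only, the sketch below merges a vector ranking and a keyword ranking with reciprocal rank fusion, a common technique for hybrid search. It is an assumption swapped in for illustration, not RAGLite's documented method, and the file names are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in;
    documents found by both retrievers therefore rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, then by id so ties are deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["a.md", "b.md", "c.md"]   # hypothetical vector-similarity results
keyword_hits = ["b.md", "d.md"]          # hypothetical ripgrep results
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
# fused == ["b.md", "a.md", "d.md", "c.md"]
```

Here `b.md` ranks first because it appears in both lists, even though it was not the top vector hit.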

Default engine

This skill defaults to OpenClaw 🦞 for condensation unless you pass --engine explicitly.

Prereqs

  • Python 3.11+
  • For indexing/query:
    • Chroma server reachable (default http://127.0.0.1:8100)
  • For hybrid keyword search:
    • rg installed (e.g. brew install ripgrep on macOS, or the ripgrep package from your Linux distribution)
  • For OpenClaw engine:
    • OpenClaw Gateway /v1/responses reachable
    • OPENCLAW_GATEWAY_TOKEN set if your gateway requires auth
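A quick preflight check for the first three prerequisites might look like the sketch below. The function name `check_prereqs` is hypothetical. The Chroma check only confirms that something accepts TCP connections on the given host and port, not that it is actually a Chroma server.

```python
import shutil
import socket
import sys
from urllib.parse import urlparse


def check_prereqs(chroma_url="http://127.0.0.1:8100", timeout=1.0):
    """Return a dict mapping each prerequisite to True/False."""
    results = {
        "python>=3.11": sys.version_info >= (3, 11),
        "rg on PATH": shutil.which("rg") is not None,
    }
    parsed = urlparse(chroma_url)
    try:
        # A plain TCP connect; success means the port is open, nothing more.
        with socket.create_connection((parsed.hostname, parsed.port or 80), timeout=timeout):
            results["chroma port open"] = True
    except OSError:
        results["chroma port open"] = False
    return results
```

Running `check_prereqs()` before the first pipeline run separates a missing tool from a server that is not running.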

Install (skill runtime)

This skill installs RAGLite into a skill-local venv:

./scripts/install.sh

It installs from GitHub:

  • git+https://github.com/VirajSanghvi1/raglite.git@main

Usage

One-command pipeline (recommended)

./scripts/raglite.sh run /path/to/docs \
  --out ./raglite_out \
  --collection my-docs \
  --chroma-url http://127.0.0.1:8100 \
  --skip-existing \
  --skip-indexed \
  --nodes

Query

./scripts/raglite.sh query ./raglite_out \
  --collection my-docs \
  --top-k 5 \
  --keyword-top-k 5 \
  "rollback procedure"

Outputs (what gets written)

In --out you’ll see:

  • *.tool-summary.md
  • *.execution-notes.md
  • optional: *.outline.md
  • optional: */nodes/*.md plus per-doc *.index.md and a root index.md
  • metadata in .raglite/ (cache, run stats, errors)
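To audit what a run produced, the naming patterns above can be turned into a small classifier. This is a sketch based only on the patterns listed; the function name `classify_output` and the example file names are hypothetical, and the real layout may differ in detail.

```python
from pathlib import PurePosixPath


def classify_output(rel_path):
    """Label a path (relative to --out) by the output kind its name suggests."""
    p = PurePosixPath(rel_path)
    if p.parts and p.parts[0] == ".raglite":
        return "metadata"
    if "nodes" in p.parts[:-1] and p.suffix == ".md":
        return "node"
    name = p.name
    if name.endswith(".tool-summary.md"):
        return "tool-summary"
    if name.endswith(".execution-notes.md"):
        return "execution-notes"
    if name.endswith(".outline.md"):
        return "outline"
    if name == "index.md" and len(p.parts) == 1:
        return "root-index"
    if name.endswith(".index.md"):
        return "doc-index"
    return "other"
```

Grouping the files under --out by this label shows whether the optional outline and node outputs were written for each document.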

Troubleshooting

  • Chroma not reachable → check --chroma-url, and that Chroma is running.
  • No keyword results → check that ripgrep is installed (rg --version).
  • OpenClaw engine errors → ensure the gateway is up and, if it requires auth, that OPENCLAW_GATEWAY_TOKEN is set.

Pitch (for ClawHub listing)

RAGLite is a local RAG cache for repeated lookups.

When you (or your agent) keep re-searching for the same non-training data — local notes, school work, medical records, internal docs — RAGLite gives you a private, auditable library:

  1. Distill to structured Markdown (compression-before-embeddings)
  2. Index locally into Chroma
  3. Query with hybrid retrieval (vector + keyword)

It doesn’t replace memory/context — it’s the place to store what you need again.

Files

4 total