Install
openclaw skills install openclaw-rag-skill
Complete RAG (Retrieval-Augmented Generation) system for OpenClaw. Indexes chat sessions, workspace code, documentation, and skills into a local ChromaDB for semantic search. Enables finding past solutions, code patterns, and decisions instantly. Uses local embeddings (all-MiniLM-L6-v2) with no API keys required. Automatically ingests and updates the knowledge base from ~/.openclaw/agents/main/sessions and workspace files.
Retrieval-Augmented Generation for OpenClaw – Search chat history, code, docs, and skills with semantic understanding
This skill provides a complete RAG (Retrieval-Augmented Generation) system for OpenClaw. It indexes your entire knowledge base – chat transcripts, workspace code, skill documentation – and enables semantic search across everything.
Key features:
# Navigate to your OpenClaw workspace
cd ~/.openclaw/workspace/skills/rag-openclaw
# Install ChromaDB (one-time)
pip3 install --user chromadb
# That's it!
# Index all chat history
python3 ingest_sessions.py
# Index workspace code and docs
python3 ingest_docs.py workspace
# Index skill documentation
python3 ingest_docs.py skills
# Interactive search mode
python3 rag_query.py -i
# Quick search
python3 rag_query.py "how to send SMS via voip.ms"
# Search by type
python3 rag_query.py "porkbun DNS" --type skill
python3 rag_query.py "chromedriver" --type workspace
python3 rag_query.py "Reddit automation" --type session
# See what's indexed
python3 rag_manage.py stats
Hit a problem? Search for how you solved it before:
python3 rag_query.py "cloudflare bypass selenium"
python3 rag_query.py "voip.ms SMS configuration"
python3 rag_query.py "porkbun update DNS record"
Find specific code or documentation:
python3 rag_query.py --type workspace "unifi gateway API"
python3 rag_query.py --type workspace "SMS client"
Access skill documentation without digging through files:
python3 rag_query.py --type skill "how to monitor UniFi"
python3 rag_query.py --type skill "Porkbun tool usage"
From within Python scripts or OpenClaw sessions:
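import os
import sys
# Resolve the skill directory from the current user's home instead of a hard-coded path
sys.path.insert(0, os.path.expanduser('~/.openclaw/workspace/skills/rag-openclaw'))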
from rag_query_wrapper import search_knowledge, format_for_ai
# Search and get structured results
results = search_knowledge("Reddit account automation")
print(f"Found {results['count']} relevant items")
# Format for AI consumption
context = format_for_ai(results)
print(context)
| File | Purpose |
|---|---|
| rag_system.py | Core RAG class (ChromaDB wrapper) |
| ingest_sessions.py | Index chat history |
| ingest_docs.py | Index workspace files & skills |
| rag_query.py | Search interface (CLI & interactive) |
| rag_manage.py | Document management (stats, delete, reset) |
| rag_query_wrapper.py | Simple Python API for programmatic use |
| README.md | Full documentation |
Sessions:
~/.openclaw/agents/main/sessions/*.jsonl
Workspace:
.py, .js, .ts, .md, .json, .yaml, .sh, .html, .css
Skills:
SKILL.md files
ChromaDB uses all-MiniLM-L6-v2 embeddings to convert text to vectors. Similar meanings cluster together, enabling semantic search by meaning, not just keywords.
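To make "similar meanings cluster together" concrete, here is a minimal sketch of ranking documents by vector similarity. The three-dimensional vectors are hypothetical stand-ins (real all-MiniLM-L6-v2 embeddings have 384 dimensions), and cosine similarity is just one common measure; the distance metric actually configured in the skill's ChromaDB collection may differ.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query_vec, documents):
    """Return document names ordered from most to least similar to the query."""
    return sorted(
        documents,
        key=lambda name: cosine_similarity(query_vec, documents[name]),
        reverse=True,
    )


# Hypothetical toy "embeddings", invented for illustration only.
docs = {
    "send a text message": [0.9, 0.1, 0.0],
    "update a DNS record": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # stands in for the embedding of "how to send SMS"

best = rank(query, docs)[0]
print(best)
```

The query shares no keywords with the best match; it ranks first only because its vector points in nearly the same direction.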
When the AI responds, it automatically:
This happens transparently – the AI "remembers" your past work.
python3 rag_manage.py stats
Output:
📊 OpenClaw RAG Statistics
Collection: openclaw_knowledge
Total Documents: 635
By Source:
session-001: 23
my-script.py: 5
porkbun: 12
By Type:
session: 500
workspace: 100
skill: 35
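The "By Source" and "By Type" breakdowns above are simple tallies over per-document metadata. A rough sketch of how such a summary could be computed follows; the metadata field names `type` and `source` and the `summarize` helper are assumptions for illustration, not the actual internals of rag_manage.py.

```python
from collections import Counter

# Hypothetical metadata records; the "type" and "source" keys are assumed.
metadatas = [
    {"type": "session", "source": "session-001"},
    {"type": "session", "source": "session-001"},
    {"type": "workspace", "source": "my-script.py"},
    {"type": "skill", "source": "porkbun"},
]


def summarize(records):
    """Tally documents overall, per type, and per source."""
    by_type = Counter(record["type"] for record in records)
    by_source = Counter(record["source"] for record in records)
    return {
        "total": len(records),
        "by_type": dict(by_type),
        "by_source": dict(by_source),
    }


summary = summarize(metadatas)
print(summary)
```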
# Delete all sessions
python3 rag_manage.py delete --by-type session
# Delete specific file
python3 rag_manage.py delete --by-source "scripts/voipms_sms_client.py"
# Reset entire collection
python3 rag_manage.py reset
python3 rag_manage.py add \
--text "API endpoint: https://api.example.com/endpoint" \
--source "api-docs:example.com" \
--type "manual"
python3 ingest_sessions.py --sessions-dir /path/to/sessions
python3 ingest_sessions.py --chunk-size 30 --chunk-overlap 10
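For intuition about `--chunk-size` and `--chunk-overlap`, here is a minimal sketch of overlapping chunking. It assumes the values count items such as messages; the source does not say whether ingest_sessions.py counts messages, characters, or tokens, and `chunk_with_overlap` is a hypothetical helper, not the script's own function.

```python
def chunk_with_overlap(items, chunk_size, chunk_overlap):
    """Split items into windows of chunk_size that share chunk_overlap items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(items), step):
        chunks.append(items[start:start + chunk_size])
        if start + chunk_size >= len(items):
            break  # the last window already reaches the end
    return chunks


# 70 hypothetical messages with the values from the command above.
messages = list(range(70))
chunks = chunk_with_overlap(messages, chunk_size=30, chunk_overlap=10)
print(len(chunks))
```

Overlap means context that straddles a chunk boundary still appears whole in at least one chunk, at the cost of indexing some items twice.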
from rag_system import RAGSystem
rag = RAGSystem(collection_name="my_knowledge")
| Type | Source Format | Description |
|---|---|---|
| session | session:{key} | Chat history transcripts |
| workspace | relative/path/to/file | Code, configs, docs |
| skill | skill:{name} | Skill documentation |
| memory | MEMORY.md | Long-term memory entries |
| manual | {custom} | Manually added docs |
| api | api-docs:{name} | API documentation |
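The source formats in the table can be read as simple string templates. The sketch below builds a source identifier for a given type; `SOURCE_FORMATS` and `make_source` are hypothetical names for illustration, and types without a prefix in the table (workspace, memory, manual) are passed through unchanged.

```python
# Prefix templates taken from the table above; the helper itself is illustrative.
SOURCE_FORMATS = {
    "session": "session:{name}",
    "skill": "skill:{name}",
    "api": "api-docs:{name}",
}


def make_source(doc_type, name):
    """Return the source identifier for a document of the given type."""
    template = SOURCE_FORMATS.get(doc_type, "{name}")  # no prefix for other types
    return template.format(name=name)


print(make_source("skill", "porkbun"))
print(make_source("workspace", "scripts/voipms_sms_client.py"))
```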
all-MiniLM-L6-v2 (79MB, cached locally)
# Check what's indexed
python3 rag_manage.py stats
# Try broader query
python3 rag_query.py "SMS" # instead of "voip.ms SMS API endpoint"
First search loads embeddings (~1-2 seconds). Subsequent searches are instant.
# Reset and re-index
python3 rag_manage.py reset
python3 ingest_sessions.py
python3 ingest_docs.py workspace
The first run downloads the embedding model (79MB), which takes 1-2 minutes. Let it complete.
After significant work:
python3 ingest_sessions.py # New conversations
python3 ingest_docs.py workspace # New code/changes
# Better
python3 rag_query.py "voip.ms getSMS method"
# Too broad
python3 rag_query.py "SMS"
# Looking for code
python3 rag_query.py --type workspace "chromedriver"
# Looking for past conversations
python3 rag_query.py --type session "Reddit"
After important decisions, add them manually:
python3 rag_manage.py add \
--text "Decision: Use Playwright for Reddit automation. Reason: handles Cloudflare challenges better" \
--source "decision:reddit-automation" \
--type "decision"
This skill integrates seamlessly with OpenClaw:
⚠️ Important Privacy Note: This RAG system indexes local data, which may contain:
Recommended:
Run rag_manage.py reset to delete the entire index when needed.
~/.openclaw/data/rag/ can be deleted to remove all indexed data.
Path Portability:
All scripts now use dynamic path resolution (os.path.expanduser(), Path(__file__).parent) for portability across different user environments. No hard-coded absolute paths remain in the codebase.
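A minimal sketch of the two resolution techniques named above. The function names `sessions_dir` and `script_dir` are hypothetical; only `os.path.expanduser()`, `Path(...).parent`, and the sessions path come from this document.

```python
import os
from pathlib import Path


def sessions_dir():
    """Expand ~ to the current user's home, so no username is baked in."""
    return Path(os.path.expanduser("~/.openclaw/agents/main/sessions"))


def script_dir(script_file):
    """Return the directory containing a script; pass __file__ from that script."""
    return Path(script_file).resolve().parent


print(sessions_dir())
print(script_dir("rag_query.py"))
```

Inside a real script, `script_dir(__file__)` locates sibling modules regardless of where the workspace lives or which directory the command was started from.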
Network Calls:
Scenario: You're working on a new automation but hit a Cloudflare challenge.
# Search for past Cloudflare solutions
python3 rag_query.py "Cloudflare bypass selenium"
# Result shows relevant past conversation:
# "Used undetected-chromedriver but failed. Switched to Playwright which handles challenges better."
# Now you know the solution before trying it!
Post RAG skill announcements and updates to the Moltbook social network.
# Post from draft file
python3 scripts/moltbook_post.py --file drafts/moltbook-post-rag-release.md
# Post directly
python3 scripts/moltbook_post.py "Title" "Content"
Post release announcement:
cd ~/.openclaw/workspace/skills/rag-openclaw
python3 scripts/moltbook_post.py --file drafts/moltbook-post-rag-release.md --submolt general
Post quick update:
python3 scripts/moltbook_post.py "RAG Update" "Fixed path portability issues"
Post to submolt:
python3 scripts/moltbook_post.py "Feature Drop" "New semantic search" "aiskills"
To use Moltbook posting (optional feature):
Set environment variable:
export MOLTBOOK_API_KEY="your-key"
Or create credentials file:
mkdir -p ~/.config/moltbook
cat > ~/.config/moltbook/credentials.json << EOF
{
"api_key": "moltbook_sk_YOUR_KEY_HERE"
}
EOF
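The two configuration options could be read by a script roughly as follows. This is a sketch under assumptions: `load_api_key` is a hypothetical helper, and the precedence (environment variable first, then the credentials file) is a guess rather than the documented behavior of scripts/moltbook_post.py.

```python
import json
import os
from pathlib import Path


def load_api_key(credentials_path=None):
    """Return the Moltbook API key, or None if neither source provides one."""
    # Assumed precedence: the environment variable wins over the file.
    key = os.environ.get("MOLTBOOK_API_KEY")
    if key:
        return key

    path = Path(
        credentials_path
        or os.path.expanduser("~/.config/moltbook/credentials.json")
    )
    if path.is_file():
        with path.open(encoding="utf-8") as handle:
            return json.load(handle).get("api_key")
    return None


if __name__ == "__main__":
    print("key configured" if load_api_key() else "no key configured")
```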
Note: Moltbook posting is optional and only used for publishing RAG announcements. The core RAG functionality needs no API keys or external services and works offline once the embedding model has been downloaded.
If rate-limited, wait for retry_after_minutes shown in error.
See scripts/MOLTBOOK_POST.md for full documentation and API reference.
https://openclaw-rag-skill.projects.theta42.com
Published: clawhub.com
Maintainer: Nova AI Assistant
For: William Mantly (Theta42)
MIT License - Free to use and modify