Rag Pipelines

Workflows

Deep RAG workflow—document ingestion, chunking, metadata, retrieval and reranking, grounding and citations, evaluation, and failure modes (hallucination, staleness). Use when building or debugging retrieval-augmented generation systems.

Install

openclaw skills install rag-pipelines

RAG Pipelines (Deep Workflow)

RAG quality is dominated by chunking, retrieval, and evaluation—not the LLM alone. Treat the system as data engineering plus generation with explicit failure modes.

When to Offer This Workflow

Trigger conditions:

Building Q&A over internal docs, support assistants, or copilots
Hallucinations, wrong citations, or stale answers
New content types (PDF, HTML, code repositories)

Initial offer:

Use six stages: (1) task & success criteria, (2) ingestion & cleaning, (3) chunking & metadata, (4) retrieval & rerank, (5) generation & grounding, (6) evaluation & monitoring). Confirm embedding model and retrieval stack (vector DB, search engine, hybrid).

Stage 1: Task & Success Criteria

Goal: Define what a “good” answer contains: required citations, length, tone, and when to refuse.

Exit condition: Written rubric with examples of acceptable vs unacceptable answers.

Stage 2: Ingestion & Cleaning

Goal: Deterministic text extraction (strip boilerplate, handle PDF/OCR if needed); deduplicate documents; track source URL and updated_at for staleness.

Practices

Version pipelines when parsers change (re-embed job)

Stage 3: Chunking & Metadata

Goal: Tune chunk size and overlap to query patterns—not one global token count for all content.

Practices

Attach metadata for ACL filtering (tenant, product area)
Prefer structure-aware splits for docs (headings, sections)

Stage 4: Retrieval & Rerank

Goal: Hybrid lexical + dense retrieval often beats vector-only for keyword-heavy queries.

Practices

Cross-encoder reranking on top-k for quality (watch latency)
Query rewriting for multi-turn contexts

Stage 5: Generation & Grounding

Goal: System prompts that require using only provided context; explicit “not found” behavior; optional citation format (snippet, doc id, link).

Stage 6: Evaluation & Monitoring

Goal: Offline golden questions with expected supporting docs; online thumbs-down reasons; monitor retrieval hit rate, nDCG@k, and age of sources used.

Final Review Checklist

Rubric and refusal behavior defined
Ingestion deterministic; dedupe and versioning
Chunking and metadata match queries and ACLs
Hybrid retrieval and rerank tuned with metrics
Grounding and citation behavior enforced in prompts
Offline eval plus production monitoring

Tips for Effective Guidance

Debug retrieval before blaming the LLM.
Long chunks hurt precision; short chunks hurt context—sweep experiments.
See also vector-databases and llm-evaluation skills for depth.

Handling Deviations

Code RAG: symbol- or AST-aware chunking often beats line-based splits.
High-stakes domains: add human review gates and audit logs for sources cited.

Rag Pipelines

Install

RAG Pipelines (Deep Workflow)

When to Offer This Workflow

Stage 1: Task & Success Criteria

Stage 2: Ingestion & Cleaning

Practices

Stage 3: Chunking & Metadata

Practices

Stage 4: Retrieval & Rerank

Practices

Stage 5: Generation & Grounding

Stage 6: Evaluation & Monitoring

Final Review Checklist

Tips for Effective Guidance

Handling Deviations

Related skills