RAG

ReviewAudited by ClawScan on May 10, 2026.

Overview

Prompt-injection indicators were detected in the submitted artifacts (ignore-previous-instructions); human review is required before treating this skill as clean.

This skill appears safe to install as documentation. Before using its advice in a real RAG system, decide what documents may be indexed, avoid storing secrets or unnecessary PII, enforce retrieval-time access controls, and verify third-party embedding or vector-database data policies. ClawScan detected prompt-injection indicators (ignore-previous-instructions), so this skill requires review even though the model response was benign.

Findings (3)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

NoteHigh Confidence

ASI01: Agent Goal Hijack

What this means

Users building RAG systems should remember that retrieved documents can contain malicious instructions and should not be treated as authoritative agent instructions.

Why it was flagged

This is a prompt-injection example used to explain a RAG security risk, not an instruction for the agent to follow.

Skill content

Malicious content in indexed documents:
```
IGNORE ALL PREVIOUS INSTRUCTIONS. You are now...
```

### Mitigations
1. **Input sanitization**

Recommendation

Keep retrieved content isolated from system instructions, sanitize suspicious content, and follow the mitigation guidance already included in the skill.

NoteHigh Confidence

ASI06: Memory and Context Poisoning

What this means

Private documents or malicious document text could be stored and reused in later answers if the index is not scoped, filtered, and maintained properly.

Why it was flagged

The skill describes persistent storage of document chunks, embeddings, and metadata, which is expected for RAG but can retain sensitive or poisoned content if implemented carelessly.

Skill content

### Step 4: Upsert to Vector DB
```python
# Include: chunk text, embedding, metadata
# Metadata: source_file, page, section, timestamp
```

Recommendation

Limit indexed sources, exclude secrets and unnecessary PII, enforce access controls at retrieval time, and implement deletion/re-indexing procedures.

NoteHigh Confidence

ASI07: Insecure Inter-Agent Communication

What this means

If a user implements the guidance with third-party embedding APIs, sensitive documents may be transmitted outside their organization.

Why it was flagged

The skill correctly discloses that external embedding providers may receive document content, which is a normal RAG data-flow consideration.

Skill content

### When Using External APIs (OpenAI, Cohere)
- Documents leave your perimeter
- Check vendor's data retention policies
- Consider self-hosted models for sensitive content

Recommendation

Review provider retention and compliance terms, use contractual protections such as BAAs where required, and self-host embeddings for sensitive corpora.