Rag Retriever

Security checks across static analysis, malware telemetry, and agentic risk

Overview

This looks like a normal RAG document-retrieval skill, but it can store indexed document data locally and can send text to OpenAI if the OpenAI embedding option is used.

Before installing, decide whether you want documents indexed locally and whether any content should be sent to OpenAI for embeddings. For sensitive or private files, prefer the local/simple embedding mode and avoid setting OPENAI_API_KEY unless needed. The provided behavior appears purpose-aligned, but only add documents you are comfortable having stored and reused as RAG context.

Static analysis

Env credential access

Severity: Critical
Finding: Environment variable access combined with network send.

VirusTotal

VirusTotal findings are pending for this skill version.


Risk analysis

This is an informational, artifact-based review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or perform runtime probes.

What this means

If OpenAI embeddings are enabled, the skill may use your OpenAI account and incur API usage under that key.

Why it was flagged

The skill can read an OpenAI API key from the environment. That is expected for an OpenAI embedding provider, but the registry metadata neither requires nor declares any environment variables.

Skill content

this.apiKey = options.apiKey || process.env.OPENAI_API_KEY;

Recommendation

Only set OPENAI_API_KEY if you intend to use OpenAI embeddings; otherwise use the local/simple embedding mode.
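
One way to make that choice explicit is to require an opt-in in addition to the key. The sketch below is illustrative only: `chooseEmbeddingProvider` and the `provider` option are hypothetical names, not part of the skill. Only the `options.apiKey || process.env.OPENAI_API_KEY` fallback comes from the scanned code.

```javascript
// Hypothetical helper; the skill's real option names may differ.
// `env` is injectable so the behavior can be checked without touching process.env.
function chooseEmbeddingProvider(options = {}, env = process.env) {
  const wantsOpenAI = options.provider === 'openai';
  // Same fallback order as the scanned snippet: explicit option first, then env.
  const apiKey = options.apiKey || env.OPENAI_API_KEY;

  if (wantsOpenAI && !apiKey) {
    throw new Error('OpenAI embeddings requested but no API key was provided');
  }

  // A key that merely exists in the environment does not opt the user in.
  return wantsOpenAI ? { provider: 'openai', apiKey } : { provider: 'local' };
}
```

With this shape, a key left in the shell environment for another tool does not silently switch embeddings to OpenAI.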

What this means

Any document or query text embedded through OpenAI leaves your machine and is processed by that provider.

Why it was flagged

When the OpenAI provider is selected, the text being embedded is sent to OpenAI's embeddings API. This is purpose-aligned but affects privacy for indexed documents or queries.

Skill content

fetch('https://api.openai.com/v1/embeddings', { ... body: JSON.stringify({ input: text, model: this.model, dimensions: this.dimensions }) })

Recommendation

Do not use the OpenAI embedding provider for sensitive documents unless you are comfortable with OpenAI processing that text; choose local embeddings for private data.
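
A guard in front of the embedding call can enforce this. The following is a minimal sketch under assumptions: the `sensitive` metadata flag and the `assertRemoteEmbeddingAllowed` helper are hypothetical, and the skill itself may expose nothing equivalent.

```javascript
// Hypothetical guard: refuse to embed flagged documents with a remote provider.
// `metadata.sensitive` is an assumed flag set by whoever indexes the document.
const REMOTE_PROVIDERS = new Set(['openai']);

function assertRemoteEmbeddingAllowed(doc, provider) {
  const isSensitive = Boolean(doc.metadata && doc.metadata.sensitive);
  if (isSensitive && REMOTE_PROVIDERS.has(provider)) {
    throw new Error(`Refusing to send a sensitive document to "${provider}"`);
  }
  return true;
}
```

Throwing, rather than silently falling back to local embeddings, is deliberate: vectors produced by different embedding models are not comparable, so mixing them in one index would degrade retrieval.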

What this means

Private or untrusted documents added to the index can be retrieved later and included in the assistant's context.

Why it was flagged

The skill stores indexed document content for later retrieval and formats retrieved chunks as RAG context. This is the intended function, but persistent retrieved context can carry stale, sensitive, or untrusted document content into later model prompts.

Skill content

dbPath: './data/lancedb'
...
await rag.addDocument(text, { source: 'github' ... });
...
const ragResult = await rag.retrieveForRAG('什么是 MCP', { limit: 3 });

Recommendation

Index only documents you want the agent to reuse, separate trusted and untrusted collections, and clear or rebuild the local index when documents should no longer be used.
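
Separation can also be applied at retrieval time by checking each chunk's source before it is formatted as context. The sketch below assumes retrieved chunks look like `{ text, metadata: { source } }`, which is a guess based on the `source: 'github'` metadata in the skill content quoted above. `splitByTrust` is a hypothetical helper, not part of the skill.

```javascript
// Assumed chunk shape: { text: string, metadata: { source: string } }.
// `trustedSources` is a Set of source labels the operator has vetted.
function splitByTrust(chunks, trustedSources) {
  const trusted = [];
  const untrusted = [];
  for (const chunk of chunks) {
    const source = chunk.metadata ? chunk.metadata.source : undefined;
    // Chunks with a missing or unknown source are treated as untrusted.
    if (trustedSources.has(source)) {
      trusted.push(chunk);
    } else {
      untrusted.push(chunk);
    }
  }
  return { trusted, untrusted };
}

// Usage with made-up chunks:
const exampleChunks = [
  { text: 'Vetted project notes', metadata: { source: 'github' } },
  { text: 'Pasted from an unknown page', metadata: { source: 'web-upload' } },
];
const exampleSplit = splitByTrust(exampleChunks, new Set(['github']));
console.log(exampleSplit.trusted.length, exampleSplit.untrusted.length);
```

Only the trusted list would then be passed on as RAG context; the untrusted list can be dropped or shown to the user with a clear label.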

What this means

Installing the skill may download and install npm dependencies on your machine.

Why it was flagged

The skill relies on a manual npm install step, and package.json declares dependencies for transformers, LanceDB, jieba, and Arrow. This is normal for the stated RAG functionality, but users should be aware that third-party packages will be installed.

Skill content

cd skills/rag-retriever
npm install

Recommendation

Install only from a trusted copy of the skill, review package.json and package-lock.json if needed, and follow normal npm supply-chain precautions.
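
As part of that review, it can help to list dependencies that do not resolve from the npm registry, since git, URL, and local-path specifiers pull code from outside the registry. The following is a heuristic sketch: `findNonRegistryDependencies` is a hypothetical helper, and the package names in the usage example are placeholders, not the skill's real dependencies.

```javascript
// Heuristic: matches git/URL/file specifiers and GitHub "user/repo" shorthand.
// It is not a complete parser for npm dependency specifiers.
const NON_REGISTRY_SPEC = /^(git\+|git:|https?:|file:|github:)|^[\w.-]+\/[\w.-]+(#.*)?$/i;

function findNonRegistryDependencies(packageJsonText) {
  const pkg = JSON.parse(packageJsonText);
  const sections = ['dependencies', 'devDependencies', 'optionalDependencies'];
  const flagged = [];
  for (const section of sections) {
    for (const [name, spec] of Object.entries(pkg[section] || {})) {
      if (NON_REGISTRY_SPEC.test(spec)) {
        flagged.push({ section, name, spec });
      }
    }
  }
  return flagged;
}

// Usage with placeholder package names:
const samplePackageJson = JSON.stringify({
  dependencies: {
    'example-registry-pkg': '^1.2.3',
    'example-git-pkg': 'git+https://example.com/example/repo.git',
  },
});
console.log(findNonRegistryDependencies(samplePackageJson));
```

An empty result does not mean the dependency tree is safe; it only narrows down which entries deserve a closer look first.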