rag-eval

Security checks across malware telemetry and agentic risk

Overview

This RAG evaluation skill does what it claims, but it can send and save sensitive RAG content without strong disclosure or controls, so it should be reviewed before use.

Install only if you are comfortable with evaluation questions, answers, and retrieved context being processed by the configured judge provider and saved locally. For confidential data, prefer a local RAGAS_LLM/Ollama setup, use --no-save for one-off evaluations, avoid batch mode unless you accept its raw per-sample logs, run setup in a virtual environment, and regularly review or delete memory/eval-results.
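One way to act on the "regularly review or delete memory/eval-results" advice is a small retention sweep. The following is a minimal sketch and not part of the skill: the function name `prune_eval_results` and the age threshold are hypothetical, and it assumes saved results are ordinary files under that directory.

```python
import time
from pathlib import Path


def prune_eval_results(results_dir: Path, max_age_days: float) -> list[Path]:
    """Delete regular files under results_dir whose mtime is older than max_age_days.

    Hypothetical helper: the skill itself is not known to ship anything like this.
    Returns the list of paths that were removed.
    """
    cutoff = time.time() - max_age_days * 86400
    removed: list[Path] = []
    if not results_dir.is_dir():
        return removed
    # sorted() materializes the listing before any file is deleted.
    for path in sorted(results_dir.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed


if __name__ == "__main__":
    # Assumed location, taken from the directory name mentioned in the report.
    for gone in prune_eval_results(Path("memory/eval-results"), max_age_days=30):
        print(f"removed {gone}")
```

Running this on a schedule bounds how long raw evaluation content stays on disk, though it does not replace using --no-save when nothing should be stored at all.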

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (7)

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The skill instructs saving evaluation inputs/results into persistent memory files even though its stated purpose is transient quality evaluation. Those records can contain sensitive user prompts, model answers, and retrieved context chunks, creating unnecessary data retention and increasing exposure if memory files are later accessed, indexed, or exfiltrated. The RAG context may include proprietary documents, so persistence makes this more dangerous than a generic metrics tool.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The script persists evaluation results under memory/eval-results even though its stated purpose is only to score RAG quality. In this context, questions, answers, and retrieved contexts may contain sensitive user prompts, proprietary documents, or internal knowledge-base excerpts, so creating local retention by default expands data exposure beyond what is necessary for evaluation.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The code writes the full input payload (question, answer, and all retrieved contexts) to disk, which can capture confidential prompts, model outputs, and source text verbatim. For a RAG evaluation utility, storing raw content is unnecessarily invasive and increases the blast radius if the host is shared, compromised, or backed up externally, or if logs are later inspected by unauthorized users.
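A possible mitigation is to persist scores alongside digests of the text rather than the text itself. The sketch below is illustrative only: `redact_record`, the field names `question`, `answer`, and `contexts`, and the `scores` key are assumptions about the record shape, not the skill's actual schema.

```python
import hashlib

# Assumed names of the fields that carry raw text; adjust to the real schema.
SENSITIVE_FIELDS = ("question", "answer", "contexts")


def _digest(text: str) -> dict:
    """Replace raw text with a SHA-256 digest and its character count."""
    return {
        "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "chars": len(text),
    }


def redact_record(record: dict) -> dict:
    """Return a copy of record with sensitive text fields replaced by digests.

    Non-sensitive fields (for example metric scores) are kept unchanged.
    """
    redacted = {}
    for key, value in record.items():
        if key not in SENSITIVE_FIELDS:
            redacted[key] = value
        elif isinstance(value, list):
            redacted[key] = [_digest(str(item)) for item in value]
        else:
            redacted[key] = _digest(str(value))
    return redacted


if __name__ == "__main__":
    sample = {
        "question": "What is our refund policy?",
        "answer": "Refunds are issued within 30 days.",
        "contexts": ["Policy doc excerpt A", "Policy doc excerpt B"],
        "scores": {"faithfulness": 0.9},  # illustrative value
    }
    print(redact_record(sample))
```

Digests still let two runs over the same sample be correlated without retaining the confidential text, at the cost of losing the ability to re-read the original inputs from the log.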

Vague Triggers

Medium

Confidence: 90%
Finding: The trigger phrases are broad enough that normal user requests like 'check hallucination' or 'quality check' could invoke the skill unintentionally. In an agent setting, this can cause surprise execution, accidental processing of sensitive inputs, and downstream calls to subprocesses or external evaluators without clear user intent.

Missing User Warnings

Medium

Confidence: 95%
Finding: The PRD specifies saving evaluation results to disk, including question, answer, and retrieved context, but provides no warning, consent flow, retention policy, or redaction guidance. Because RAG inputs often contain proprietary or personal data, silent persistence can create privacy, compliance, and secret-retention risks if logs are later accessed or exfiltrated.

Missing User Warnings

Medium

Confidence: 97%
Finding: The skill depends on OpenAI or Anthropic-backed evaluation, which means submitted questions, answers, and contexts may be transmitted to third-party LLM providers. Without explicit warning and consent, users may unknowingly send confidential RAG data outside their trust boundary, creating significant privacy, contractual, and regulatory exposure.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill notes that an LLM judge is required, but it does not clearly warn users that questions, answers, and retrieved contexts may be transmitted to third-party providers such as OpenAI or Anthropic. In a RAG setting, contexts often contain internal or sensitive source material, so sending them off-box without explicit warning can cause confidentiality and compliance issues. This is a real security/privacy weakness caused by incomplete disclosure.
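The disclosure gap described in these findings could be narrowed with an explicit consent check before any call to a hosted judge. The following is a hedged sketch, not the skill's behavior: `confirm_remote_judge` and the `consent_given` flag are hypothetical, and the provider names are simply the ones the findings mention.

```python
# Hosted providers named in the findings; a local judge such as Ollama is not listed.
REMOTE_PROVIDERS = {"openai", "anthropic"}


def confirm_remote_judge(provider: str, consent_given: bool) -> None:
    """Raise PermissionError if a hosted judge would be used without explicit consent.

    Hypothetical guard: intended to run before any question, answer, or
    retrieved context is sent to the judge.
    """
    if provider.lower() in REMOTE_PROVIDERS and not consent_given:
        raise PermissionError(
            f"Judge provider '{provider}' is a third-party service; questions, "
            "answers, and retrieved contexts would leave this machine. "
            "Re-run with explicit consent or switch to a local judge."
        )


if __name__ == "__main__":
    confirm_remote_judge("ollama", consent_given=False)  # local judge: passes
    try:
        confirm_remote_judge("openai", consent_given=False)
    except PermissionError as err:
        print(err)
```

Failing closed keeps the default safe: a user who has not opted in gets a clear message instead of a silent upload.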

VirusTotal

66/66 vendors flagged this skill as clean.
