Epistemic Council

Security checks across malware telemetry and agentic risk

Overview

This appears to be a legitimate local reasoning skill, but it can run scripts and write persistent files from broad trigger phrases with limited user control.

Install only if you intend this skill to run local Python pipeline scripts, write persistent reasoning/audit files, and send prompts to a local Ollama-compatible model service. Use it in a dedicated workspace, avoid feeding it secrets or private documents unless the local model and filesystem are trusted, and prefer explicit invocations over relying on generic trigger matching.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (18)

Intent-Code Divergence

High

Confidence: 93%
Finding: The class-level contract explicitly claims there are no UPDATE or DELETE paths, but prune_visibility() later performs an UPDATE on existing rows. This breaks the stated append-only integrity model and can mislead downstream consumers, auditors, or security controls into trusting immutability guarantees that are not actually enforced.

Intent-Code Divergence

Medium

Confidence: 89%
Finding: The method comment says pruning only affects visibility and never deletes, but the implementation mutates stored state with an UPDATE before appending an audit event. In a provenance/audit substrate, silent mutation of prior records undermines forensic reliability and can allow historical records to be hidden or altered from their original state without a purely append-only trail.
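
One way to reconcile pruning with an append-only contract is to record visibility changes as new rows rather than updating existing ones. The sketch below is illustrative only and is not the skill's code: the table names, the `open_store`, `append_event`, `prune_visibility_append_only`, and `visible_events` functions, and the schema are all assumptions.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open a hypothetical store with an events table and a visibility log."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS events (
            id INTEGER PRIMARY KEY,
            payload TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS visibility_events (
            id INTEGER PRIMARY KEY,
            event_id INTEGER NOT NULL REFERENCES events(id),
            visible INTEGER NOT NULL,
            reason TEXT NOT NULL
        );
    """)
    return conn

def append_event(conn, payload):
    """Insert a new event row and return its id."""
    cur = conn.execute("INSERT INTO events (payload) VALUES (?)", (payload,))
    conn.commit()
    return cur.lastrowid

def prune_visibility_append_only(conn, event_id, reason):
    """Hide an event by appending a visibility record; no UPDATE or DELETE."""
    conn.execute(
        "INSERT INTO visibility_events (event_id, visible, reason) VALUES (?, 0, ?)",
        (event_id, reason),
    )
    conn.commit()

def visible_events(conn):
    """Return events whose latest visibility record (default: visible) is 1."""
    return conn.execute("""
        SELECT e.id, e.payload FROM events e
        WHERE COALESCE(
            (SELECT v.visible FROM visibility_events v
             WHERE v.event_id = e.id ORDER BY v.id DESC LIMIT 1),
            1) = 1
        ORDER BY e.id
    """).fetchall()
```

With this shape, the original event rows are never modified, and every change in visibility leaves its own record with a reason attached.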

Vague Triggers

Medium

Confidence: 95%
Finding: The trigger list includes generic phrases such as "health check," "find gaps," and "check boundaries," which can plausibly appear in normal user conversation outside the intended security workflow. If the agent uses trigger matching to auto-select skills, these broad phrases could cause unintended execution of this skill and invoke the `exec` tool against a local workspace, expanding the blast radius from a simple routing mistake into code execution in a sensitive directory.

Missing User Warnings

Medium

Confidence: 94%
Finding: The code sends `query_text` and `insight_text` directly to an HTTP model endpoint without any visible consent, minimization, or classification of the data being transmitted. Even if the default target is localhost, the URL is configurable and the content may include sensitive user or source-derived material, creating a real data exposure risk.

Missing User Warnings

Medium

Confidence: 95%
Finding: This request packages source-claim content into a prompt and sends it to the model endpoint without safeguards or disclosure. Source claims may contain proprietary, sensitive, or untrusted text, so forwarding them to another service expands the trust boundary and can leak information outside the primary system.

Missing User Warnings

Medium

Confidence: 92%
Finding: The evaluation step transmits the insight text and generated counter-example to the model endpoint, again without clear disclosure or controls. This compounds prior data sharing and can expose both original content and derivative analysis to a service that may log, retain, or forward prompts.

Missing User Warnings

Medium

Confidence: 83%
Finding: The agent persists model-derived claim text and reasoning traces to the substrate without any consent, minimization, or redaction step. If queries or context contain secrets, personal data, or sensitive prompts, those may be stored long-term and later exposed through logs, database access, backups, or downstream consumers.
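
A redaction pass before persistence could look roughly like the sketch below. It is an assumption-laden illustration, not part of the skill: the `redact` function and the two regular expressions are hypothetical, and real secret detection would need a far broader rule set.

```python
import re

# Hypothetical patterns; a real deployment would need many more.
PATTERNS = [
    # key/value style secrets such as "password: hunter2" or "api_key=abc"
    (re.compile(r"(?i)\b(api[_-]?key|token|password)\s*[:=]\s*\S+"), r"\1=[REDACTED]"),
    # simple email addresses
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[EMAIL]"),
]

def redact(text):
    """Apply each pattern in order and return the scrubbed text."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Calling `redact` on claim text and reasoning traces before they are written would reduce, though not eliminate, the chance of storing obvious secrets long-term.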

Missing User Warnings

Medium

Confidence: 91%
Finding: The orchestrator initializes challenge agents with a network endpoint defaulting to `http://localhost:11434`, and later passes `insight_dict`, `query_text`, and source-claim content into those agents. In environments where that endpoint is remote, proxied, container-shared, or otherwise not fully trusted, sensitive insight/query data may be transmitted off-process or off-host without user awareness, creating an inadvertent data-exfiltration and privacy risk.
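
A guard that refuses to send prompts unless the configured endpoint is a loopback address might be sketched as follows. The `is_loopback_url` helper is hypothetical and not taken from the skill. It checks only the URL's host; it does not verify what service is actually listening there, so it is a necessary check rather than a sufficient one.

```python
import ipaddress
from urllib.parse import urlsplit

def is_loopback_url(url):
    """Return True if the URL's host is 'localhost' or a loopback IP literal."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        # Assumes 'localhost' resolves to loopback, which is not verified here.
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname is treated as non-local.
        return False
```

A caller could check the model URL once at startup and require explicit user confirmation before using any endpoint for which this returns False.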

Missing User Warnings

Medium

Confidence: 82%
Finding: The dispatch logic uses broad substring matching on natural-language input, so ordinary text containing phrases like 'run pipeline' or 'epistemic reasoning' can trigger execution of local scripts with workspace write access and shell-enabled runtime context. In an agent setting, this increases the chance of unintended side-effectful execution from prompt content, quoted text, or indirect user input.
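
The difference between substring matching and explicit invocation can be illustrated with a small sketch. Everything here is hypothetical: the `/epistemic-council` command prefix, the action names, and both functions are assumptions made for illustration and do not reflect the skill's actual dispatch code.

```python
import re

# Phrases quoted in the finding above.
BROAD_PHRASES = ("run pipeline", "epistemic reasoning")

# Hypothetical explicit command: the whole message must be the command.
COMMAND = re.compile(r"^\s*/epistemic-council\s+(?P<action>run-pipeline|validate)\s*$")

def naive_match(text):
    """Substring matching: fires on any text that merely contains a phrase."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in BROAD_PHRASES)

def explicit_match(text):
    """Return the requested action only for an exact command, else None."""
    match = COMMAND.match(text)
    return match.group("action") if match else None
```

Quoted or incidental text such as `He wrote "run pipeline" in the doc` satisfies `naive_match` but not `explicit_match`, which is the behavior the finding argues for.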

Missing User Warnings

Medium

Confidence: 86%
Finding: The code sends up to 6000 characters of memory log content to a local Ollama HTTP service without consent, disclosure, minimization, or validation of what the local endpoint actually is. In this skill context, the logs may contain sensitive long-term agent memory, and trusting localhost is weaker than it appears because local services can be swapped, proxied, or exposed by misconfiguration.

Vague Triggers

Medium

Confidence: 94%
Finding: The trigger list is broad and includes generic phrases like 'run pipeline', 'find analogies', and 'epistemic reasoning' that could match ordinary user requests and invoke the skill unintentionally. Because this skill also has shell execution and write access, accidental activation expands the chance that powerful operations run in contexts where the user did not explicitly request them.

Missing User Warnings

Medium

Confidence: 97%
Finding: The manifest grants read/write workspace access, filesystem write targets, and shell execution, but the description presents the skill as a reasoning/validation tool without warning that it can modify files or execute commands. This mismatch is dangerous because users or orchestrators may treat it as low-risk analysis logic while it actually has privileged capabilities that could alter state, persist data, or run harmful local commands if misused or compromised.

Missing User Warnings

Medium

Confidence: 84%
Finding: prune_visibility() performs a state-changing UPDATE that changes how records are exposed, but the system is presented as an append-only provenance store. In this context, hidden state changes are more dangerous because users may rely on the substrate for governance, monitoring, and auditability; an actor with access to this method can suppress visibility of events while preserving the appearance of an immutable ledger.

Missing User Warnings

Medium

Confidence: 88%
Finding: The code sends claim text and reasoning traces to a separate local HTTP service for LLM evaluation, which is a real data-exposure boundary even if it is bound to localhost. These fields may contain sensitive or proprietary information, and there is no runtime consent, minimization, or transport/authentication control to prevent unintended disclosure to a misconfigured, replaced, or monitored local service.

Missing User Warnings

Medium

Confidence: 85%
Finding: The script persists validation outputs, including claim snippets and model-generated reasoning, to disk under memory/openclaw-runs without any access-control, redaction, retention, or user-warning mechanism. If the workspace is shared, backed up, indexed, or later exfiltrated, this creates a durable leakage of potentially sensitive claim content and internal reasoning artifacts.
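
As a partial mitigation for the access-control gap, run outputs could be created with owner-only permissions. The sketch below assumes a POSIX filesystem; `write_private_json` is a hypothetical helper, not a function from the skill, and it does not address redaction or retention.

```python
import json
import os

def write_private_json(path, record):
    """Write a JSON record to a file created with mode 0o600 (owner read/write)."""
    # The mode applies only when the file is created; an existing file keeps
    # its current permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(record, handle)
```

This limits casual reads by other local users, but backups, indexing, and shared workspaces would still need separate controls.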

External Transmission

Medium

Category: Data Exfiltration
Content: Your answer (YES or NO only):""" try: response = requests.post( f"{self.model_url}/api/generate", json={"model": self.model_name, "prompt": prompt, "stream": False, "options": {"temperature": 0.3, "num_predict": 50}},
Confidence: 90%
Finding: requests.post( f"{self.model_url}/api/generate", json=

External Transmission

Medium

Category: Data Exfiltration
Content: Generate:""" try: response = requests.post( f"{self.model_url}/api/generate", json={"model": self.model_name, "prompt": prompt, "stream": False, "options": {"temperature": 0.7, "num_predict": 200}},
Confidence: 91%
Finding: requests.post( f"{self.model_url}/api/generate", json=

External Transmission

Medium

Category: Data Exfiltration
Content: Your judgment (A, B, or C with brief explanation):""" try: response = requests.post( f"{self.model_url}/api/generate", json={"model": self.model_name, "prompt": prompt, "stream": False, "options": {"temperature": 0.3, "num_predict": 150}},
Confidence: 89%
Finding: requests.post( f"{self.model_url}/api/generate", json=

VirusTotal

66 of 66 vendors reported this skill as clean.
