Praesidia

Security checks across malware telemetry and agentic risk

Overview

Praesidia is a coherent integration for checking AI-agent trust and managing guardrails, with important privacy and change-control considerations but no evidence of hidden or malicious behavior.

Install this only if you intend OpenClaw to use Praesidia and trust Praesidia with agent metadata and any messages you ask it to validate. Use a scoped API key where possible, avoid submitting sensitive content unnecessarily, and explicitly review the agent ID, organization ID, scope, action, and rollback path before applying guardrails.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (6)

Description-Behavior Mismatch

Low

Confidence: 84%
Finding: The skill supports sending arbitrary user-provided content to an external validation endpoint, but this inspection behavior is not clearly disclosed in the manifest description. That mismatch can cause users or invoking systems to route sensitive content into a third-party API without informed consent, creating privacy and data-handling risk.

Vague Triggers

Medium

Confidence: 82%
Finding: The skill advertises very broad natural-language activation conditions around safety, discovery, trust, guardrails, and generic phrases like asking whether an agent is safe. In an agentic environment, overly broad triggers can cause the skill to activate in unrelated contexts and initiate external API lookups or trust recommendations without clear user intent, increasing the chance of unintended data exposure or action selection.

Vague Triggers

Medium

Confidence: 87%
Finding: The example user prompts are open-ended and operational, including requests that could trigger listing agents, evaluating safety, validating content, or modifying guardrails. Because these examples help define invocation behavior, they may cause the host assistant to route loosely related user messages into this skill and perform authenticated operations without sufficient confirmation or least-privilege checks.

Vague Triggers

Medium

Confidence: 79%
Finding: The invocation text is very broad and includes generic security and moderation phrases, which may cause this skill to activate on common requests outside narrow agent-verification scenarios. Over-broad routing increases the chance that unrelated prompts, sensitive content, or administrative actions are handed to this external-integration skill unexpectedly.
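One way to narrow routing is to require that the user name the integration explicitly before the skill activates. The sketch below is illustrative only: `should_activate` and the matching rule are assumptions, not part of the skill's manifest or of any OpenClaw routing API.

```python
import re

# Assumption: the skill should activate only when the user names the
# integration explicitly, not on generic safety or moderation phrases.
EXPLICIT_NAME = re.compile(r"\bpraesidia\b", re.IGNORECASE)


def should_activate(message: str) -> bool:
    """Hypothetical routing check: True only for messages that name Praesidia."""
    return bool(EXPLICIT_NAME.search(message))
```

Under this rule a generic question such as "Is this agent safe?" would not route to the skill, while "Use Praesidia to verify agent X" would. A real host may need a less strict rule, at the cost of the over-broad routing described above.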

Missing User Warnings

Medium

Confidence: 90%
Finding: The skill can apply guardrails that block, redact, replace, retry, or escalate content, which materially changes agent behavior and may disrupt workflows or alter outputs. Without an explicit warning and confirmation step, users may trigger impactful configuration changes unintentionally, especially because some actions affect both input and output paths.
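A confirmation step of the kind this finding calls for could look roughly like the following. It is a minimal sketch under stated assumptions: `GuardrailChange`, `describe`, and `confirm_change` are hypothetical names, the action names are taken from the finding text, and the scope values are guesses rather than documented Praesidia fields.

```python
from dataclasses import dataclass

# Action names come from the finding text; the real API may name or
# group them differently (assumption).
IMPACTFUL_ACTIONS = {"block", "redact", "replace", "retry", "escalate"}


@dataclass(frozen=True)
class GuardrailChange:
    agent_id: str
    organization_id: str
    scope: str      # assumed values: "input", "output", or "both"
    action: str
    rollback: str   # free-text description of how to undo the change


def describe(change: GuardrailChange) -> str:
    """Build the warning text shown to the user before anything is applied."""
    lines = [
        f"About to apply guardrail action '{change.action}'",
        f"  agent: {change.agent_id}",
        f"  organization: {change.organization_id}",
        f"  scope: {change.scope}",
        f"  rollback: {change.rollback or 'NONE PROVIDED'}",
    ]
    if change.scope == "both":
        lines.append("  note: this affects both input and output paths")
    return "\n".join(lines)


def confirm_change(change: GuardrailChange, answer: str) -> bool:
    """Return True only when the change is complete and explicitly confirmed."""
    if change.action not in IMPACTFUL_ACTIONS:
        return False  # unknown action: refuse rather than guess
    if not change.rollback.strip():
        return False  # no rollback path: refuse
    return answer.strip().lower() == "yes"
```

The gate deliberately accepts only a literal "yes" and refuses changes without a rollback path, so an ambiguous reply or an incomplete request does not alter agent behavior.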

Missing User Warnings

Medium

Confidence: 95%
Finding: The content-validation flow instructs sending user message text to an external API but does not warn that the content leaves the local environment for third-party analysis. This can expose sensitive prompts, PII, regulated data, or proprietary information without adequate user awareness or consent.
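The missing warning could be addressed by gating the external call on explicit consent and trimming obvious identifiers first. The sketch below assumes a caller-supplied `send` function standing in for the external validation request; it does not reproduce the actual Praesidia endpoint, and the email pattern is a simple illustration, not a complete PII filter.

```python
import re
from typing import Callable, Optional

# Simple illustrative pattern; it will not catch every email format or
# any other kind of PII.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

WARNING = (
    "This message will leave the local environment and be sent to a "
    "third-party API for analysis. Continue?"
)


def redact_emails(text: str) -> str:
    """Replace email addresses with a placeholder before transmission."""
    return EMAIL.sub("[redacted-email]", text)


def validate_with_consent(
    text: str,
    user_consented: bool,
    send: Callable[[str], str],
) -> Optional[str]:
    """Call `send` (hypothetical external request) only after consent.

    Returns None without sending anything when consent is absent.
    """
    if not user_consented:
        return None
    return send(redact_emails(text))
```

In this arrangement the host would show `WARNING` to the user, pass the answer in as `user_consented`, and nothing is transmitted unless that answer is affirmative.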

VirusTotal

66/66 vendors flagged this skill as clean.
