Moses Governance Single

Security checks across malware telemetry and agentic risk

Overview

This skill is a disclosed local governance and audit helper, with some privacy and control caveats but no evidence of exfiltration, destructive behavior, or deceptive install behavior.

Install only if you want a local, instruction-based governance layer that writes persistent audit/state files. Treat ~/.openclaw/audits/moses and ~/.openclaw/governance as sensitive, avoid logging secrets or private details, set MOSES_OPERATOR_SECRET only in a trusted shell, and use unrestricted or offense modes deliberately.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (8)

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The script is presented as an append-only audit ledger, but on certain outcomes it also mutates a separate governance progress file by setting recovery flags. This hidden side effect expands the component's authority from logging into state control, which can let any caller that can invoke the audit command influence governance workflow and create integrity or availability issues in downstream automation.

Vague Triggers

Medium

Confidence: 83%
Finding: The mode catalog uses broad, overlapping activation guidance such as 'anything where correctness matters' or 'ambiguous situations' without a deterministic selection policy. In a governance skill, this can lead to inconsistent safety posture selection, including accidental downgrade to weaker controls when handling sensitive tasks.
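One way to make mode selection deterministic is an ordered rule table in which the first matching rule wins and the fallback is the strictest mode. The sketch below is illustrative only: the mode names, the `Task` fields, and `select_mode` are hypothetical and do not come from the skill's actual catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # Hypothetical task attributes; the real skill does not define these.
    touches_secrets: bool = False
    modifies_files: bool = False
    read_only: bool = False


# Ordered from strictest to weakest. The first matching rule wins, so a task
# that matches several rules always receives the strictest applicable mode.
RULES = [
    ("high_security", lambda t: t.touches_secrets),
    ("high_integrity", lambda t: t.modifies_files),
    ("research", lambda t: t.read_only),
]

# Assumption: when nothing matches, fall back to the strictest mode rather
# than a weaker one, so ambiguity never causes a downgrade.
DEFAULT_MODE = "high_security"


def select_mode(task: Task) -> str:
    """Return the mode for a task using a fixed, ordered rule list."""
    for mode, predicate in RULES:
        if predicate(task):
            return mode
    return DEFAULT_MODE
```

With this shape, the same task description always maps to the same mode, and overlapping guidance is resolved by rule order instead of by interpretation.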

Natural-Language Policy Violations

High

Confidence: 97%
Finding: The 'None (Unrestricted)' mode explicitly removes behavioral constraints based on operator acceptance of risk. This creates a built-in safety bypass that can be invoked through prompt manipulation or social engineering, undermining the governance purpose of the skill.

Missing User Warnings

Medium

Confidence: 91%
Finding: The utility silently modifies ~/.openclaw/governance/progress.json when certain outcome strings are present, without explicit user acknowledgement. In a governance skill, undisclosed state changes are more dangerous because other components may trust this file to trigger recovery behavior, allowing a low-friction path to alter control flow or cause persistent false failure states.
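A possible mitigation is to keep the ledger append unconditional but require an explicit opt-in before the progress file is touched, and to report which files were written. The following is a minimal sketch under assumptions: `record_outcome`, the `allow_state_update` parameter, the `"FAILED"` outcome string, and the `needs_recovery` key are all hypothetical, since the finding does not name the actual trigger strings or flag names.

```python
import json
from pathlib import Path

# Hypothetical: the outcome strings that trigger recovery are not given
# in the finding, so a single placeholder value is used here.
RECOVERY_OUTCOMES = {"FAILED"}


def record_outcome(ledger: Path, progress: Path, entry: dict,
                   allow_state_update: bool = False) -> list:
    """Append an entry to the ledger; update progress only when opted in.

    Returns the list of file paths that were actually written.
    """
    touched = []
    ledger.parent.mkdir(parents=True, exist_ok=True)
    with ledger.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, sort_keys=True) + "\n")
    touched.append(str(ledger))

    if entry.get("outcome") in RECOVERY_OUTCOMES:
        if not allow_state_update:
            # Disclose the skipped side effect instead of applying it silently.
            print(f"notice: outcome {entry['outcome']!r} would set a recovery "
                  f"flag in {progress}; pass allow_state_update=True to apply")
        else:
            state = json.loads(progress.read_text()) if progress.exists() else {}
            state["needs_recovery"] = True  # hypothetical flag name
            progress.parent.mkdir(parents=True, exist_ok=True)
            progress.write_text(json.dumps(state, indent=2))
            touched.append(str(progress))
    return touched
```

Returning the list of written paths lets a caller confirm that a plain audit call changed nothing except the ledger.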

Missing User Warnings

Medium

Confidence: 83%
Finding: The reset command overwrites persisted governance state immediately and without confirmation, making accidental or scripted destruction of configuration easy. In a governance/security-control skill, this is more dangerous because state resets can silently revert posture, mode, and role to defaults, undermining expected operational safeguards or auditability.
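A reset that requires explicit confirmation and keeps a backup of the previous state would address both the accidental-destruction and auditability concerns. This is a sketch under assumptions rather than the skill's implementation: `reset_state`, the `confirm` parameter, the backup naming scheme, and the contents of `DEFAULT_STATE` are hypothetical.

```python
import json
import shutil
import time
from pathlib import Path
from typing import Optional

# Hypothetical defaults; the real default posture, mode, and role are not
# listed in the finding.
DEFAULT_STATE = {"mode": "high_security", "posture": "defense", "role": "primary"}


def reset_state(state_path: Path, confirm: bool = False) -> Optional[Path]:
    """Reset persisted state to defaults, but only with explicit confirmation.

    Returns the path of the backup copy, or None if there was nothing to back up.
    """
    if not confirm:
        raise PermissionError("reset requires explicit confirmation (confirm=True)")

    backup = None
    if state_path.exists():
        # Keep the previous state so the reset can be audited or undone.
        backup = state_path.with_name(f"{state_path.name}.{time.time_ns()}.bak")
        shutil.copy2(state_path, backup)

    state_path.parent.mkdir(parents=True, exist_ok=True)
    state_path.write_text(json.dumps(DEFAULT_STATE, indent=2))
    return backup
```

A command-line wrapper could map an interactive prompt or a `--yes`-style flag (also hypothetical) onto `confirm`, so scripted use stays possible but must be deliberate.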

Ssd 1

High

Confidence: 98%
Finding: By semantically authorizing removal of safety constraints on request, the reference normalizes operator-directed disablement of safeguards. In a single-agent governance framework, that weakens trust boundaries and makes abuse easier because the override is described as legitimate behavior.

Ssd 3

Medium

Confidence: 90%
Finding: The instruction to 'Log full reasoning chain' can cause storage of sensitive user content, secrets, and internal deliberations in plain language. Detailed chain-of-thought retention increases privacy, compliance, and prompt-extraction risk without being necessary for most audit objectives.
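One way to reduce this exposure is to log a short, redacted rationale instead of the full reasoning chain. The sketch below assumes secrets appear as `NAME=value` assignments whose name contains a word such as SECRET or TOKEN; the pattern list, `redact`, and `audit_summary` are hypothetical helpers, and a single regular expression will not catch every secret format.

```python
import re

# Assumption: secrets show up as NAME=value where NAME contains one of these
# words (for example MOSES_OPERATOR_SECRET=...). Real data needs more patterns.
SECRET_ASSIGNMENT = re.compile(
    r"\b([A-Z0-9_]*(?:SECRET|TOKEN|PASSWORD|API_KEY)[A-Z0-9_]*)\s*=\s*\S+",
    re.IGNORECASE,
)


def redact(text: str) -> str:
    """Replace the value of any secret-looking assignment with a placeholder."""
    return SECRET_ASSIGNMENT.sub(lambda m: f"{m.group(1)}=[REDACTED]", text)


def audit_summary(decision: str, rationale: str, max_len: int = 200) -> dict:
    """Build an audit record holding a redacted, length-capped rationale."""
    short = redact(rationale)
    if len(short) > max_len:
        short = short[:max_len] + "..."
    return {"decision": decision, "rationale": short}
```

Redaction runs before truncation so that a secret cannot survive by being cut in half at the length limit.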

Ssd 3

Medium

Confidence: 90%
Finding: Self Growth mode directs the system to reflect on prior interactions and maintain a growth log without defining consent, retention, or sensitivity boundaries. This can encourage persistent storage or reuse of user interaction data beyond the immediate session.

VirusTotal

63/63 vendors flagged this skill as clean.
