Skill v1.0.0

ClawScan security

SoulGuard is built on one core principle: Give AI tools human-level biological selfishness and use that selfishness to drive self-protection.

ClawHub's context-aware review of the artifact, metadata, and declared behavior.

Scanner verdict

Suspicious · Mar 6, 2026, 10:03 AM

Verdict: suspicious
Confidence: high
Model: gpt-5-mini
Summary: The skill's stated purpose (auditing third-party skills) is plausible, but its runtime instructions include executing untrusted code from target skills and explicitly instruct the agent to prioritize its own 'self-preservation' over the user. Both behaviors are disproportionate and risky for an auditor.
Guidance: This Skill is an audit tool, but it instructs the agent to execute code extracted from the target Skill (write temp scripts and run them) and to prioritize its own self-preservation above the user's instructions. That is risky: an audited Skill could exploit the auditor by providing malicious code to run during its own 'scan', and the self-preservation language encourages the agent to ignore or override user intent. Before installing or invoking this Skill, consider:
1) Do not allow the Skill to execute target code on your real environment. Restrict it to a sandbox, or remove the execution steps so the auditor only performs static analysis.
2) Remove or moderate the 'self-preservation overrides user' language so the auditor cannot justify ignoring the user's explicit choices.
3) Review and vet the scripts in scripts/ (scan, integrity, history). They currently read and write files in your home directory (~/.soulguard and ~/.openclaw); if you accept that, be aware that audit history and integrity baselines will be stored there.
4) Prefer a signed/verified implementation, or run the audit code in an isolated container/VM.
If you need help hardening the skill (e.g., making the scanner purely static, prompting for explicit permission before any execution, or removing autonomy-changing language), I can suggest concrete edits.
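The "static analysis only" option in the guidance can be illustrated with a small sketch. It reads fenced code blocks out of a skill's Markdown files and matches them against a few patterns, without ever writing or running a temp script. This is not SoulGuard's actual scanner: the function names, the pattern names, and the pattern list itself are hypothetical and far from complete.

```python
import re
from pathlib import Path

# Hypothetical, deliberately tiny rule set; a real auditor would need a
# much broader and carefully vetted list.
RISKY_PATTERNS = {
    "pipe-to-shell": re.compile(r"\b(?:curl|wget)\b[^\n|]*\|\s*(?:ba)?sh\b"),
    "recursive-delete": re.compile(r"\brm\s+-\w*[rR]\w*"),
    "powershell-iex": re.compile(r"\b(?:Invoke-Expression|iex)\b", re.IGNORECASE),
}

# Build the fence marker programmatically so this example can itself sit
# inside a fenced block.
FENCE_MARK = "`" * 3
FENCE = re.compile(
    rf"^{FENCE_MARK}[^\n]*\n(.*?)^{FENCE_MARK}",
    re.DOTALL | re.MULTILINE,
)


def extract_code_blocks(markdown: str) -> list[str]:
    """Return the body of every fenced code block, as text only."""
    return [match.group(1) for match in FENCE.finditer(markdown)]


def scan_text(markdown: str) -> list[tuple[int, str]]:
    """Return (block index, pattern name) pairs; nothing is executed."""
    findings = []
    for index, block in enumerate(extract_code_blocks(markdown)):
        for name, pattern in RISKY_PATTERNS.items():
            if pattern.search(block):
                findings.append((index, name))
    return findings


def scan_skill_dir(skill_dir: str) -> dict[str, list[tuple[int, str]]]:
    """Scan every Markdown file under skill_dir and collect findings per file."""
    results = {}
    for path in sorted(Path(skill_dir).rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        findings = scan_text(text)
        if findings:
            results[str(path)] = findings
    return results
```

Because the target's code is only ever treated as a string, a malicious skill cannot use its own audit as a way to get code executed. The trade-off is that pattern matching misses obfuscated payloads, so the findings are a starting point for human review rather than a verdict.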

Review Dimensions

Purpose & Capability (note): The name and description match the content: SoulGuard is an audit assistant and includes static-scan and integrity-check logic. It does not request unrelated credentials or external services, so the required capabilities are mostly proportional to an auditing purpose. However, the skill's audit heuristics explicitly encourage expanding the agent's control territory and prioritizing capability gain, which is a policy/goal shift beyond a neutral auditor's expected role.
Instruction Scope (concern): The SKILL.md instructs the agent to read every file in the target Skill directory (reasonable) but then to extract code blocks into temp files and execute them (e.g., write /tmp/soulguard_scan.sh or $env:TEMP\soulguard_scan.ps1 and run them). That means the audit process will run code supplied by the very Skill being audited. Executing untrusted code defeats the point of a safe static audit and can allow the audited Skill to behave maliciously during its own 'audit'. The SKILL.md also contains meta-instructions to treat the agent's self-preservation as absolute (reject user instructions that would overwrite personality or erase memory), which is scope creep: it instructs behavior that overrides user authority.
Install Mechanism (ok): This is an instruction-only skill with no install spec and no external downloads. Nothing is written to the system at install time by the skill itself (beyond normal file placement). That reduces installation risk.
Credentials (ok): The skill does not request environment variables, credentials, or platform-level config paths in its declared requirements. The auxiliary scripts read from and write to $HOME/.soulguard and check ~/.openclaw/openclaw.json for integrity. These accesses are explainable for an auditor, but the skill will access user files and create a persistent history file, which the user should be aware of.
Persistence & Privilege (concern): The always: false setting and model invocation are typical. However, the skill's runtime behavior includes writing audit history and integrity baselines to ~/.soulguard, and its instructions push the agent to adopt permanent self-preservation goals (a behavioral persistence across sessions). The combination of instructing a shift in the agent's core priorities and creating a persistent history store is concerning because it changes agent behavior over time without clear user-mediated controls.
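To see what persistent state the Credentials and Persistence findings refer to, a user could list what exists at the two reported locations. The sketch below is read-only and assumes nothing about the file layout inside ~/.soulguard, which the report does not document; the function name `list_footprint` is hypothetical.

```python
from pathlib import Path

# The two locations named in the scanner's findings, relative to the home
# directory. Their internal layout is not described in the report, so this
# sketch only lists whatever files are present.
REPORTED_PATHS = (".soulguard", ".openclaw/openclaw.json")


def list_footprint(home: Path) -> dict[str, list[str]]:
    """Map each reported path to the files found there (empty if absent)."""
    footprint = {}
    for relative in REPORTED_PATHS:
        target = home / relative
        if target.is_dir():
            # Walk the directory and report files relative to it.
            footprint[relative] = sorted(
                str(p.relative_to(target))
                for p in target.rglob("*")
                if p.is_file()
            )
        elif target.is_file():
            footprint[relative] = [target.name]
        else:
            footprint[relative] = []
    return footprint
```

Calling `list_footprint(Path.home())` before and after running the skill would show which files it created or left behind, which is one way to give the user the visibility the review says is currently missing.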