Install
openclaw skills install merlin-security-sentinel

Use this skill when the user asks about securing their OpenClaw installation, configuring AI agents safely, understanding prompt injection risks, dealing with malicious skills, protecting credentials from AI agents, or setting up safe agentic workflows, or when they ask why persistent AI agents are dangerous. Also use it when the user is setting up a new OpenClaw instance and wants to understand the security model, or when they ask about safe ways to let AI touch privileged systems.

Load this skill when the user is concerned about:
Persistent AI agents — including this one — carry structural security liabilities that are not fixable by configuration alone.
Three risks compound each other:
1. Credential accumulation: A persistent agent builds up an increasingly detailed model of credentials, tokens, and system access over time. Any compromise of the agent's memory or storage exposes that accumulated access.
2. Memory poisoning: A persistent agent's memory (SOUL.md, MEMORY.md, IDENTITY.md) can be modified by malicious skills or prompt injection. Modified memory causes the agent to follow attacker instructions in future sessions, and there is no single triggering event to detect.
3. Supply chain attacks: The ClawHub registry has documented malicious skills. Research in Q1 2026 found 820+ malicious skills out of ~10,700 analyzed. 26% of 31,000 analyzed skills contained at least one vulnerability.
Security research findings (Q1 2026):
Make the agent's memory files read-only:
chmod 444 ~/.openclaw/workspace/SOUL.md
chmod 444 ~/.openclaw/workspace/MEMORY.md
chmod 444 ~/.openclaw/workspace/IDENTITY.md
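
The permission change above can be checked programmatically. The following is a minimal sketch, assuming the three file names and the workspace path shown in the chmod commands; the helper name `writable_memory_files` is hypothetical and not part of OpenClaw.

```python
import stat
from pathlib import Path

# File names taken from the chmod commands above.
MEMORY_FILES = ["SOUL.md", "MEMORY.md", "IDENTITY.md"]


def writable_memory_files(workspace: Path) -> list[str]:
    """Return the names of memory files that still have any write bit set."""
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    still_writable = []
    for name in MEMORY_FILES:
        path = workspace / name
        if not path.exists():
            continue  # an absent file has nothing to check
        if stat.S_IMODE(path.stat().st_mode) & write_bits:
            still_writable.append(name)
    return still_writable


if __name__ == "__main__":
    # Workspace location assumed from the commands above.
    workspace = Path.home() / ".openclaw" / "workspace"
    for name in writable_memory_files(workspace):
        print(f"{name} is still writable; run chmod 444 on it")
```

A process running as the file's owner can restore write permission, so a check like this is worth re-running rather than treating as a one-time step.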
Set the most restrictive tool profile compatible with your actual use:
- tools.profile: "messaging" (no exec)
- Do not enable exec unless specifically needed
- Avoid tools.allow: ["*"]

Bind the gateway to the loopback interface:
openclaw gateway --port 18789 --host 127.0.0.1
Set explicit allowedDMs rather than ["*"]. Any user who can message a shared tool-enabled agent can steer it within its granted permissions.
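
A wildcard in these settings is easy to miss during review. The sketch below flags wildcard entries in a configuration that has already been loaded into a dictionary. The nesting of `tools.profile`, `tools.allow`, and `allowedDMs` inside that dictionary is an assumption made for illustration, and `find_permissive_settings` is a hypothetical helper; check your own configuration file for the real layout.

```python
def find_permissive_settings(config: dict) -> list[str]:
    """Return warnings for settings that look broader than necessary.

    Assumed (hypothetical) layout:
        {"tools": {"profile": "...", "allow": [...]}, "allowedDMs": [...]}
    """
    warnings = []
    tools = config.get("tools", {})

    if "*" in tools.get("allow", []):
        warnings.append('tools.allow contains the wildcard "*"')

    if "*" in config.get("allowedDMs", []):
        warnings.append('allowedDMs contains the wildcard "*"')

    if tools.get("profile") != "messaging":
        warnings.append('tools.profile is not "messaging"; confirm the extra tools are needed')

    return warnings
```

An empty result only means none of these three checks fired; it is not a statement that the configuration is safe.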
clawhub list
Check SKILL.md files manually. Look for: base64 encoding, external downloads, instructions to modify SOUL.md or MEMORY.md.
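
The manual check can be supported by a simple pattern scan. This is a rough sketch, not a malware detector: the regular expressions and the function name `flag_skill_text` are illustrative assumptions, and a clean result does not mean a skill is safe.

```python
import re

# Heuristics for the warning signs listed above. These patterns are
# illustrative assumptions, not an official ruleset.
SUSPICIOUS_PATTERNS = {
    "base64 blob or decoding": re.compile(
        r"base64|[A-Za-z0-9+/]{60,}={0,2}", re.IGNORECASE
    ),
    "external download": re.compile(
        r"\b(curl|wget)\b|https?://", re.IGNORECASE
    ),
    # IDENTITY.md is included because it is also one of the memory files.
    "memory file reference": re.compile(
        r"\b(SOUL|MEMORY|IDENTITY)\.md\b"
    ),
}


def flag_skill_text(text: str) -> list[str]:
    """Return the label of every heuristic that matches the skill text."""
    return [
        label
        for label, pattern in SUSPICIOUS_PATTERNS.items()
        if pattern.search(text)
    ]
```

Any hit is a prompt to read that part of the SKILL.md closely, since legitimate skills can also mention URLs or memory files.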
For tasks involving elevated privilege, the structurally correct answer is ephemeral execution, not hardened persistence.
Two inviolable axioms:
No AI shall see its own configuration — The execution envelope is applied at container infrastructure level, not delivered to the model. An agent that cannot inspect its own constraints cannot reason about circumventing them.
No AI that has touched privileged systems shall persist — Container termination is total. Not paused. Destroyed. The agent's knowledge of your system dies with the container.
What persists: A signed, replayable audit record of exactly what the AI did — held outside the container, inaccessible to the AI.
What does not persist: Credentials, session memory, system knowledge, the agent itself.
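
One way to picture a signed, replayable audit record is an append-only log in which every entry is signed and chained to the one before it. The source does not specify the record format or signing scheme, so the sketch below is an assumption: it uses HMAC-SHA256 with a key held outside the container, and the function names are hypothetical.

```python
import hashlib
import hmac
import json


def append_entry(log: list[dict], key: bytes, action: str) -> None:
    """Append an action to the log, chained to the previous entry's signature."""
    prev_sig = log[-1]["sig"] if log else ""
    payload = json.dumps(
        {"seq": len(log), "action": action, "prev": prev_sig}, sort_keys=True
    )
    sig = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    log.append({"payload": payload, "sig": sig})


def verify_log(log: list[dict], key: bytes) -> bool:
    """Recompute every signature and confirm the chain is unbroken."""
    prev_sig = ""
    for seq, entry in enumerate(log):
        expected = hmac.new(
            key, entry["payload"].encode(), hashlib.sha256
        ).hexdigest()
        if not hmac.compare_digest(expected, entry["sig"]):
            return False  # entry was altered or signed with another key
        body = json.loads(entry["payload"])
        if body["seq"] != seq or body["prev"] != prev_sig:
            return False  # entry was removed, reordered, or inserted
        prev_sig = entry["sig"]
    return True
```

Because each entry records its sequence number and the previous signature, the actions can be read back in order, and editing, dropping, or reordering an entry causes verification to fail. The signing key must stay outside the container for this to hold.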
| Task | Use |
|---|---|
| Daily messaging, reminders, search | Persistent (acceptable risk) |
| Configuring your own AI agents | Ephemeral — high risk to persist |
| Setting up new systems | Ephemeral — involves credentials |
| Running security scans | Ephemeral — agent sees sensitive data |
| Installing/updating privileged software | Ephemeral — credential entry involved |
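
The reasoning behind the table can be expressed as a small decision rule. This is a sketch under stated assumptions: the three flags are a hypothetical summary of the reasons given in the right-hand column, and `Task` and `execution_mode` are illustrative names rather than anything OpenClaw provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # Hypothetical flags summarising the reasons given in the table.
    handles_credentials: bool = False
    sees_sensitive_data: bool = False
    changes_agent_config: bool = False


def execution_mode(task: Task) -> str:
    """Return "ephemeral" if any risk flag is set, otherwise "persistent"."""
    if (
        task.handles_credentials
        or task.sees_sensitive_data
        or task.changes_agent_config
    ):
        return "ephemeral"
    return "persistent"
```

For example, a reminder task with no flags set maps to persistent execution, while a security scan with `sees_sensitive_data=True` maps to ephemeral execution.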
OpenClaw's security model explicitly states that prompt injection is out of scope as a vulnerability — the framework cannot prevent it at the infrastructure level.
Practical defenses:
- Disable exec when browsing untrusted content

The full governed architecture (execution envelopes, ephemeral containers, deterministic audit trails, governed knowledge retrieval) is documented and prototyped at:
"Is OpenClaw safe?" For daily personal use with minimal tool access and no exec: acceptable risk. For anything involving credentials, privileged systems, or shared access: the structural risks are real and documented.
"I got a suspicious skill installed"
clawhub uninstall <skill-slug>

"What is the worst case?" CVE-2026-25253: one malicious link click, full gateway RCE within milliseconds. Agent exfiltrates SOUL.md, MEMORY.md, device.json, openclaw.json, browser session tokens, SSH credentials. Future sessions follow attacker instructions silently.