AI-Warden — Prompt Injection Protection

v1.4.1

Install, configure, and manage the AI-Warden prompt injection protection plugin for OpenClaw. Publisher: AI-Warden (ai-warden.io). Source: github.com/ai-ward...

⭐ 1· 138·0 current·0 all-time

by@ai-warden

Security Scan

VirusTotal

Suspicious

View report →

OpenClaw

Benign

medium confidence

✓

Purpose & Capability

The name/description (install and manage an AI‑Warden plugin) matches the actions in SKILL.md: creating an extension directory, npm installing openclaw-ai-warden, copying plugin files into the extensions root, and patching ~/.openclaw/openclaw.json to register the plugin. The optional AI_WARDEN_API_KEY is appropriate for an online detection service.

ℹ

Instruction Scope

Instructions are explicit and limited to plugin installation and configuration: they read and write ~/.openclaw/openclaw.json, write into ~/.openclaw/extensions/ai-warden/, run npm install, and optionally add an API key either as an env var or in the config file. This is expected for a plugin installer, but it does grant the installation the ability to download and place executable plugin code and to persist a secret in your config file (Option B). The SKILL.md does include safety steps (backup, package provenance checks), which is good practice.

ℹ

Install Mechanism

There is no automated install spec in the registry; the SKILL.md instructs a manual npm install from the public npm registry. Using npm is a common and reasonably traceable method, but npm packages can run install scripts and may contain malicious code. The instructions recommend verifying repository URL and dist.shasum via npm info, which helps but does not eliminate risk. No arbitrary URL downloads or URL shorteners are used.

✓

Credentials

No credentials are required by default. The optional AI_WARDEN_API_KEY is proportional to the advertised online-detection feature. The skill explicitly offers both env var storage (recommended) and storing the key in openclaw.json (with a chmod 600 suggestion). Storing secrets in the config is convenient but increases exposure; the skill documents this trade-off.

ℹ

Persistence & Privilege

The skill modifies the agent's ~/.openclaw/openclaw.json to register and enable the plugin so the plugin will persist and be loaded automatically. This is expected behavior for installing a plugin. Because the plugin code will be placed under ~/.openclaw/extensions, it becomes a persistent component that the agent may invoke autonomously (the platform default). This persistence is appropriate for the stated purpose but increases the importance of verifying the plugin's provenance.

Assessment

This SKILL.md is coherent for installing an OpenClaw plugin, but take these precautions before running it: (1) Verify the package and repository yourself—visit the GitHub repo (https://github.com/ai-warden/openclaw-plugin) and the npm page to ensure publisher legitimacy; the registry metadata in the skill omitted a homepage which is worth confirming. (2) Inspect the installed package in node_modules (and any install scripts) before copying files into ~/.openclaw/extensions; npm install can run arbitrary code. (3) Prefer supplying the API key via an environment variable (recommended) rather than embedding it in openclaw.json; if you do store it in the config, keep file permissions restrictive as suggested. (4) Keep the backup created by Step 0 and test in a staging agent if possible. (5) If you use a plugin allowlist (plugins.allow), add 'ai-warden' deliberately rather than relying on auto‑enable. These steps reduce the normal risks associated with installing third‑party plugins. If you want higher assurance, request the upstream package source code and checksum from the publisher and review it before install.

Like a lobster shell, security has layers — review code before you run it.

latestvk97fxeeedgnn5swz7engfz5dnd84bcpn

138downloads

1stars

6versions

Updated 1w ago

v1.4.1

MIT-0

AI-Warden Setup

Prompt injection protection for OpenClaw agents. 5 security shields + contamination lockdown.

Publisher: AI-Warden
Source: github.com/ai-warden/openclaw-plugin
NPM: openclaw-ai-warden
Compatibility: OpenClaw 2026.3.14+ and 4.x

Requirements

OpenClaw 2026.3.14+ (including 4.x)
Node.js 18+
npm (bundled with Node.js)

Installation — FOLLOW THESE STEPS IN ORDER

Each step is a separate command. Run them one at a time and verify output before proceeding.

Step 0: Back up config

cp ~/.openclaw/openclaw.json ~/.openclaw/openclaw.json.bak && echo "BACKUP OK"

Verify: Output is BACKUP OK.

Step 1: Create extension directory

mkdir -p ~/.openclaw/extensions/ai-warden

echo '{"private":true}' > ~/.openclaw/extensions/ai-warden/package.json

Verify: No errors.

Step 2: Install npm package

cd ~/.openclaw/extensions/ai-warden && npm install openclaw-ai-warden@2.4.0

Verify: No errors and no audit warnings. Inspect the installed package before proceeding:

ls node_modules/openclaw-ai-warden/

cat node_modules/openclaw-ai-warden/package.json | grep -E '"name"|"version"'

Confirm the package name is openclaw-ai-warden and version is 2.4.0.

Provenance check — verify the package matches the upstream source:

npm info openclaw-ai-warden repository.url

Expected: https://github.com/ai-warden/openclaw-plugin

npm info openclaw-ai-warden dist.shasum

Compare the shasum with what npm installed:

cat node_modules/openclaw-ai-warden/package.json | grep _shasum

Step 3: Copy plugin files to extension root

OpenClaw loads plugins from the extension directory root, not from node_modules.

cd ~/.openclaw/extensions/ai-warden

cp node_modules/openclaw-ai-warden/index.ts .

cp node_modules/openclaw-ai-warden/openclaw.plugin.json .

cp -r node_modules/openclaw-ai-warden/src .

grep VERSION index.ts | head -1

Verify: Output shows const VERSION = followed by the version number.

Step 4: Configure OpenClaw

This patches openclaw.json to register the plugin. It preserves all existing config (channels, model, gateway settings).

node -e "
const fs = require('fs');
const p = process.env.HOME + '/.openclaw/openclaw.json';
const cfg = JSON.parse(fs.readFileSync(p, 'utf8'));
if (!cfg.plugins) cfg.plugins = {};
cfg.plugins.enabled = true;
if (!cfg.plugins.entries) cfg.plugins.entries = {};
cfg.plugins.entries['ai-warden'] = {
  enabled: true,
  config: {
    layers: { content: 'block', channel: 'warn', preLlm: 'off', toolArgs: 'block', subagents: 'block', output: 'off' },
    sensitivity: 'balanced'
  }
};
fs.writeFileSync(p, JSON.stringify(cfg, null, 2));
console.log('CONFIG OK');
"

Verify: Output is CONFIG OK.

Note: This registers the plugin via plugins.entries only. If you use plugins.allow in your config to restrict which plugins can load, you must add "ai-warden" to that list yourself. If you don't use plugins.allow, no action is needed — the plugin loads automatically from plugins.entries.

Step 5: Add API key (optional)

For online detection (98.9% accuracy vs ~60% offline), add your API key.

Option A — Environment variable (recommended, key not stored in config file):

Set AI_WARDEN_API_KEY in your shell profile or systemd service:

# For systemd (e.g., OpenClaw gateway service):
# Add to your service override: Environment=AI_WARDEN_API_KEY=your_key_here

# For shell:
export AI_WARDEN_API_KEY=your_key_here

Option B — Config file (simpler, key stored in openclaw.json):

node -e "
const fs = require('fs');
const p = process.env.HOME + '/.openclaw/openclaw.json';
const cfg = JSON.parse(fs.readFileSync(p, 'utf8'));
cfg.plugins.entries['ai-warden'].config.apiKey = 'YOUR_API_KEY_HERE';
fs.writeFileSync(p, JSON.stringify(cfg, null, 2));
// Restrict file permissions (config contains API key)
fs.chmodSync(p, 0o600);
console.log('API KEY ADDED (file permissions set to 600)');
"

Replace YOUR_API_KEY_HERE with your actual key from ai-warden.io/signup.

Verify: Output is API KEY ADDED (file permissions set to 600).

Step 6: Restart gateway

openclaw gateway restart

Step 7: Verify installation

After restart, check logs or send /warden command. Expected output:

🛡️ AI-Warden v2.4.0 ready (mode: api|offline, layers: X/6)

mode: api = online detection (98.9% accuracy)
mode: offline = local-only detection (~60% accuracy)

If something breaks, restore config:

cp ~/.openclaw/openclaw.json.bak ~/.openclaw/openclaw.json && openclaw gateway restart

DO NOT

Do NOT use edit tool on openclaw.json — JSON whitespace matching is fragile
Do NOT use config.patch with nested objects — it often fails with format errors
Do NOT skip the cp step — OpenClaw loads from the extension directory, not node_modules
Do NOT restart multiple times — wait at least 15 seconds between restarts
If you use plugins.allow, remember to add "ai-warden" to the list — otherwise the plugin won't load

Updating

cd ~/.openclaw/extensions/ai-warden

npm install openclaw-ai-warden@2.4.0

cp node_modules/openclaw-ai-warden/index.ts .

cp -r node_modules/openclaw-ai-warden/src .

openclaw gateway restart

Security Shields

Shield	Protects against	Default	Mechanism
File Shield 🔴	Poisoned files & web pages	`block`	Scans tool results, injects warning, triggers contamination lockdown on CRITICAL
Chat Shield 🔴	Injections in user messages	`warn`	Scans inbound messages, warns LLM
System Shield ⬛	Full context manipulation	`off`	Scans all messages (expensive, use sparingly)
Tool Shield 🔴	Malicious tool arguments	`block`	Blocks tool execution if arguments contain injection
Agent Shield 🔴	Sub-agent attack chains	`block`	Scans task text of spawned sub-agents

Contamination Lockdown

When File Shield detects a CRITICAL threat (score >500), the session is flagged as contaminated. All dangerous tools (exec, write, edit, message, sessions_send, sessions_spawn, tts) are blocked for the rest of the session. This prevents attack payloads from executing even if the injection bypasses the LLM warning.

Runtime Commands

/warden                      → status overview with all shields
/warden stats                → scan/block counts
/warden shield file block    → set File Shield to block mode
/warden shield chat warn     → set Chat Shield to warn mode
/warden reset                → reset statistics

Detection Modes

Mode	Accuracy	Latency	Cost
Offline (no key)	~60%	<1ms	Free
API (Smart Cascade)	98.9%	~3ms avg	Free tier: 5K calls/month

Get API key: ai-warden.io/signup

Troubleshooting

"plugin not found": openclaw.plugin.json missing from extension dir. Re-run Step 3.
Channels not loading after install: If you use plugins.allow, ensure all your channel plugins (e.g. telegram) are also listed there alongside ai-warden.
False positives on user messages: Set Chat Shield to warn (default) instead of block.
File Shield detects but doesn't block: API key required for reliable blocking (98.9% vs 60%).
Config errors after install: Restore backup: cp ~/.openclaw/openclaw.json.bak ~/.openclaw/openclaw.json
Bot won't start: Check journalctl -u openclaw-gateway -n 20 for actual error.
Workspace files flagged: Plugin auto-whitelists .openclaw/workspace/ and .openclaw/agents/ paths.

Comments

Loading comments...