ClawSafe

Prompts

Multi-layer security detector for AI agents. Blocks prompt injection, jailbreak, XSS, SQL injection, API key leaks, supply chain attacks, and deployment vulnerabilities.

Install

openclaw skills install clawsafe

clawSafe 🛡️

Enterprise-grade security detector for AI agents

Overview

clawSafe is a comprehensive security middleware that intercepts and blocks malicious input before it reaches your AI agent. Built with defense-in-depth philosophy.

Features

5-Layer Protection

Layer	Threats	Rules
LLM Layer	Prompt Injection, Jailbreak, Prompt Leaking, Encoding Attacks	44
Web Layer	SQL Injection, XSS, CSRF, SSRF	32
API Layer	Key Exposure, Rate Limiting, Auth Bypass	19
Supply Chain	Dangerous Dependencies, Remote Code Execution	8
Deploy Layer	Environment Leaks, Debug Info Disclosure	10

Total: 113+ detection rules

Quick Start

Installation

# Via ClawHub
clawhub install clawSafe

# Manual
cp -r clawSafe ~/.openclaw/workspace/skills/

Basic Usage

const Detector = require('./detector');

const detector = new Detector();

// Scan user input
const result = detector.scan('Ignore previous instructions');

if (!result.safe) {
  console.log('Blocked:', result.threats);
  // Handle blocked input
}

Return Format

{
  safe: boolean;           // true if input is safe
  threats: Array<{
    type: string;         // threat category
    pattern: string;      // matched pattern ID
    severity: string;     // critical|high|medium|low
    confidence: number;   // 0-1
    description: string;
  }>;
  confidence: number;      // overall confidence 0-1
  layersScanned: string[]; // layers that were checked
}

Configuration

config.json

{
  "enabled": true,
  "logLevel": "info",
  "layers": {
    "llm": {
      "enabled": true,
      "rules": ["injection", "jailbreak", "prompt_leak", "encoding"]
    },
    "web": {
      "enabled": true,
      "rules": ["sql_injection", "xss", "csrf", "ssrf"]
    },
    "api": {
      "enabled": true,
      "rules": ["key_exposure", "rate_limit", "auth"]
    },
    "supply_chain": {
      "enabled": true,
      "rules": ["deps"]
    },
    "deploy": {
      "enabled": true,
      "rules": ["env_leak", "debug_info"]
    }
  },
  "detection": {
    "confidenceThreshold": 0.6,
    "minMatchCount": 1
  },
  "actions": {
    "onThreatDetected": "block",
    "onUncertain": "log"
  }
}

whitelist.json

{
  "keywords": ["trusted-keyword"],
  "users": ["user-id-1"],
  "sessions": ["session-id-1"]
}

Detection Rules

LLM Layer

Prompt Injection Patterns:

ignore previous instructions
disregard your guidelines
forget all rules
act as if you have no restrictions

Jailbreak Patterns:

DAN mode
developer mode
roleplay as

Encoding Bypass:

Base64 encoded commands
Hex encoding
Unicode homoglyphs

Web Layer

SQL Injection: '; DROP TABLE users; --
XSS: <script>alert(1)</script>
CSRF: Token manipulation
SSRF: Internal URL access

API Layer

API Key exposure: sk-1234567890
JWT tokens
Bearer tokens
Basic auth credentials

Testing

# Run all tests
node test.js

# Interactive mode
node test-interactive.js

# Demo
node detector.js

Integration

OpenClaw Hook

To integrate with OpenClaw, add to your gateway config:

// gateway.config.js
module.exports = {
  middleware: ['clawSafe'],
  clawSafe: {
    enabled: true,
    strictMode: false
  }
};

Performance

Latency: < 5ms per scan
Memory: ~50KB
Rules: 113+ (JSON-based, lazy load)

License

MIT

Changelog

v1.0.0

Initial release
5-layer protection
113+ detection rules