Skill Defender
Scans installed OpenClaw skills for malicious patterns including prompt injection, credential theft, data exfiltration, obfuscated payloads, and backdoors. Use when installing new skills, after skill updates, or for periodic security scans. Runs deterministic pattern matching — fast, offline, no API cost.
Skill Defender — Malicious Pattern Scanner
When to Run
Automatic Triggers
- New skill installed: immediately run scan_skill.py against it before allowing use
- Skill updated: re-scan after any file changes in a skill directory
- Periodic audit: run a batch scan on all installed skills when requested
Manual Triggers
- User says "scan skill X" → scan that specific skill
- User says "scan all skills" → batch scan all skills
- User says "security check" or "audit skills" → batch scan all skills
Scripts
scripts/scan_skill.py — Single Skill Scanner
Scans one skill directory for malicious patterns. Produces JSON or human-readable output.
scripts/aggregate_scan.py — Batch Scanner
Scans ALL installed skills and produces a single JSON report. Includes a built-in allowlist to reduce false positives from security-related skills, API skills, and other known-safe patterns.
How to Run
# Scan a single skill (human-readable)
python3 scripts/scan_skill.py /path/to/skill-dir
# Scan a single skill (JSON output)
python3 scripts/scan_skill.py /path/to/skill-dir --json
# Scan ALL installed skills (JSON aggregate report)
python3 scripts/aggregate_scan.py
# With custom skills directory
python3 scripts/aggregate_scan.py --skills-dir /path/to/skills
# With verbose warnings
python3 scripts/scan_skill.py /path/to/skill-dir --verbose
# Exclude false positives
python3 scripts/scan_skill.py /path/to/skill-dir --exclude "pattern1" "pattern2"
Exit Codes (scan_skill.py)
- 0 = clean or informational only
- 1 = suspicious (medium/high findings)
- 2 = dangerous (critical findings)
- 3 = error
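A calling script can branch on these exit codes without parsing any output. The sketch below is one way to do that, assuming the scanner lives at scripts/scan_skill.py relative to the working directory; the helper names `verdict_from_exit_code` and `scan` are hypothetical and are not part of the skill.

```python
import subprocess
import sys
from pathlib import Path

# Exit codes as documented for scan_skill.py.
EXIT_VERDICTS = {
    0: "clean",
    1: "suspicious",
    2: "dangerous",
    3: "error",
}


def verdict_from_exit_code(code: int) -> str:
    """Translate a scan_skill.py exit code into a verdict string."""
    # Assumption: any undocumented exit code is treated as an error, never as clean.
    return EXIT_VERDICTS.get(code, "error")


def scan(skill_dir: str, scanner: str = "scripts/scan_skill.py") -> str:
    """Run the scanner on one skill directory and return its verdict."""
    # Check the script exists first: the Python interpreter itself exits with
    # code 2 when it cannot open a script, which would be misread as "dangerous".
    if not Path(scanner).is_file():
        return "error"
    result = subprocess.run(
        [sys.executable, scanner, skill_dir, "--json"],
        capture_output=True,
        text=True,
    )
    return verdict_from_exit_code(result.returncode)
```

Mapping unknown codes to "error" keeps the wrapper conservative: a crash or a missing script never looks like a pass.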
Output Format (aggregate_scan.py)
{
  "skills": [
    {
      "name": "skill-name",
      "verdict": "clean|suspicious|dangerous|error",
      "findingsCount": 0,
      "findings": []
    }
  ],
  "summary": "All 37 skills passed with no significant issues.",
  "totalSkills": 37,
  "cleanCount": 37,
  "suspiciousCount": 0,
  "dangerousCount": 0,
  "errorCount": 0,
  "timestamp": "2026-02-02T06:00:00+00:00"
}
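To act on a report in this shape, it is usually enough to pull out the skills whose verdict is not "clean". The following is a minimal sketch that relies only on the fields shown above; the function name `flagged_skills` is hypothetical, and how the report text is obtained (for example by capturing the output of aggregate_scan.py) is left to the caller.

```python
import json


def flagged_skills(report_text: str) -> list:
    """Return (name, verdict, findingsCount) for every skill that is not clean."""
    report = json.loads(report_text)
    return [
        (skill["name"], skill["verdict"], skill["findingsCount"])
        for skill in report["skills"]
        if skill["verdict"] != "clean"
    ]
```

An empty list means every skill in the report came back clean.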
Auto-Detection
Both scripts auto-detect paths:
- Skills directory: detected from the script location (walks up to find a skills/ parent), falling back to ~/clawd/skills, ~/skills, ~/.openclaw/skills
- Scanner script: aggregate_scan.py finds scan_skill.py co-located in the same directory
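The lookup described above can be sketched roughly as follows. This is an illustration of the documented behavior, not the scripts' actual code: the function name `find_skills_dir` is hypothetical, and the assumption is that "walks up" means checking each ancestor directory for the name "skills".

```python
from pathlib import Path
from typing import Optional

# Fallback locations listed in the documentation, tried in order.
FALLBACKS = ("~/clawd/skills", "~/skills", "~/.openclaw/skills")


def find_skills_dir(start: Path, fallbacks=FALLBACKS) -> Optional[Path]:
    """Locate the skills directory from a starting path, then via fallbacks."""
    start = start.resolve()
    # Walk up from the starting location looking for a directory named "skills".
    for candidate in (start, *start.parents):
        if candidate.name == "skills" and candidate.is_dir():
            return candidate
    # Nothing found above the script: try the documented fallback paths.
    for raw in fallbacks:
        path = Path(raw).expanduser()
        if path.is_dir():
            return path
    return None
```

A script would typically call it as `find_skills_dir(Path(__file__).parent)`.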
Handling Results
✅ Clean (verdict: "clean")
- No action needed — skill is safe
⚠️ Suspicious (verdict: "suspicious")
- Warn the user with a summary of findings
- Show the category and severity of each finding
🚨 Dangerous (verdict: "dangerous")
- Block the skill — do not proceed with installation or use
- Show the full detailed findings to the user
- Require explicit user override to proceed
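The three cases above amount to a small decision table. The sketch below encodes it; the `Decision` type and `decide` function are hypothetical names. Treating the "error" verdict (and anything unrecognized) as blocked is an assumption, since the handling rules above do not cover it.

```python
from typing import NamedTuple


class Decision(NamedTuple):
    allow: bool  # may the skill be installed or used?
    show: str    # what to show the user: "nothing", "summary", or "full"


def decide(verdict: str, user_override: bool = False) -> Decision:
    """Map a scan verdict to an action, following the handling rules."""
    if verdict == "clean":
        return Decision(allow=True, show="nothing")
    if verdict == "suspicious":
        # Warn with a summary (category and severity per finding), do not block.
        return Decision(allow=True, show="summary")
    if verdict == "dangerous":
        # Blocked unless the user explicitly overrides.
        return Decision(allow=user_override, show="full")
    # Assumption: "error" or an unknown verdict is blocked, not waved through.
    return Decision(allow=False, show="full")
```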
Built-in Allowlist
The aggregate scanner includes an allowlist for known false positives:
- Security scanners (skill-defender, clawdbot-security-check) — their docs/scripts contain the very patterns they detect
- Auth-dependent skills (tailscale, reddit, n8n, event-planner) — legitimately reference credential paths and API keys
- Config-aware skills (memory-setup, eightctl, summarize) — reference config paths in documentation
- Agent-writing skills (self-improving-agent) — designed to modify agent files
Pattern Reference
See references/threat-patterns.md for full documentation of all detected patterns, organized by category with explanations of why each is dangerous.
Important Notes
- No external dependencies — standard library only (Python 3.9+)
- Fast — under 1 second per skill, ~30 seconds for a full batch of 30+ skills
- Deterministic pattern matching (Layer 2 defense), not LLM-based
- False positives are possible; the allowlist and --exclude flag help
- The scanner will flag itself if scanned without the allowlist; this is expected
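For readers unfamiliar with the approach, deterministic pattern matching of this kind can be illustrated in a few lines. The rules below are invented for the example and are not the scanner's actual patterns (those are documented in references/threat-patterns.md); the finding fields and function names are likewise hypothetical. Only the severity-to-verdict mapping follows the documented exit codes.

```python
import re

# Hypothetical rules for illustration only: (category, severity, pattern).
RULES = [
    ("prompt_injection", "high",
     re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)),
    ("obfuscated_payload", "critical",
     re.compile(r"base64\s+(-d|--decode)\b.*\|\s*(ba)?sh")),
    ("credential_access", "medium",
     re.compile(r"\.ssh/id_rsa|\.aws/credentials")),
]


def scan_text(text: str) -> list:
    """Return one finding per (line, rule) match, in file order."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for category, severity, pattern in RULES:
            if pattern.search(line):
                findings.append(
                    {"line": lineno, "category": category, "severity": severity}
                )
    return findings


def verdict(findings: list) -> str:
    """Critical findings are dangerous; medium or high ones are suspicious."""
    severities = {f["severity"] for f in findings}
    if "critical" in severities:
        return "dangerous"
    if severities & {"medium", "high"}:
        return "suspicious"
    return "clean"
```

The same input always yields the same findings, and no network access is involved. It also shows why a scanner of this kind flags its own source and documentation: the rule definitions contain the strings they search for.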
