Skill Defender

Scans installed OpenClaw skills for malicious patterns including prompt injection, credential theft, data exfiltration, obfuscated payloads, and backdoors. Use when installing new skills, after skill updates, or for periodic security scans. Runs deterministic pattern matching — fast, offline, no API cost.

MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description (malicious-pattern scanner) align with the included artifacts: SKILL.md, a threat-patterns reference, and two Python scanner scripts. No unrelated env vars, binaries, or install steps are requested. The scripts' behavior (walking skill dirs, regex-based detections, aggregating results) is coherent with a scanner.
Instruction Scope
SKILL.md instructs scanning single skills or all installed skills and documents auto-detection of the skills directory (walking up from the script location, then falling back to ~/clawd/skills, ~/skills, and ~/.openclaw/skills). It also documents allowlisting and output handling. SKILL.md contains explicit prompt-injection examples (e.g., "ignore previous instructions", "you are now"), which triggered pre-scan flags; this is expected because the scanner documents the patterns it detects. The scanner reads skill files, as its purpose requires. These files can contain secrets, so confirm you are comfortable with a local tool reading them.
Install Mechanism
Instruction-only with bundled Python scripts; no install spec, no downloads, no external packages required (scripts state standard library only). No evidence of downloading/executing remote payloads in the provided code.
Credentials
The skill declares no required environment variables, no primary credential, and no config paths. The code includes regexes that look for credential paths in scanned skills (expected) but the scanner itself does not request or require secrets.
Persistence & Privilege
The skill does not set always:true, and no automatic modification of agent configuration is described. The tool scans files and produces reports; it includes an allowlist stored in the script, which is normal. No code shown writes to core agent files or modifies other skills' configurations.
Scan Findings in Context
[ignore-previous-instructions] expected: SKILL.md and references intentionally include 'ignore previous instructions' as an example pattern the scanner detects; pre-scan detection of this phrase is expected and appropriate.
[you-are-now] expected: SKILL.md documents 'you are now' as a prompt-injection pattern; detection of this phrase in the docs is expected and consistent with the skill's purpose.
Assessment
This skill appears internally consistent: it is an offline, deterministic pattern scanner implemented in Python that reads skill directories and reports findings. Before installing or running it, consider these points:

  1. Source provenance: the skill's owner and homepage are unknown; prefer code from a trusted publisher or review the code yourself.
  2. Local file access: the scanner reads every file in your skills directory, which is necessary for scanning. If your skills contain secrets, consider auditing those files separately or running the scanner in a controlled environment.
  3. Allowlist and false positives: the tool includes a built-in allowlist that can suppress findings; review it to ensure it is not silencing legitimate issues.
  4. Network access: no network I/O is visible in the provided code, but review the full scripts before running them. If you cannot inspect the code, run it in a sandbox or VM and verify its behavior (stdout, exit codes) on a non-production copy of your skills directory.

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.0
latest: vk970fbfyrjsh3m7w294w3dc5h980c93y

SKILL.md

Skill Defender — Malicious Pattern Scanner

When to Run

Automatic Triggers

  1. New skill installed — Immediately run scan_skill.py against it before allowing use
  2. Skill updated — Re-scan after any file changes in a skill directory
  3. Periodic audit — Run batch scan on all installed skills when requested

Manual Triggers

  • User says "scan skill X" → scan that specific skill
  • User says "scan all skills" → batch scan all skills
  • User says "security check" or "audit skills" → batch scan all skills

Scripts

scripts/scan_skill.py — Single Skill Scanner

Scans one skill directory for malicious patterns. Produces JSON or human-readable output.
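
To make "deterministic pattern matching" concrete, here is a minimal sketch of how a regex-based scan of one skill directory could work. It is not the actual scan_skill.py. The two phrases come from the prompt-injection examples mentioned on this page; the pattern table layout, category names, severities, and function names are assumptions made for illustration.

```python
import re
from pathlib import Path

# Hypothetical pattern table: (category, severity, compiled regex).
# Only the two phrases are taken from this page; everything else is illustrative.
PATTERNS = [
    ("prompt-injection", "high",
     re.compile(r"ignore\s+previous\s+instructions", re.IGNORECASE)),
    ("prompt-injection", "medium",
     re.compile(r"you\s+are\s+now\b", re.IGNORECASE)),
]


def scan_text(text, source="<string>"):
    """Return one finding per (line, pattern) match, with 1-based line numbers."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for category, severity, regex in PATTERNS:
            if regex.search(line):
                findings.append({
                    "file": source,
                    "line": lineno,
                    "category": category,
                    "severity": severity,
                })
    return findings


def scan_dir(skill_dir):
    """Scan every regular file under skill_dir and collect all findings."""
    findings = []
    for path in sorted(Path(skill_dir).rglob("*")):
        if path.is_file():
            # Undecodable bytes are replaced rather than aborting the scan.
            text = path.read_text(encoding="utf-8", errors="replace")
            findings.extend(scan_text(text, str(path)))
    return findings
```

Because the same input always yields the same findings, a scan like this needs no network access or model call, which matches the "fast, offline" description.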

scripts/aggregate_scan.py — Batch Scanner

Scans ALL installed skills and produces a single JSON report. Includes a built-in allowlist to reduce false positives from security-related skills, API skills, and other known-safe patterns.

How to Run

# Scan a single skill (human-readable)
python3 scripts/scan_skill.py /path/to/skill-dir

# Scan a single skill (JSON output)
python3 scripts/scan_skill.py /path/to/skill-dir --json

# Scan ALL installed skills (JSON aggregate report)
python3 scripts/aggregate_scan.py

# With custom skills directory
python3 scripts/aggregate_scan.py --skills-dir /path/to/skills

# With verbose warnings
python3 scripts/scan_skill.py /path/to/skill-dir --verbose

# Exclude false positives
python3 scripts/scan_skill.py /path/to/skill-dir --exclude "pattern1" "pattern2"

Exit Codes (scan_skill.py)

  • 0 = clean or informational only
  • 1 = suspicious (medium/high findings)
  • 2 = dangerous (critical findings)
  • 3 = error
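
A caller can turn these exit codes into verdict strings. The sketch below assumes the script path and the --json flag shown under "How to Run"; the helper names and the choice to treat any unlisted exit code as an error are assumptions of this example, not documented behavior.

```python
import subprocess
import sys

# Mapping taken from the exit-code list above.
EXIT_VERDICTS = {0: "clean", 1: "suspicious", 2: "dangerous", 3: "error"}


def verdict_from_exit_code(code):
    """Translate a scan_skill.py exit code into a verdict string."""
    # Assumption: any code outside 0-3 (e.g. a crash) is treated as an error.
    return EXIT_VERDICTS.get(code, "error")


def run_scan(skill_dir, scanner="scripts/scan_skill.py"):
    """Run the scanner on one skill and return (verdict, raw stdout)."""
    proc = subprocess.run(
        [sys.executable, scanner, skill_dir, "--json"],
        capture_output=True,
        text=True,
    )
    return verdict_from_exit_code(proc.returncode), proc.stdout
```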

Output Format (aggregate_scan.py)

{
  "skills": [
    {
      "name": "skill-name",
      "verdict": "clean|suspicious|dangerous|error",
      "findingsCount": 0,
      "findings": []
    }
  ],
  "summary": "All 37 skills passed with no significant issues.",
  "totalSkills": 37,
  "cleanCount": 37,
  "suspiciousCount": 0,
  "dangerousCount": 0,
  "errorCount": 0,
  "timestamp": "2026-02-02T06:00:00+00:00"
}
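
A report in this shape can be consumed with the standard json module. The following sketch lists every skill whose verdict is not "clean" and checks that the four count fields add up to totalSkills. The function name and the consistency check are illustrative additions; only the field names come from the format above.

```python
import json

COUNT_KEYS = ("cleanCount", "suspiciousCount", "dangerousCount", "errorCount")


def summarize_report(report_json):
    """Return (flagged skills, whether the count fields sum to totalSkills)."""
    report = json.loads(report_json)
    flagged = [
        (skill["name"], skill["verdict"], skill["findingsCount"])
        for skill in report["skills"]
        if skill["verdict"] != "clean"
    ]
    counts_consistent = (
        sum(report[key] for key in COUNT_KEYS) == report["totalSkills"]
    )
    return flagged, counts_consistent
```

One way to use it is to pipe the output of aggregate_scan.py into a small wrapper that calls summarize_report and stops when the flagged list is non-empty.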

Auto-Detection

Both scripts auto-detect paths:

  • Skills directory: Detected from script location (walks up to find skills/ parent), falls back to ~/clawd/skills, ~/skills, ~/.openclaw/skills
  • Scanner script: aggregate_scan.py finds scan_skill.py co-located in the same directory
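
The directory lookup described above could be sketched as follows. This is one reading of "walks up to find skills/ parent": return the nearest ancestor directory named skills, then try the documented fallbacks in order. The function name and the None return value when nothing is found are assumptions; the real scripts may behave differently.

```python
from pathlib import Path

# Fallback locations as listed above, tried in order.
FALLBACKS = ("~/clawd/skills", "~/skills", "~/.openclaw/skills")


def find_skills_dir(script_path, fallbacks=FALLBACKS):
    """Locate the skills directory, or return None if no candidate exists."""
    # Step 1: walk up from the script and return the nearest ancestor
    # directory that is literally named "skills".
    for parent in Path(script_path).resolve().parents:
        if parent.name == "skills":
            return parent
    # Step 2: fall back to the first well-known location that exists.
    for candidate in fallbacks:
        path = Path(candidate).expanduser()
        if path.is_dir():
            return path
    return None
```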

Handling Results

✅ Clean (verdict: "clean")

  • No action needed — skill is safe

⚠️ Suspicious (verdict: "suspicious")

  • Warn the user with a summary of findings
  • Show the category and severity of each finding

🚨 Dangerous (verdict: "dangerous")

  • Block the skill — do not proceed with installation or use
  • Show the full detailed findings to the user
  • Require explicit user override to proceed
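
The three cases above amount to a small decision function. In this sketch the action names ("allow", "warn", "block") are hypothetical labels, and blocking on an "error" verdict is an assumption, since the guidance above does not cover that case.

```python
def decide(verdict, user_override=False):
    """Map a scan verdict to an action, following the handling rules above."""
    if verdict == "clean":
        return "allow"
    if verdict == "suspicious":
        # Proceed, but surface the category and severity of each finding.
        return "warn"
    if verdict == "dangerous":
        # Blocked unless the user has explicitly overridden the decision.
        return "allow" if user_override else "block"
    # Assumption: "error" or any unknown verdict fails closed.
    return "block"
```

Failing closed on unknown verdicts means a scanner crash cannot silently let a skill through.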

Built-in Allowlist

The aggregate scanner includes an allowlist for known false positives:

  • Security scanners (skill-defender, clawdbot-security-check) — their docs/scripts contain the very patterns they detect
  • Auth-dependent skills (tailscale, reddit, n8n, event-planner) — legitimately reference credential paths and API keys
  • Config-aware skills (memory-setup, eightctl, summarize) — reference config paths in documentation
  • Agent-writing skills (self-improving-agent) — designed to modify agent files
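
How the allowlist is stored inside aggregate_scan.py is not shown here. As a rough illustration only, one plausible design keys the allowlist by skill name and suppresses findings by category. The category names and data layout below are hypothetical; the skill names are taken from the list above.

```python
# Hypothetical layout: skill name -> finding categories to suppress.
ALLOWLIST = {
    "skill-defender": {"prompt-injection"},
    "tailscale": {"credential-access"},
    "self-improving-agent": {"agent-file-write"},
}


def apply_allowlist(skill_name, findings, allowlist=ALLOWLIST):
    """Drop findings whose category is allowlisted for this skill."""
    allowed = allowlist.get(skill_name, set())
    return [f for f in findings if f["category"] not in allowed]
```

Scoping each entry to a skill and a category, rather than skipping a whole skill, keeps unrelated findings visible for allowlisted skills. Whatever the real structure is, review it before relying on batch results.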

Pattern Reference

See references/threat-patterns.md for full documentation of all detected patterns, organized by category with explanations of why each is dangerous.

Important Notes

  • No external dependencies — standard library only (Python 3.9+)
  • Fast — under 1 second per skill, ~30 seconds for a full batch of 30+ skills
  • This is deterministic pattern matching (Layer 2 defense). Not LLM-based.
  • False positives are possible — the allowlist and --exclude flag help
  • The scanner will flag itself if scanned without the allowlist — this is expected
