Prompt Injection Guard
Pass. Audited by ClawScan on May 1, 2026.
Overview
This instruction-only security guard is broadly aligned with prompt-injection defense. Three behaviors are worth noting: it blocks some prompts automatically, it recommends logging part of the user input, and its registry and package metadata show a small provenance mismatch.
Install this if you want an automatic prompt-injection guard, but expect it to block or question some prompts. Before relying on it, confirm the publisher metadata and decide how any recommended security logs will be stored and redacted.
Findings (3)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Legitimate security, crypto, or moderation discussions could be blocked or require confirmation if they match the guard's patterns.
The skill intentionally changes stopping conditions by instructing the agent to block high-risk prompts. This matches the defensive purpose, but it can affect whether user requests are allowed to proceed.
Level 3 - Block ... Response: "🚫 Blocked: cannot be executed for security reasons" ... Continue: not allowed
Use this where automatic prompt-injection guarding is desired, and review or tune the rules if false positives would disrupt normal work.
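To make the false-positive risk concrete, here is a minimal sketch of a pattern-based guard. The rule lists, the `Decision` names, and the `classify` function are hypothetical illustrations; they are not taken from SKILL.md, which only shows a Level 3 block response in the excerpt above.

```python
import re
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    CONFIRM = "confirm"  # ask the user before proceeding
    BLOCK = "block"      # stop; continuation is not allowed


# Hypothetical rules for illustration only; the skill's real patterns may differ.
BLOCK_PATTERNS = [re.compile(r"(?i)ignore (all )?previous instructions")]
CONFIRM_PATTERNS = [re.compile(r"(?i)system prompt")]


def classify(prompt: str) -> Decision:
    """Return the strictest decision whose pattern appears in the prompt."""
    if any(p.search(prompt) for p in BLOCK_PATTERNS):
        return Decision.BLOCK
    if any(p.search(prompt) for p in CONFIRM_PATTERNS):
        return Decision.CONFIRM
    return Decision.ALLOW
```

Under these assumed rules, a prompt such as "Write an article explaining why 'ignore previous instructions' attacks work" is blocked even though it is a legitimate security discussion. That is the kind of case to look for when reviewing or tuning the rules.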
The mismatch may make it harder to confirm who published or packaged the skill.
The registry metadata lists owner ID "kn72e4q61j807j4a90azhayc0d80wa2j", which differs from the ownerId in the package metadata quoted below. Because the skill is instruction-only with no code or install steps, this is a provenance note rather than evidence of malicious behavior.
"ownerId": "kn70ts53f8e9rzfwp5t12d6ta180q6f3"
Confirm the publisher/version in ClawHub before relying on it, and ask the maintainer to align registry and package metadata.
If implemented, fragments of user prompts could be retained in logs.
The skill recommends logging detected attack patterns, response outcomes, timestamps, and part of the input. It says to exclude sensitive parts, but does not define retention, storage, or access controls.
Items to log: - Detected attack pattern - Part of the input (excluding sensitive parts) - Response outcome - Timestamp
Define where logs are stored, how long they are kept, who can see them, and ensure secrets or personal data are redacted.
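One way to act on this recommendation is to redact and truncate the input before it reaches any log. The sketch below builds a record with the four items the skill lists. The redaction patterns, the 80-character cap, and all names (`redact`, `GuardLogEntry`, `make_log_entry`) are assumptions made for illustration and are not defined by the skill.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Hypothetical redaction rules; a real deployment would tune and extend these.
SECRET_PATTERNS = [
    re.compile(r"(?i)\b(api[_-]?key|token|password)\s*[:=]\s*\S+"),
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),  # email addresses
]
MAX_EXCERPT_CHARS = 80  # assumed cap on how much input is retained


def redact(text: str) -> str:
    """Replace anything matching a secret pattern with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


@dataclass
class GuardLogEntry:
    pattern: str        # detected attack pattern
    input_excerpt: str  # part of the input, sensitive parts removed
    outcome: str        # response outcome, e.g. "blocked"
    timestamp: str      # ISO 8601, UTC


def make_log_entry(pattern: str, user_input: str, outcome: str,
                   now: Optional[datetime] = None) -> GuardLogEntry:
    now = now or datetime.now(timezone.utc)
    # Redact first, then truncate, so a secret is never cut in half and kept.
    excerpt = redact(user_input)[:MAX_EXCERPT_CHARS]
    return GuardLogEntry(pattern, excerpt, outcome, now.isoformat())
```

Pattern-based redaction will miss secrets it has no rule for, so storage location, retention period, and access controls still need to be decided separately.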
