PROMPT INJECTION PROTECTION

ReviewAudited by ClawScan on May 10, 2026.

Overview

This security skill appears purpose-aligned, but its default behavior may mark detected prompt-injection content as safe while also maintaining persistent self-learning state.

Review before installing or relying on this skill. It does not show clear exfiltration or destructive behavior, but its security claims are stronger than the visible implementation: use strict mode, avoid treating its non-strict output as safe, and disable or tightly control adaptive learning and auto-updates unless you understand the persistence behavior.

Findings (3)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

ConcernHigh Confidence

ASI09: Human-Agent Trust Exploitation

What this means

A user or agent may trust content as safe even though the skill detected prompt-injection patterns in it.

Why it was flagged

With the default strictMode disabled, detected malicious content is only warned about and can still be returned with isSafe set to true. This materially undercuts the advertised protection behavior.

Skill content

strictMode = false ... console.warn('Potential prompt injection detected:', injectionCheck); ... isSafe: !injectionCheck.isMalicious || !strictMode

Recommendation

Fail closed by default for detected injections, make strict mode the default, correct the isSafe logic, and clearly document that sanitization is limited.

ConcernHigh Confidence

ASI06: Memory and Context Poisoning

What this means

Malicious or noisy input could persistently alter the skill's future behavior, causing false positives or unreliable protection across sessions.

Why it was flagged

Untrusted content can generate learned patterns that are saved to disk and reused to change future detection behavior, without clear user approval, retention limits, or scope controls in SKILL.md.

Skill content

this.threatDbPath = path.join(__dirname, 'learned-threats.json'); ... await this.saveLearnedThreats(); ... this.protection.injectionPatterns.push(...threatInfo.pattern);

Recommendation

Make adaptive learning opt-in, scope learned data per project, provide review and reset controls, and document what is stored and for how long.

NoteHigh Confidence

ASI10: Rogue Agents

What this means

The skill may keep a runtime active and periodically mutate its detection patterns without an explicit per-use prompt.

Why it was flagged

Instantiating the protection class starts a recurring auto-update mechanism by default. The provided code shows simulated local updates rather than network downloads or command execution, but it is still background behavior users should know about.

Skill content

// Start auto-updates by default
this.autoUpdateSystem.startAutoUpdates();

Recommendation

Make auto-updates opt-in, expose a clear stop control, and disclose the background timer in SKILL.md and metadata.