PROMPT INJECTION PROTECTION
Review
Audited by ClawScan on May 10, 2026.
Overview
This security skill appears purpose-aligned, but its default behavior may mark detected prompt-injection content as safe while also maintaining persistent self-learning state.
Review before installing or relying on this skill. It does not show clear exfiltration or destructive behavior, but its security claims are stronger than the visible implementation: use strict mode, avoid treating its non-strict output as safe, and disable or tightly control adaptive learning and auto-updates unless you understand the persistence behavior.
Findings (3)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
A user or agent may trust content as safe even though the skill detected prompt-injection patterns in it.
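With strictMode defaulting to false, detected malicious content only triggers a console warning and can still be returned with isSafe set to true, because the expression !injectionCheck.isMalicious || !strictMode evaluates to true whenever strict mode is off. This materially undercuts the advertised protection behavior.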
strictMode = false ... console.warn('Potential prompt injection detected:', injectionCheck); ... isSafe: !injectionCheck.isMalicious || !strictMode
Fail closed by default for detected injections, make strict mode the default, correct the isSafe logic, and clearly document that sanitization is limited.
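The fail-closed recommendation for this finding could look roughly like the sketch below. It is illustrative only and is not the skill's actual code: detectInjection, checkContent, and the two sample patterns are hypothetical, and only strictMode, isSafe, and isMalicious are names taken from the reviewed snippet. The point is that isSafe depends solely on the detection result, so turning strict mode off can never mark flagged content as safe.

```javascript
// Hypothetical sketch: detectInjection, checkContent, and INJECTION_PATTERNS
// are illustrative names, not taken from the skill's source.
const INJECTION_PATTERNS = [
  /ignore (all )?(previous|prior) instructions/i,
  /disregard (the )?system prompt/i,
];

function detectInjection(text) {
  const matches = INJECTION_PATTERNS.filter((re) => re.test(text)).map(String);
  return { isMalicious: matches.length > 0, matches };
}

function checkContent(text, { strictMode = true } = {}) {
  const injectionCheck = detectInjection(text);
  // isSafe depends only on detection; strictMode never flips it to true.
  const isSafe = !injectionCheck.isMalicious;
  return {
    isSafe,
    // Strict mode withholds flagged content. Non-strict mode returns it for
    // inspection, but it is still labelled unsafe.
    content: isSafe || !strictMode ? text : null,
    injectionCheck,
  };
}
```

In this sketch strict mode is the default, and the non-strict path only changes whether flagged content is returned, not how it is labelled.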
Malicious or noisy input could persistently alter the skill's future behavior, causing false positives or unreliable protection across sessions.
Untrusted content can generate learned patterns that are saved to disk and reused to change future detection behavior, without clear user approval, retention limits, or scope controls in SKILL.md.
this.threatDbPath = path.join(__dirname, 'learned-threats.json'); ... await this.saveLearnedThreats(); ... this.protection.injectionPatterns.push(...threatInfo.pattern);
Make adaptive learning opt-in, scope learned data per project, provide review and reset controls, and document what is stored and for how long.
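One way the opt-in, per-project, resettable storage recommended here might be structured is sketched below. This is an assumption-laden illustration, not the skill's implementation: LearnedThreatStore, enabled, projectDir, and maxEntries are hypothetical names, and only the learned-threats.json file name comes from the reviewed snippet.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical sketch: LearnedThreatStore, enabled, projectDir, and maxEntries
// are illustrative names, not part of the skill's source.
class LearnedThreatStore {
  constructor({ enabled = false, projectDir = process.cwd(), maxEntries = 100 } = {}) {
    this.enabled = enabled; // adaptive learning stays off unless explicitly enabled
    this.maxEntries = maxEntries;
    // Store per project rather than next to the skill's own code (__dirname).
    this.dbPath = path.join(projectDir, 'learned-threats.json');
  }

  // Review control: returns everything currently persisted.
  list() {
    if (!fs.existsSync(this.dbPath)) return [];
    return JSON.parse(fs.readFileSync(this.dbPath, 'utf8'));
  }

  // Returns true only if the pattern was actually persisted.
  add(pattern) {
    if (!this.enabled) return false;
    const entries = this.list();
    entries.push({ pattern, addedAt: new Date().toISOString() });
    // Keep only the newest maxEntries entries as a simple retention limit.
    fs.writeFileSync(this.dbPath, JSON.stringify(entries.slice(-this.maxEntries), null, 2));
    return true;
  }

  // Reset control: deletes the persisted state; a missing file is not an error.
  reset() {
    fs.rmSync(this.dbPath, { force: true });
  }
}
```

With the defaults, add() is a no-op, so untrusted input cannot change future detection behavior unless the user has opted in.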
The skill may keep a runtime active and periodically mutate its detection patterns without an explicit per-use prompt.
Instantiating the protection class starts a recurring auto-update mechanism by default. The provided code shows simulated local updates rather than network downloads or command execution, but it is still background behavior users should know about.
// Start auto-updates by default
this.autoUpdateSystem.startAutoUpdates();
Make auto-updates opt-in, expose a clear stop control, and disclose the background timer in SKILL.md and metadata.
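A minimal sketch of opt-in auto-updates with an explicit stop control follows. It assumes a Node.js runtime and is not the skill's code: AutoUpdater, autoUpdate, intervalMs, isRunning, and stopAutoUpdates are hypothetical names, while startAutoUpdates mirrors the method name in the reviewed snippet.

```javascript
// Hypothetical sketch: AutoUpdater, autoUpdate, intervalMs, isRunning, and
// stopAutoUpdates are illustrative names, not part of the skill's source.
class AutoUpdater {
  constructor(updateFn, { autoUpdate = false, intervalMs = 60 * 60 * 1000 } = {}) {
    this.updateFn = updateFn;
    this.intervalMs = intervalMs;
    this.timer = null;
    // The background timer starts only when explicitly requested.
    if (autoUpdate) this.startAutoUpdates();
  }

  get isRunning() {
    return this.timer !== null;
  }

  startAutoUpdates() {
    if (this.timer) return; // already running; avoid stacking timers
    this.timer = setInterval(() => this.updateFn(), this.intervalMs);
    // unref() lets the Node.js process exit even while the timer is active.
    this.timer.unref();
  }

  stopAutoUpdates() {
    if (!this.timer) return;
    clearInterval(this.timer);
    this.timer = null;
  }
}
```

Constructing the class with no options leaves no timer running, and callers can check isRunning or call stopAutoUpdates() at any point.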
