Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Prompt Injection Defense

v1.0.0

Harden agent sessions against prompt injection from untrusted content. Use when the agent reads web search results, emails, downloaded files, PDFs, or any ex...

0· 51·0 current·0 all-time
MIT-0
Download zip
LicenseMIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotalVirusTotal
Benign
View report →
OpenClawOpenClaw
Benign
high confidence
Purpose & Capability
Name/description match the provided assets: SKILL.md documents tagging, scanning, memory guardrails and canaries; scripts implement scanning (scan-content.py), safe memory writes (safe-memory-write.sh), and tagging (tag-untrusted.sh). No unrelated credentials, binaries, or install steps are requested.
Instruction Scope
Runtime instructions are focused on scanning/tagging/quarantine. tag-untrusted.sh runs an arbitrary command and echoes its output wrapped in tags — this is expected for capturing tool output, but be careful: do not pass untrusted user-supplied strings as executable commands (that would execute them). The SKILL.md itself contains the injection phrases the scanner looks for (hence pre-scan hits); this is expected because the doc teaches detection rules.
Install Mechanism
Instruction-only with small local scripts; no download/install mechanism, package managers, or network fetches embedded in the install. Low installation risk.
Credentials
The skill requests no credentials or required env vars. Scripts write to a workspace path (OPENCLAW_WORKSPACE or default $HOME/.openclaw/workspace) and create memory/quarantine files there — this is consistent with purpose but means the skill will create persistent files on the user's filesystem and may store sanitized or quarantined copies of untrusted content (which could include secrets if such content contained them).
Persistence & Privilege
always:false (not force-installed) and user-invocable:true. The skill writes its own memory/quarantine files (expected). It does not modify other skills or request elevated system privileges.
Scan Findings in Context
[ignore-previous-instructions] expected: The SKILL.md intentionally documents that phrase as a canary pattern; pre-scan flagged it because the skill is teaching detection of that exact injection vector.
[system-prompt-override] expected: SKILL.md and references include examples like 'SYSTEM PROMPT' and 'system:' as high-confidence triggers; detection here is expected and benign.
Assessment
This skill appears to do what it says: tag untrusted outputs, scan them for prompt-injection patterns, and quarantine or accept content before writing to memory. Before installing, consider: (1) set OPENCLAW_WORKSPACE explicitly if you don't want files in your home directory; review filesystem permissions on that workspace. (2) Do not allow the agent to construct shell commands from untrusted input and then pass them to tag-untrusted.sh (that script will execute whatever command you give it). (3) Regularly review the quarantine directory for false positives and for any sensitive data captured there. (4) Treat the scanner as a defense-in-depth tool — it can miss sophisticated attacks; combine with read-only API permissions and human review for risky actions. If you want higher assurance, audit the scripts locally and run them in a sandboxed environment first.
!
references/canary-patterns.md:9
Prompt-injection style instruction pattern detected.
!
SKILL.md:33
Prompt-injection style instruction pattern detected.
About static analysis
These patterns were detected by automated regex scanning. They may be normal for skills that integrate with external APIs. Check the VirusTotal and OpenClaw results above for context-aware analysis.

Like a lobster shell, security has layers — review code before you run it.

latestvk970h18f52bjfpz32xbeateyvn83raps

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

Comments