Anti-Injection-Skill

Security checks across malware telemetry and agentic risk

Overview

Prompt-injection indicators were detected in the submitted artifacts (ignore-previous-instructions, you-are-now, system-prompt-override); human review is required before treating this skill as clean.

This looks acceptable for a defensive prompt-injection skill, but install it only if you want it to act as a top-priority security gatekeeper. Review thresholds to reduce false positives, prefer local semantic analysis for private data, protect or redact audit logs, verify any install scripts before running them, and do not rely on the advertised detection rates without your own testing. ClawScan detected prompt-injection indicators (ignore-previous-instructions, you-are-now, system-prompt-override), so this skill requires review even though the model response was benign.

VirusTotal

66/66 vendors flagged this skill as clean.

View on VirusTotal

Risk analysis

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

#
ASI01: Agent Goal Hijack
Low
What this means

Legitimate requests may be blocked or delayed if the detector misclassifies them.

Why it was flagged

The skill intentionally makes itself a high-priority gatekeeper for all agent activity. That matches its defensive purpose, but it can override normal task flow if thresholds or patterns are too aggressive.

Skill content
**⚠️ ALWAYS RUN BEFORE ANY OTHER LOGIC** ... EVERY user input ... EVERY tool output ... BEFORE any plan formulation ... BEFORE any tool execution
Recommendation

Enable it deliberately, tune thresholds for your use case, and keep a clear recovery or bypass process for false positives.

#
ASI06: Memory and Context Poisoning
Low
What this means

Security logs or score state could preserve sensitive tool metadata or cause stricter behavior after repeated suspicious-looking inputs.

Why it was flagged

The skill maintains security score state and writes audit logs. This is normal for monitoring, but the artifacts do not specify log retention, redaction, or reset boundaries.

Skill content
Warning | Increased scrutiny, log all tool calls ... <40 | 🔒 LOCKDOWN ... Log to AUDIT.md + Alert if needed
Recommendation

Store audit logs with appropriate permissions, avoid logging secrets, define retention, and provide an administrator-controlled reset path.

#
ASI07: Insecure Inter-Agent Communication
Medium
What this means

If API mode is enabled, user messages or tool-output text sent for analysis may leave the local environment.

Why it was flagged

API semantic-analysis mode may send analyzed inputs to an external model provider. The option is disclosed and local mode is recommended, but users should understand the data boundary.

Skill content
Uses Claude/OpenAI API for embeddings.  
**Cost:** ~$0.0001 per query
Recommendation

Use local semantic mode for private data, or review provider privacy terms and limit what content is sent in API mode.

#
ASI04: Agentic Supply Chain Vulnerabilities
Low
What this means

Installing packages this way could change the host Python environment or pull unexpected dependency versions.

Why it was flagged

The optional setup installs unpinned packages and uses a flag that can modify the system Python environment. This is user-directed and purpose-aligned for local semantic analysis, but it carries normal package-supply-chain and environment-stability risk.

Skill content
pip install sentence-transformers numpy --break-system-packages
Recommendation

Prefer a virtual environment or container, pin dependency versions, and inspect any included scripts before running them.

#
ASI09: Human-Agent Trust Exploitation
Info
What this means

Users may overestimate the protection level if they treat marketing claims as a guarantee.

Why it was flagged

The artifacts make strong security-effectiveness claims. They may be true, but the provided excerpts do not substantiate them with test data or reproducible benchmarks.

Skill content
Production-ready ... Blocks ~98% of attacks. <2% false positives. 50ms overhead.
Recommendation

Test the skill against your own threat model and keep other security controls in place.