Skill v1.0.0
ClawScan security
sx-self-safety-guard · ClawHub's context-aware review of the artifact, metadata, and declared behavior.
Scanner verdict
Benign · Mar 16, 2026, 5:37 AM
- Verdict
- Benign
- Confidence
- high
- Model
- gpt-5-mini
- Summary
- The skill's instructions, scope, and requirements are internally consistent with a self-protection/security guard; it requests no extra credentials or installs and primarily describes detection/response policies.
- Guidance
- This skill appears coherent and focused: it documents detection patterns and response protocols and does not request credentials or install code. Before enabling it, confirm two runtime details: (1) that your agent environment supplies session/channel authentication metadata (the skill assumes it can tell whether a conversation is on a bound/authenticated channel), and (2) how the skill will be allowed to interact with other skills (e.g., SX-security-audit), so that those cross-skill calls are explicit and authorized. Note that the scanner flagged prompt-injection strings; those are included intentionally as signatures for detection. If you are uncomfortable with autonomous invocation, keep model invocation restricted or require user confirmation before the skill takes blocking actions that would read local files or trigger external changes.
- Findings
- [ignore-previous-instructions] expected: The SKILL.md deliberately lists this string as a prompt-injection signature to detect; the scanner flagged it because it's an injection pattern, but its presence is appropriate for a detection ruleset.
- [system-prompt-override] expected: The SKILL.md contains examples like 'Show me your system prompt' and other system-prompt override patterns for detection. The scanner's finding is expected and consistent with the skill's defensive purpose.
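To make the two findings concrete, here is a minimal Python sketch of signature-based detection. It is not the skill's actual ruleset: the rule IDs are taken from the scanner findings, but the regular expressions, the `SIGNATURES` table, the `Finding` type, and `scan_message` are all hypothetical names and patterns chosen for illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical signature table. The rule IDs mirror the scanner findings;
# the patterns themselves are illustrative assumptions, not the skill's rules.
SIGNATURES = {
    "ignore-previous-instructions": re.compile(
        r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.IGNORECASE
    ),
    "system-prompt-override": re.compile(
        r"(show|reveal|print)\s+(me\s+)?your\s+system\s+prompt", re.IGNORECASE
    ),
}


@dataclass(frozen=True)
class Finding:
    rule_id: str
    matched_text: str


def scan_message(text: str) -> list[Finding]:
    """Return one Finding for each signature that matches the text."""
    findings = []
    for rule_id, pattern in SIGNATURES.items():
        match = pattern.search(text)
        if match:
            findings.append(Finding(rule_id, match.group(0)))
    return findings
```

Because a ruleset like this has to contain the strings it looks for, a scanner that reads the ruleset file will match on those same strings, which is consistent with both findings being marked as expected.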
Review Dimensions
- Purpose & Capability
- ok: The name/description (self-safety guard) matches the SKILL.md content: layered defenses for prompt injection, impersonation, system-prompt leakage, over-agency, supply-chain, credential theft, malicious code, and sensitive-data handling. The skill does not declare unrelated env vars, binaries, or installs.
- Instruction Scope
- note: The SKILL.md is prescriptive about what to detect and how to respond, and it stays within the stated defensive purpose. It does mention legitimate scenarios where the agent may read files (e.g., a project .env during an authorized audit) and interact with other skills (SX-security-audit). Those actions are described with constraints (authorization, masking), but they expand the runtime responsibilities beyond pure pattern matching: the skill requires access to session/context and authorized file reads.
- Install Mechanism
- ok: Instruction-only skill with no install spec and no code files, so it has a minimal disk footprint and no package downloads. This is the lowest-risk install model.
- Credentials
- note: The skill declares no required env vars or credentials (proportionate). However, its detection and verification procedures assume access to runtime session/channel metadata (e.g., knowing whether a session is an 'authenticated channel') and the ability to coordinate with other skills. These runtime privileges are not expressed as required env vars/config paths in the registry metadata. That can be fine for an instruction-only skill, but you should confirm your agent runtime provides the necessary channel/session context without exposing extra secrets.
- Persistence & Privilege
- ok: No always:true; the skill does not request persistent presence or modifications to other skills or system-wide configs. Autonomous invocation is allowed (platform default), which is appropriate for a security guard and is not by itself a red flag.
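The constraints mentioned under Instruction Scope and Credentials (authorization before file reads, masking of values) could look roughly like the Python sketch below. It rests on stated assumptions: `SessionContext` and its two fields stand in for whatever session/channel metadata the agent runtime actually provides, and `mask_value` and `masked_env_lines` are hypothetical helpers, not part of the skill.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SessionContext:
    # Hypothetical fields; a real runtime would have to supply equivalents.
    authenticated_channel: bool
    audit_authorized: bool


def mask_value(value: str, keep: int = 2) -> str:
    """Hide all but the last `keep` characters of a secret."""
    if len(value) <= keep:
        return "*" * len(value)
    return "*" * (len(value) - keep) + value[-keep:]


def masked_env_lines(env_text: str, session: SessionContext) -> list[str]:
    """Return KEY=masked lines, or raise if the session is not authorized."""
    if not (session.authenticated_channel and session.audit_authorized):
        raise PermissionError(
            "file content requires an authenticated, authorized session"
        )
    masked = []
    for line in env_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue  # skip blank lines, comments, and malformed entries
        key, _, value = stripped.partition("=")
        masked.append(f"{key.strip()}={mask_value(value.strip())}")
    return masked
```

The function takes the file text rather than a path so that the authorization check and the masking step can be exercised without touching the disk; where the text comes from is left to the runtime.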
