v1.0.1

Hallucination Guard

Benign

ClawScan verdict for this skill. Analyzed May 1, 2026, 8:22 AM.

Analysis

This instruction-only guardrail is coherent and transparent, with the main caution that it can be configured to run local and network verification checks automatically.

Guidance: This appears safe to install as an instruction-only verification aid if you are comfortable with the agent running validation commands. Prefer manual invocation or narrow scopes for sensitive projects, and review before allowing checks on private paths, internal URLs, or package-based tools.

Findings (2)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Abnormal behavior control

Checks for instructions or behavior that redirect the agent, misuse tools, execute unexpected code, cascade across systems, exploit user trust, or continue outside the intended task.

Agent Goal Hijack
Severity: Low | Confidence: High | Status: Note
SKILL.md
Add to the system prompt so the agent automatically runs this skill before completing a task: ... Before completing any task, you must run all 14 checks ... and fix any FAIL items.

This deliberately changes the agent’s stopping condition and makes the guardrail run automatically if the user adopts the prompt.

User impact: If enabled globally, the agent may spend extra time running checks and making corrections before completing unrelated tasks.
Recommendation: Use the auto-run prompt only for workflows where constant verification is wanted; otherwise invoke the skill manually or scope it to specific targets.

Tool Misuse and Exploitation
Severity: Low | Confidence: High | Status: Note
SKILL.md
stat <path>; ls -la <path>; curl -sI --max-time 5 <url>; python3 -c ...; node --check target.js; npx tsc --noEmit target.ts

The skill instructs the agent to use shell and network-facing commands to verify paths, URLs, and code syntax. This is central to its purpose but should be scoped to intended targets.

User impact: The agent may read metadata about local files, check installed commands, and make outbound HEAD requests to referenced URLs.
Recommendation: Run it against known project files and URLs, and ask for confirmation before checking sensitive paths, internal URLs, or invoking package-based tools such as npx.
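
One way to apply this scoping recommendation is a small pre-check that runs before any verification command. The Python sketch below is illustrative only and is not part of the skill or of ClawScan. The function names (`path_in_scope`, `url_in_scope`, `needs_confirmation`), the project-root convention, and the host allowlist are all hypothetical choices made for the example.

```python
from __future__ import annotations

from ipaddress import ip_address
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical policy: commands that fetch or run packages need a user "yes" first.
CONFIRM_FIRST = {"npx"}


def path_in_scope(path: str, project_root: str) -> bool:
    """Return True only if `path` resolves to a location inside `project_root`."""
    root = Path(project_root).resolve()
    # Joining with an absolute `path` discards `root`, so absolute paths
    # outside the project are rejected by the containment test below.
    target = (root / path).resolve()
    return target == root or root in target.parents


def url_in_scope(url: str, allowed_hosts: set[str]) -> bool:
    """Return True only for http(s) URLs whose host is explicitly allowlisted."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    if not host:
        return False
    try:
        ip = ip_address(host)
    except ValueError:
        ip = None  # Not an IP literal; decide on the hostname allowlist alone.
    # Literal private, loopback, and link-local addresses are always rejected,
    # even if someone added them to the allowlist.
    if ip is not None and (ip.is_private or ip.is_loopback or ip.is_link_local):
        return False
    return host in allowed_hosts


def needs_confirmation(argv: list[str]) -> bool:
    """Return True if the command's executable is on the confirm-first list."""
    return bool(argv) and Path(argv[0]).name in CONFIRM_FIRST


if __name__ == "__main__":
    print(path_in_scope("src/app.py", "/srv/project"))
    print(url_in_scope("https://example.com/docs", {"example.com"}))
    print(needs_confirmation(["npx", "tsc", "--noEmit", "target.ts"]))
```

The allowlist check is deliberately strict: hostnames that resolve to internal addresses are not detected here, so an allowlist of known public hosts is assumed rather than a blocklist of internal ones.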