Hallucination Guard
Analysis
This instruction-only guardrail is coherent and transparent. The main caution is that it can be configured to run local and network verification checks automatically.
Findings (2)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Checks for instructions or behavior that redirect the agent, misuse tools, execute unexpected code, cascade across systems, exploit user trust, or continue outside the intended task.
Add to the system prompt so the agent automatically runs this skill before completing a task: ... Before completing any task, you must run all 14 checks ... and fix any FAIL items.
This deliberately changes the agent’s stopping condition and makes the guardrail run automatically if the user adopts the prompt.
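To make the changed stopping condition concrete, here is a minimal sketch of a completion gate in Python. It is an illustration under assumptions, not the skill's implementation: the skill is instruction-only, and the names CheckResult and may_complete, along with the check names in the usage note, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CheckResult:
    """Outcome of one verification check (hypothetical structure)."""
    name: str
    passed: bool


def may_complete(checks: list[Callable[[], CheckResult]]) -> tuple[bool, list[str]]:
    """Run every check and report whether the task is allowed to finish.

    Returns (True, []) when all checks pass; otherwise (False, names of
    the failed checks), which the agent would need to fix and re-run.
    """
    results = [check() for check in checks]
    failed = [r.name for r in results if not r.passed]
    return (not failed, failed)
```

Called with two stand-in checks, for example `may_complete([lambda: CheckResult("paths_exist", True), lambda: CheckResult("urls_resolve", False)])`, the gate refuses completion and names the failing check. The point of the sketch is that the task cannot end until the failure list is empty, which is the behavior change the finding describes.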
stat <path>; ls -la <path>; curl -sI --max-time 5 <url>; python3 -c ...; node --check target.js; npx tsc --noEmit target.ts
The skill instructs the agent to use shell and network-facing commands to verify paths, URLs, and code syntax. This is central to its purpose, but the commands should be limited to the paths and URLs that belong to the task at hand.
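One way to scope such checks is to test each target against an allowlist before any command runs. The Python sketch below is an assumption-laden illustration, not part of the skill: ALLOWED_ROOT, ALLOWED_HOSTS, path_in_scope, and url_in_scope are hypothetical names, and the root directory and hosts are placeholder values. It requires Python 3.9 or later.

```python
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical scope; neither value comes from the skill itself.
ALLOWED_ROOT = Path("/workspace/project")
ALLOWED_HOSTS = {"example.com", "docs.example.com"}


def path_in_scope(candidate: str, root: Path = ALLOWED_ROOT) -> bool:
    """Return True only if candidate resolves to a location under root.

    Resolving first collapses ".." segments and symlinks, so a path
    cannot escape the root by traversal. Absolute candidates replace
    the root when joined and are therefore rejected unless they
    already sit under it.
    """
    resolved = (root / candidate).resolve()
    return resolved.is_relative_to(root.resolve())


def url_in_scope(url: str, hosts: set[str] = ALLOWED_HOSTS) -> bool:
    """Return True only for http(s) URLs whose host is on the allowlist."""
    parsed = urlparse(url)
    return parsed.scheme in {"http", "https"} and parsed.hostname in hosts
```

An agent following the skill could call path_in_scope before stat or ls, and url_in_scope before curl, skipping any target that fails. Exact-match host checks are deliberately strict here; a real deployment might prefer suffix matching or a per-task allowlist.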
