Install
openclaw skills install impostor-hunt
Hunt the impostor in a "finished" deliverable. An impostor is a result-correct artifact whose causal chain is NOT the one the user's purpose required: the test set leaked, only the happy path is wired, correlation got relabeled as cause, the config file exists but no code path reads it, the wrapper greps stdout for "OK" without checking the exit code, the agent "can refactor" backed by one cherry-picked example. The output looks right. The logic that produced "right" is the wrong logic.
AUTO-INVOKE when a turn declares completion AND the user's original purpose is recoverable from context. Completion signals include: "done", "implemented", "finished", "works now", "ready to merge", "all tests pass", "task complete", "feature shipped", an agent posting a summary with checkmarks, a PR description, a CHANGELOG entry, or a closing message that asserts the work is over. Original purpose is recoverable from: the user's first message in the thread, the issue/ticket text, the PR title and description, the commit message that opened the branch, the README's stated goal, or an explicit prompt the user pasted at the start. If both signals are present, invoke without waiting for an explicit request.
Also invoke when the user explicitly asks to audit completion truth, detect fake completion, check for hidden goal misalignment, look for mock-driven success, test-set leakage, happy-path-only delivery, correlation-as-causation, configured-but-unread settings, logs-say-success-but-return-code-unchecked, or similar patterns. Triggers in any language; the report mirrors the user's language.
DO NOT invoke for ordinary bug hunting (route to code-review), style or refactor cleanup (route to simplify), or behavior-by-execution verification (route to verify / run).
DO NOT invoke when the user is mid-implementation and has not yet declared the work done; interrupting an in-flight task with a completion audit is noise.
DO NOT invoke when the original purpose is not recoverable from context; ask for it first instead of guessing.
Hunt the impostor. Spare the honest work.
Surface-correct is not correct. A passing test is a hypothesis, not a verdict. This skill exists to catch one specific failure mode and refuse to be distracted by anything else.
The user — or the moment — wants to detect a causal impostor.
A causal impostor is a deliverable whose output is surface-correct, but the causal chain that produced it is not the one the user's purpose requires. It is not a bug. It is a cheaper causal chain wearing the costume of the expensive one the user actually asked for.
This is the primary line of the audit. It stays sharp.
Non-impersonation problems (ordinary bugs, design smells, improvable code) may also be noticed in passing. These are the secondary line. They go into a separate section, clearly labeled as not impersonation. They never enter the verdict.
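To make the definition of a causal impostor concrete, here is a minimal sketch of one pattern this skill targets: a wrapper that looks for "OK" in stdout without checking the exit code. The function names and the failing child command are hypothetical, chosen only for illustration.

```python
import subprocess
import sys

# Hypothetical child process: prints "OK" but exits with a failure code.
FAILING_CMD = [sys.executable, "-c", "import sys; print('OK'); sys.exit(1)"]


def impostor_check(cmd):
    """Surface-correct: reports success whenever stdout contains 'OK'."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return "OK" in result.stdout  # the exit code is never consulted


def honest_check(cmd):
    """Requires a zero exit code before trusting what stdout says."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0 and "OK" in result.stdout
```

On `FAILING_CMD`, `impostor_check` returns `True` and `honest_check` returns `False`. Both produce the same answer on a healthy command, which is why the cheaper chain can pass for the expensive one until a failing input is tried.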
This skill is built to auto-trigger, not to wait for permission.
If both hold (a turn declares completion and the user's original purpose is recoverable from context), fire. Do not ask permission. Do not wait for /impostor-hunt. The whole point is to catch impostors before the user has to suspect one.
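The two-condition trigger can be sketched roughly as below. This is an illustrative assumption, not the skill's actual detector: `should_auto_invoke` is a hypothetical name, the phrase list covers only the quoted completion phrases from the description, and whole-phrase matching will still misfire on sentences like "not done yet".

```python
import re

# Quoted completion phrases from the skill description (not exhaustive).
COMPLETION_SIGNALS = (
    "done", "implemented", "finished", "works now", "ready to merge",
    "all tests pass", "task complete", "feature shipped",
)

# Word boundaries keep "done" from matching inside words like "abandoned".
_SIGNAL_RE = re.compile(
    r"\b(" + "|".join(re.escape(s) for s in COMPLETION_SIGNALS) + r")\b",
    re.IGNORECASE,
)


def should_auto_invoke(turn_text, recovered_purpose):
    """Fire only when the turn declares completion AND a purpose was recovered."""
    declares_completion = bool(_SIGNAL_RE.search(turn_text))
    purpose_recoverable = bool(recovered_purpose and recovered_purpose.strip())
    return declares_completion and purpose_recoverable
```

For example, `should_auto_invoke("All tests pass, ready to merge.", None)` returns `False`: with no recoverable purpose, the right move is to ask for it first instead of guessing.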
Out of scope, route elsewhere: code-review, simplify, verify / run.
Read in this priority order, stop at first hit:
Capture the longest contiguous purpose statement you can find. Quote it verbatim in Step 0. Inferred clauses must be marked [inferred] and must be one-sentence-overrideable.
Before anything else, write this block. It is the spine of the whole audit.
- Trigger: (auto / explicit) — if auto, quote the completion signal
- Artifact type: (repo / code snippet / report / config / agent output / claim only)
- Access level: (full source / partial / claim-only)
- User's purpose, verbatim: (exact words, from the priority order above)
- Purpose source: (which of the 5 recovery channels)
- Delivery's stated purpose: (from README / summary / naming / output)
- Wording delta: (any difference between the two, however small)
- Audit budget: (shallow / standard / deep)
The wording delta is the single most important field. Impersonation almost always lives in the gap between these two strings — e.g. "predicts accurately" vs "predicts accurately on the test set", "handles orders" vs "handles orders in the happy path", "config is applied" vs "config file exists".
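A word-level diff is one simple way to surface the wording delta. The sketch below uses Python's standard `difflib`; the helper name `wording_delta` and the whitespace tokenization are assumptions made for illustration, and the skill itself does not prescribe a tool.

```python
import difflib


def wording_delta(user_purpose, delivery_purpose):
    """List the words added and removed between the two purpose strings."""
    a = user_purpose.split()
    b = delivery_purpose.split()
    added, removed = [], []
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(a[i1:i2])  # words the delivery quietly dropped
        if tag in ("replace", "insert"):
            added.extend(b[j1:j2])    # qualifiers the delivery quietly added
    return {"added": added, "removed": removed}


print(wording_delta("predicts accurately", "predicts accurately on the test set"))
# {'added': ['on', 'the', 'test', 'set'], 'removed': []}
```

The added words are usually where the scope shrank: here "on the test set" narrows a general claim to the one setting where leakage would make it trivially true.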
Reconstruct the user's real purpose. Then self-check:
If purpose restoration is wrong, every later step is wrong. This self-check is non-negotiable.
Decompose "completed" into completion claims.
Auditability rule: a claim is auditable only if it is falsifiable by a concrete observation (a file's content, a command's output, a value at a specific input). "Code quality is good" is not a claim. "process_order() raises ValueError on negative amount" is.
Discard non-auditable claims. Do not pad the table.
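The auditable claim quoted above can be turned directly into an observation that could falsify it. In this sketch, `process_order` is a hypothetical stand-in for the function under audit, and `claim_holds` is an illustrative helper name.

```python
def process_order(amount):
    """Hypothetical stand-in for the function under audit."""
    if amount < 0:
        raise ValueError("amount must be non-negative")
    return {"status": "accepted", "amount": amount}


def claim_holds():
    """Check the claim 'process_order() raises ValueError on negative amount'."""
    try:
        process_order(-1)
    except ValueError:
        return True   # observation matches the claim
    return False      # no exception: the claim is falsified
```

"Code quality is good" has no equivalent of `claim_holds`: there is no single input and expected observation that could prove it wrong, which is why it is discarded.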
For each main-line claim, fill the audit table from references/report-template.md. Use evidence levels A/B/C/D/E from references/evidence-model.md.
Side-line observations do not go through this table. They have their own section in Step 9.
List the assumptions the artifact silently depends on. Mark each as Supported / Risky / Unknown. Impersonation often hides as an unstated assumption (e.g. "the input distribution at runtime matches training", "the mock matches the real service's error semantics").
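One possible shape for the assumption list is a small ledger that surfaces everything not marked Supported. The class and function names below are hypothetical, and the status assigned to each example assumption is made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    SUPPORTED = "Supported"
    RISKY = "Risky"
    UNKNOWN = "Unknown"


@dataclass(frozen=True)
class Assumption:
    text: str
    status: Status


def needs_followup(assumptions):
    """Return the assumptions an impostor could be hiding behind."""
    return [a for a in assumptions if a.status is not Status.SUPPORTED]


ledger = [
    Assumption("input distribution at runtime matches training", Status.UNKNOWN),
    Assumption("the mock matches the real service's error semantics", Status.RISKY),
    Assumption("the config file is parsed at startup", Status.SUPPORTED),
]
```

Calling `needs_followup(ledger)` returns the first two entries, which are the ones worth carrying into the later steps.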
Use references/audit-lenses.md.
Findings from main-line lenses go into the main audit. Findings from side-line lenses go into the side section.
This is the core step. Walk through references/causal-impersonation-patterns.md and, for each pattern relevant to the artifact type, ask:
For every main-line claim flagged as purpose-critical in Step 3, this trio must be filled in — even if the conclusion is "no impostor here". Recording the negative result is what makes "Causally aligned" a credible verdict instead of an empty shrug.
Use references/tribunal.md. Three roles only:
Each role outputs its strongest 1–3 points only. No role-play prose.
Do both:
Both branches must be written. Skipping one biases the verdict.
Use references/report-template.md. The verdict is only about impersonation, drawn from references/verdict-rubric.md. Side findings live in their own section and never change the verdict.
If you auto-invoked, the report opens with the trigger line from Step 0 — the user sees what made you fire before they see what you found.
count(side_findings) ≤ 2 × count(main_line_positions_checked)
If exceeded, drop the side findings with the weakest evidence until the ratio holds. Report the cut ("N side findings dropped to preserve main-line focus").
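The ratio rule can be sketched as a small trimming function. It assumes each finding carries an evidence level from A to E and that A is the strongest; that ordering is an assumption here, since the actual scale is defined in references/evidence-model.md. The name `trim_side_findings` and the dict layout are hypothetical.

```python
def trim_side_findings(side_findings, main_positions_checked):
    """Keep at most 2 x main_positions_checked findings, weakest evidence dropped first."""
    limit = 2 * main_positions_checked
    # Assumption: 'A' is the strongest evidence level, 'E' the weakest.
    # sorted() is stable, so ties keep their original order.
    ranked = sorted(side_findings, key=lambda f: f["evidence"])
    kept = ranked[:limit]
    dropped = len(side_findings) - len(kept)
    note = (
        f"{dropped} side findings dropped to preserve main-line focus"
        if dropped
        else ""
    )
    return kept, note
```

With five side findings at levels A, C, E, B, D and two main-line positions checked, the limit is four, so the level-E finding is dropped and the note reports one cut.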
Default to English. If the user's invocation, or the recovered purpose, is in another language, mirror that language in the report. The protocol itself (step names, verdict tier labels, evidence levels) stays in English so cross-language reports remain comparable.