Install
openclaw skills install wreckit-ralphBulletproof AI code verification. The agent IS the engine — no external tools required. Spawns parallel verification workers that slop-scan, type-check, muta...
openclaw skills install wreckit-ralphBuild it. Break it. Prove it works.
AI can't verify itself. Structure the pipeline so it can't silently agree with itself. Separate Builder/Tester/Breaker roles across fresh contexts. Use independent oracles.
Full 14-step framework:
references/verification-framework.md
Auto-detected from context:
| Mode | Trigger | Description |
|---|---|---|
| 🟢 BUILD | Empty repo + PRD | Full pipeline for greenfield |
| 🟡 REBUILD | Existing code + migration spec | BUILD + behavior capture + replay |
| 🔴 FIX | Existing code + bug report | Fix, verify, check regressions |
| 🔵 AUDIT | Existing code, no changes | Verify and report only |
Read the gate file before executing it. Each contains: question, checks, pass/fail criteria.
| Gate | BUILD | REBUILD | FIX | AUDIT | File |
|---|---|---|---|---|---|
| AI Slop Scan | ✅ | ✅ | ✅ | ✅ | references/gates/slop-scan.md |
| Type Check | ✅ | ✅ | ✅ | ✅ | references/gates/type-check.md |
| Ralph Loop | ✅ | ✅ | ✅ | ❌ | references/gates/ralph-loop.md |
| Test Quality | ✅ | ✅ | ✅ | ✅ | references/gates/test-quality.md |
| Mutation Kill | ✅ | ✅ | ✅ | ✅ | references/gates/mutation-kill.md |
| Cross-Verify | ✅ | ❌ | ❌ | ❌ | references/gates/cross-verify.md |
| Behavior Capture | ❌ | ✅ | ❌ | ❌ | references/gates/behavior-capture.md |
| Regression | ❌ | ✅ | ✅ | ❌ | references/gates/regression.md |
| SAST | ❌ | ❌ | ✅ | ✅ | references/gates/sast.md |
| LLM-as-Judge | opt | opt | opt | opt | references/gates/llm-judge.md |
| Design Review | ❌ | ❌ | ❌ | ✅ | references/gates/design-review.md |
| CI Integration | ✅ | ✅ | ❌ | ✅ | references/gates/ci-integration.md |
| Proof Bundle | ✅ | ✅ | ✅ | ✅ | references/gates/proof-bundle.md |
Deterministic helpers — run these, don't rewrite them:
Core (all modes):
scripts/project-type.sh [path] — classify project context + calibration profile (skip_gates, thresholds, tolerated warns)scripts/detect-stack.sh [path] — auto-detect language, framework, test runner → JSONscripts/check-deps.sh [path] — verify all deps exist in registries (hallucination check)scripts/slop-scan.sh [path] — semantic slop scan (tracked vs untracked debt, categorized output) → JSONscripts/type-check.sh [path] — run type checker (tsc/mypy/cargo/go vet) → JSONscripts/ralph-loop.sh [path] — validate IMPLEMENTATION_PLAN.md structure → JSONscripts/coverage-stats.sh [path] — extract raw coverage numbers from test runnerscripts/mutation-test.sh [path] [test-cmd] — mutation testing (mutmut/cargo-mutants/Stryker/AI)scripts/mutation-test-stryker.sh [path] — Stryker-specific mutation testing → JSONscripts/red-team.sh [path] — SAST + 20+ vulnerability patterns → JSONscripts/regex-complexity.sh [path] [--context library|app] — targeted ReDoS analysis → JSONscripts/proof-bundle.sh [path] [mode] — corroboration-based aggregation + proof bundle writerscripts/run-all-gates.sh [path] [mode] [--log-file] — sequential gate runner with telemetry + adaptive skipping/toleranceMode-specific:
scripts/behavior-capture.sh [path] — capture golden fixtures before rebuild (REBUILD)scripts/design-review.sh [path] — dep graph, coupling, circular deps (AUDIT/REBUILD) → JSONscripts/ci-integration.sh [path] — CI config detection and scoring → JSONscripts/differential-test.sh [path] — oracle comparison, golden tests (BUILD/REBUILD) → JSONExtended verification:
scripts/dynamic-analysis.sh [path] — memory leaks, race conditions, FD leaks → JSONscripts/perf-benchmark.sh [path] — benchmark detection + regression vs baseline → JSONscripts/property-test.sh [path] — property-based/fuzz testing, generates stubs → JSONBootstrap:
scripts/run-audit.sh [path] [mode] [--spawn] — generate orchestrator task + optional spawnFor multi-gate parallel execution, read references/swarm/orchestrator.md.
Quick overview:
Main agent → wreckit orchestrator (depth 1)
├─ Planning: Architect worker
├─ Building: Sequential Implementer workers
├─ Verification: Parallel gate workers
├─ Sequential: Cross-verify / regression / judge
└─ Decision: Proof bundle → Ship / Caution / Blocked
Critical: Read references/swarm/collect.md before spawning workers.
Never fabricate results. Wait for all workers to report back.
Worker output format: references/swarm/handoff.md.
Config required:
{ "agents.defaults.subagents": { "maxSpawnDepth": 2, "maxChildrenPerAgent": 8 } }
| Verdict | Criteria |
|---|---|
| Ship ✅ | No hard blocks; no corroborated multi-domain fail evidence above block threshold |
| Caution ⚠️ | Single non-hard fail, warning-only risk, or corroboration below block threshold |
| Blocked 🚫 | Any hard block OR corroborated non-hard failure pattern (multi-signal, multi-domain, high-confidence) |
Hard-block + corroboration rule details: references/gates/corroboration.md
| Language | Gates Available | Notes |
|---|---|---|
| TypeScript/JS | 11/11 | Full support via Stryker, tsc, vitest/jest |
| Python | 11/11 | Full support via mutmut, mypy/pyright, pytest |
| Rust | 11/11 | Full support via cargo-mutants, cargo check/test |
| Go | 11/11 | Full support via go vet, go test |
| Swift (SPM) | 9/11 | mutation = AI-estimated CAUTION, cross-verify = manual |
| Swift (Xcode) | 7/11 | type-check = xcodebuild, mutation = AI-estimated, coverage = limited |
| iOS apps | 7/11 | Same as Xcode projects |
| Java/Kotlin | 10/11 | Gradle/Maven, mutation via PIT (manual setup) |
| Shell | 8/11 | shellcheck, limited mutation testing |
CAUTION, never SHIP.swift build (the compiler IS the type checker).xcodebuild with auto-detected schemes.pod outdated is checked if Podfile present.For small projects or when swarm isn't needed, run gates sequentially:
scripts/detect-stack.sh → know your target (language, test cmd, type checker)scripts/check-deps.sh → verify deps are real (not hallucinated)scripts/slop-scan.sh → find placeholders, template artifacts, empty stubsreferences/gates/type-check.mdscripts/coverage-stats.sh → references/gates/test-quality.mdscripts/mutation-test.sh → references/gates/mutation-kill.md (uses mutmut/cargo-mutants/Stryker if available)scripts/red-team.sh → references/gates/sast.md (20+ vulnerability patterns, JSON report)scripts/design-review.sh → references/gates/design-review.md (dep graph, circular deps, god modules)scripts/ci-integration.sh → references/gates/ci-integration.md (CI config detection + scoring)scripts/dynamic-analysis.sh → references/gates/dynamic-analysis.md (memory leaks, race conditions)scripts/perf-benchmark.sh → references/gates/performance.md (benchmark detection + regression)scripts/property-test.sh → references/gates/property-based.md (fuzzing, invariant checks)scripts/differential-test.sh → references/gates/differential.md (oracle comparison, metamorphic tests)scripts/proof-bundle.sh [path] [mode] → writes .wreckit/proof.json, dashboard.json, decision.md"Use wreckit to audit [project]. Don't change anything."
"Use wreckit to build [project] from this PRD."
"Use wreckit to fix [bug]. Prove nothing else breaks."
"Use wreckit to rebuild [project] in [framework]."
assets/dashboard/ contains a local web dashboard for viewing proof bundles across repos.
Run: node assets/dashboard/server.mjs (port 3939). Reads .wreckit/dashboard.json from projects.
When using Codex CLI to build/run projects:
--full-auto sandbox blocks npm install network access (ENOTFOUND registry.npmjs.org)--dangerously-bypass-approvals-and-sandbox flag insteadecho "$OPENAI_API_KEY" | codex login --with-api-key stores credentials to ~/.codex/auth.json~/.codex/config.toml with model = "gpt-5.2-codex" and [shell_environment_policy] inherit = "all"gpt-5.3-codex is Copilot/VS Code only — not available via direct API. Use gpt-5.2-codex.