Install
openclaw skills install trinity-harness
Production-grade Agent Harness combining execution discipline, knowledge compounding, and product thinking into a single adaptive workflow. Use when: (1) building features or fixing bugs with AI agents, (2) user says 'build', 'plan', 'spec', 'review', 'ship', 'debug', (3) managing multi-step or multi-agent tasks, (4) need structured engineering workflow with quality gates. Provides: task complexity auto-grading (simple/medium/complex), anti-rationalization guards, concurrent subagent scheduling (≤4 hard limit), tool-chain continuity enforcement, context budget management, verification protocols, and experience compounding. Triggers: 'agent harness', 'engineering workflow', 'build protocol', 'multi-agent task', 'coding discipline', 'subagent orchestration'.
A unified engineering harness combining execution discipline, knowledge compounding, and product thinking. Born from ~450k characters of real-world AI textbook writing + 15+ production incidents.
The GAIA benchmark shows that scaffold design alone can be worth 30+ percentage points: with the same model, the HAL scaffold scores 74.6% versus ~44% for the bare model. The harness is the multiplier.
Agent = Model + Harness. The model provides capability; the harness provides discipline.
Three layers, one workflow:
Before starting any task, assess complexity. This determines which workflow steps to run.
🟢 Simple (bug fix, config change, small tweak)
🟡 Medium (new feature, module, integration)
🔴 Complex (architecture change, multi-module, new system)
When unsure, start at 🟡. Upgrade to 🔴 if you discover hidden complexity. Never downgrade mid-task.
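The grading rule above can be sketched in a few lines. This is a minimal illustration, not part of the skill itself; the `Complexity` enum, `DEFAULT_GRADE`, and `regrade` are hypothetical names introduced here.

```python
from enum import IntEnum


class Complexity(IntEnum):
    """Task grades from the list above; a higher value means more process."""
    SIMPLE = 1   # bug fix, config change, small tweak
    MEDIUM = 2   # new feature, module, integration
    COMPLEX = 3  # architecture change, multi-module, new system


# When the grade is unclear, start at medium.
DEFAULT_GRADE = Complexity.MEDIUM


def regrade(current: Complexity, observed: Complexity) -> Complexity:
    """Return the grade to use after new information arrives mid-task.

    Upgrades are accepted; a lower observed grade is ignored, so the
    task is never downgraded once it has started.
    """
    return max(current, observed)
```

Because the grades are ordered integers, "never downgrade" reduces to taking the maximum of the current and newly observed grade.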
Before writing any code, answer these questions:
Output: A one-paragraph problem statement the user confirms before proceeding.
Break the spec into atomic tasks:
Execute tasks incrementally. After each task:
Critical rules:
Every deliverable must have evidence, not just "looks good":
| Deliverable type | Required evidence |
|---|---|
| Code change | Tests pass (show output) |
| Config change | Restart + verify (show status) |
| File generation | wc -l + grep key content |
| API integration | Show actual response |
| Documentation | Spot-check 3 claims for accuracy |
🔴 Reading is not verification. Run it.
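As one way to turn the "File generation" row into something runnable, the sketch below collects a line count and key-content checks, roughly what `wc -l` plus `grep` would show. The function names `file_evidence` and `is_verified` and the `min_lines` threshold are assumptions made for illustration, not part of the harness.

```python
from pathlib import Path


def file_evidence(path: str, must_contain: list[str]) -> dict:
    """Collect evidence for a generated file: line count plus key-content checks."""
    text = Path(path).read_text(encoding="utf-8")
    return {
        "path": path,
        "lines": len(text.splitlines()),
        # One boolean per required string, so a missing one is visible by name.
        "found": {needle: needle in text for needle in must_contain},
    }


def is_verified(evidence: dict, min_lines: int = 1) -> bool:
    """Pass only if the file is long enough and every key string is present."""
    return evidence["lines"] >= min_lines and all(evidence["found"].values())
```

Returning the raw evidence separately from the pass/fail decision keeps the output available to show the user, in line with "show output" in the table.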
Self-review across 5 dimensions:
Pre-ship checklist:
After completing any task, spend 30 seconds on:
Only record specific, actionable lessons. Not generic advice.
Good: "Bedrock throttles at >4 concurrent requests. Use model rotation or serial execution."
Bad: "Remember to handle API limits properly."
| Your excuse | Why it's wrong | Do this instead |
|---|---|---|
| "Too simple to need tests" | 40% of P0 incidents come from "too simple" code | Write the test. It takes 2 minutes. |
| "I already checked, looks fine" | Reading ≠ verifying | Run it. ls, wc -l, grep, actual execution. |
| "I'll write tests after the feature" | You won't. Test debt only grows. | Write the test NOW. |
| "This old code looks unused, I'll delete it" | Chesterton's Fence: understand before removing | git blame first. Ask why it exists. |
| "It should work" | "Should" is not evidence | Provide logs, output, or data. |
| "Let me refactor while I'm here" | Scope creep. | File a separate TODO for the refactor. |
| "I'll handle errors later" | Error handling IS the feature in production | Handle errors now. |
| "The context is too long, I'll skip details" | Skipping details = skipping correctness | Checkpoint to file, compact context, continue with full fidelity. |
| "I already ran it once, it should still work" | Stateful systems change. | Run it again. Every time. |
Hard limits:
subagents list before spawning)
subagents(action=list)
Task delegation rules:
sessions_yield after spawning, not a poll loop
After yield returns, run these mandatory checks:
subagents(action=list): confirm all spawned subagents ended
ls output files: verify files exist with expected mtimes
Why: OpenClaw subagent completion announce has a known race condition. Never rely on announce as the sole signal. Active verification is the backup system.
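The file check above (files exist with expected mtimes) could look something like the following. It is a sketch under stated assumptions: `check_outputs` and its status strings are hypothetical, it runs against the local filesystem only, and it does not model the `subagents(action=list)` call.

```python
import os


def check_outputs(paths: list[str], spawned_at: float) -> dict[str, str]:
    """Classify each expected output file after subagents report completion.

    spawned_at is a Unix timestamp taken just before spawning.
    Returns one status per path: "ok", "missing", or "stale".
    """
    report = {}
    for path in paths:
        if not os.path.exists(path):
            report[path] = "missing"
        elif os.stat(path).st_mtime < spawned_at:
            # The file predates this run, so no subagent (re)wrote it.
            report[path] = "stale"
        else:
            report[path] = "ok"
    return report
```

Anything other than "ok" for every path means the completion signal should not be trusted yet, whatever the announce said.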
Failure classification (before retrying):
Every tool call return must be followed by one of:
sessions_yield
Never respond with "I'll continue..." and then make no tool call.
Pre-tool-return self-check:
| Water level | Mode | Action |
|---|---|---|
| < 70% | 🟢 Normal | Full mode, observation masking always on |
| 70–85% | 🟡 Auto-Concise | No new large files, tool output truncated, subagent instructions <1500 chars |
| 85–95% | 🟠 Preservation | No files >100 lines, force checkpoint to memory, delegate reads to subagent |
| > 95% | 🔴 Emergency | Flush state, alert user to /reset, stop accepting new tasks |
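The thresholds in the table can be expressed as a small lookup. This is an illustrative sketch: `context_mode` is a hypothetical helper, and where the table's ranges share a boundary (85%), it assumes the stricter mode applies.

```python
def context_mode(used_fraction: float) -> str:
    """Map context usage (0.0 to 1.0) to the mode names in the table above."""
    if not 0.0 <= used_fraction <= 1.0:
        raise ValueError("used_fraction must be between 0.0 and 1.0")
    if used_fraction < 0.70:
        return "normal"
    if used_fraction < 0.85:
        return "auto-concise"
    if used_fraction <= 0.95:
        # Assumption: exactly 85% counts as preservation, the stricter mode.
        return "preservation"
    return "emergency"
```

For example, `context_mode(0.72)` returns `"auto-concise"`, at which point no new large files should be loaded.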
Observation Masking (apply immediately after consuming any tool output):
🔫 Never restart your own process from inside an agent turn.
systemctl restart <service>, pkill <process>, gateway restart in cron prompts
gateway tool's restart action)
🔫🔫 Never put restart commands in cron job prompts.
For important deliverables, use an independent verifier:
Protect progress against crashes:
ls shows what's done, not model memory
See references/checkpoint-patterns.md for detailed patterns.
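One possible shape for file-based checkpoints is sketched below: one JSON file per finished task, so a directory listing is the source of truth after a crash. This is not taken from references/checkpoint-patterns.md; the function names and the one-file-per-task layout are assumptions made for illustration.

```python
import json
import os
import tempfile


def save_checkpoint(directory: str, task_id: str, result: dict) -> str:
    """Atomically write one JSON file per finished task and return its path."""
    os.makedirs(directory, exist_ok=True)
    final_path = os.path.join(directory, f"{task_id}.json")
    # Write to a temp file in the same directory, then rename it into place,
    # so a crash mid-write never leaves a half-written checkpoint.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(result, handle)
    os.replace(tmp_path, final_path)
    return final_path


def completed_tasks(directory: str) -> set[str]:
    """Read finished task ids from the filesystem, not from memory."""
    if not os.path.isdir(directory):
        return set()
    return {name[:-5] for name in os.listdir(directory) if name.endswith(".json")}


def pending_tasks(directory: str, all_tasks: list[str]) -> list[str]:
    """Tasks still to run after a restart, in their original order."""
    done = completed_tasks(directory)
    return [task for task in all_tasks if task not in done]
```

After a crash, calling `pending_tasks` with the original task list resumes work from whatever the directory shows as done.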
\n literal in exec/write content: On some platforms, multiline scripts passed as strings have \n treated as literal characters, not newlines. Always use real line breaks. Verify with read after writing.
grep and wc -l are faster than read for verification. Use them.
🟢 Simple: Edit → Verify → Done
🟡 Medium: Plan → Build → Test → Review → Done
🔴 Complex: Challenge → Spec → Plan → Build → Test → Review → Ship → Compound
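The three workflows above can be encoded as data, which makes "what comes next" a lookup rather than a judgment call. A minimal sketch follows; the `WORKFLOWS` table and `next_step` helper are hypothetical names, with step names copied from the lists above.

```python
from typing import Optional

WORKFLOWS = {
    "simple": ["Edit", "Verify", "Done"],
    "medium": ["Plan", "Build", "Test", "Review", "Done"],
    "complex": ["Challenge", "Spec", "Plan", "Build", "Test", "Review", "Ship", "Compound"],
}


def next_step(grade: str, current: Optional[str] = None) -> Optional[str]:
    """Return the step after `current` for a grade, or the first step if None.

    Returns None when `current` is already the last step of the workflow.
    """
    steps = WORKFLOWS[grade]
    if current is None:
        return steps[0]
    index = steps.index(current)  # raises ValueError if the step is not in this workflow
    return steps[index + 1] if index + 1 < len(steps) else None
```

Asking for a step that does not belong to the grade's workflow raises an error instead of silently skipping ahead.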
After every tool call: next action or yield. Never stall.