Install
openclaw skills install verified-capability-evolver
Safely improve agent capabilities with structured verification, rollback, and promotion gating. Enhances existing evolution workflows with optional SettlementWitness verification.
Extend existing capability evolution workflows with structured verification, rollback, and promotion gating.
This skill does not replace the underlying self-improvement system. It preserves the original learning, hook, and extraction workflow while adding a verification layer so permanent behavior changes are only promoted when they are proven.
This skill defines a verification workflow, not automatic data transmission.
An agent repeatedly generates invalid JSON for an API response. A fix is applied, but before promoting this as permanent behavior, the improvement must be verified.
Status: pending
Agent produced invalid JSON format for API responses
Ensure all outputs conform to required schema before returning
{ "expected": { "valid_json": true, "matches_schema": true } }
{ "valid_json": true, "matches_schema": true }
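The comparison in this example can be sketched locally. The snippet below is a minimal illustration, not the SettlementWitness API: `local_verdict` is a hypothetical helper, and treating a missing expected field as INDETERMINATE is an assumption made for the sketch.

```python
def local_verdict(spec: dict, output: dict) -> str:
    """Compare an output against spec["expected"] (hypothetical local check)."""
    expected = spec.get("expected")
    if not expected:
        # Nothing to verify against, so no claim can be proven either way.
        return "INDETERMINATE"
    if any(key not in output for key in expected):
        # Assumption: a missing field is inconclusive rather than a failure.
        return "INDETERMINATE"
    matches = all(output[key] == value for key, value in expected.items())
    return "PASS" if matches else "FAIL"


spec = {"expected": {"valid_json": True, "matches_schema": True}}
output = {"valid_json": True, "matches_schema": True}
print(local_verdict(spec, output))
```

Only a PASS result would allow the fix to move toward promotion; the other two verdicts keep it out of permanent memory.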
An agent should not just improve — it should prove that it improved.
The highest-stakes moment in self-improvement is promotion to permanent memory. A temporary fix becomes permanent behavior only after it passes verification.
When using SettlementWitness verification, provide a stable agent_id:
{wallet_address}:capability-evolver
Use the format {wallet_address}:capability-evolver so TrustScore history can compound correctly across sessions.
| Situation | Action |
|---|---|
| Command/operation fails | Log to .learnings/ERRORS.md |
| User corrects you | Log to .learnings/LEARNINGS.md with category correction |
| User wants missing feature | Log to .learnings/FEATURE_REQUESTS.md |
| API/external tool fails | Log to .learnings/ERRORS.md with integration details |
| Knowledge was outdated | Log to .learnings/LEARNINGS.md with category knowledge_gap |
| Found better approach | Log to .learnings/LEARNINGS.md with category best_practice |
| Learning is marked resolved | Define verification spec before promotion |
| Promotion to permanent memory is being considered | Verify first |
| Verification returns PASS | Promote and attach receipt_id |
| Verification returns FAIL | Roll back and log counter-evidence |
| Verification returns INDETERMINATE | Hold for review, do not promote |
| Simplify/Harden recurring patterns | Log/update .learnings/LEARNINGS.md with Source: simplify-and-harden and a stable Pattern-Key |
| Similar to existing entry | Link with **See Also**, consider priority bump |
| Workflow improvements | Promote to AGENTS.md (OpenClaw workspace) after verification PASS |
| Tool gotchas | Promote to TOOLS.md (OpenClaw workspace) after verification PASS |
| Behavioral patterns | Promote to SOUL.md (OpenClaw workspace) after verification PASS |
OpenClaw is the primary platform for this skill. It uses workspace-based prompt injection with automatic skill loading.
Via ClawdHub (recommended):
clawdhub install verified-capability-evolver
Manual:
git clone https://github.com/your-org/verified-capability-evolver.git ~/.openclaw/skills/verified-capability-evolver
OpenClaw injects these files into every session:
~/.openclaw/workspace/
├── AGENTS.md # Multi-agent workflows, delegation patterns
├── SOUL.md # Behavioral guidelines, personality, principles
├── TOOLS.md # Tool capabilities, integration gotchas
├── MEMORY.md # Long-term memory (main session only)
├── memory/ # Daily memory files
│ └── YYYY-MM-DD.md
└── .learnings/ # This skill's log files
    ├── LEARNINGS.md
    ├── ERRORS.md
    └── FEATURE_REQUESTS.md
mkdir -p ~/.openclaw/workspace/.learnings
Then create the log files (or copy from assets/):
- LEARNINGS.md - corrections, knowledge gaps, best practices
- ERRORS.md - command failures, exceptions
- FEATURE_REQUESTS.md - user-requested capabilities

When learnings prove broadly applicable, promote them to workspace files:
| Learning Type | Promote To | Example |
|---|---|---|
| Behavioral patterns | SOUL.md | "Be concise, avoid disclaimers" |
| Workflow improvements | AGENTS.md | "Spawn sub-agents for long tasks" |
| Tool gotchas | TOOLS.md | "Git push needs auth configured first" |
OpenClaw provides tools to share learnings across sessions:
For automatic reminders at session start:
# Copy hook to OpenClaw hooks directory
cp -r hooks/openclaw ~/.openclaw/hooks/verified-capability-evolver
# Enable it
openclaw hooks enable verified-capability-evolver
See references/openclaw-integration.md for complete details.
For Claude Code, Codex, Copilot, or other agents, create .learnings/ in your project:
mkdir -p .learnings
Copy templates from assets/ or create files with headers.
When errors or corrections occur:
1. Log to .learnings/ERRORS.md, LEARNINGS.md, or FEATURE_REQUESTS.md
2. Promote broadly applicable patterns (after verification PASS) to:
   - CLAUDE.md - project facts and conventions
   - AGENTS.md - workflows and automation
   - .github/copilot-instructions.md - Copilot context

Append to .learnings/LEARNINGS.md:
## [LRN-YYYYMMDD-XXX] category
**Logged**: ISO-8601 timestamp
**Priority**: low | medium | high | critical
**Status**: pending
**Area**: frontend | backend | infra | tests | docs | config
### Summary
One-line description of what was learned
### Details
Full context: what happened, what was wrong, what's correct
### Suggested Action
Specific fix or improvement to make
### Metadata
- Source: conversation | error | user_feedback | simplify-and-harden
- Related Files: path/to/file.ext
- Tags: tag1, tag2
- See Also: LRN-20250110-001 (if related to existing entry)
- Pattern-Key: simplify.dead_code | harden.input_validation (optional, for recurring-pattern tracking)
- Recurrence-Count: 1 (optional)
- First-Seen: 2025-01-15 (optional)
- Last-Seen: 2025-01-15 (optional)
---
Append to .learnings/ERRORS.md:
## [ERR-YYYYMMDD-XXX] skill_or_command_name
**Logged**: ISO-8601 timestamp
**Priority**: high
**Status**: pending
**Area**: frontend | backend | infra | tests | docs | config
### Summary
Brief description of what failed
### Error
Actual error message or output
### Context
- Command/operation attempted
- Input or parameters used
- Environment details if relevant
### Suggested Fix
If identifiable, what might resolve this
### Metadata
- Reproducible: yes | no | unknown
- Related Files: path/to/file.ext
- See Also: ERR-20250110-001 (if recurring)
---
Append to .learnings/FEATURE_REQUESTS.md:
## [FEAT-YYYYMMDD-XXX] capability_name
**Logged**: ISO-8601 timestamp
**Priority**: medium
**Status**: pending
**Area**: frontend | backend | infra | tests | docs | config
### Requested Capability
What the user wanted to do
### User Context
Why they needed it, what problem they're solving
### Complexity Estimate
simple | medium | complex
### Suggested Implementation
How this could be built, what it might extend
### Metadata
- Frequency: first_time | recurring
- Related Features: existing_feature_name
---
Format: TYPE-YYYYMMDD-XXX
- TYPE: LRN (learning), ERR (error), FEAT (feature)
- YYYYMMDD: date of the entry
- XXX: sequence number or short random suffix (e.g. 001, A7B)

Examples: LRN-20250115-001, ERR-20250115-A3F, FEAT-20250115-002
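As an illustration of the ID format, the sketch below generates the next sequential ID for a given type and date. `next_entry_id` and `ID_PATTERN` are hypothetical names introduced here; the skill itself does not prescribe how IDs are allocated, and random suffixes such as A3F are simply skipped when counting.

```python
import re
from datetime import date

# TYPE-YYYYMMDD-XXX, where XXX is three digits or uppercase alphanumerics.
ID_PATTERN = re.compile(r"^(LRN|ERR|FEAT)-(\d{8})-([0-9A-Z]{3})$")


def next_entry_id(kind: str, day: date, existing: list[str]) -> str:
    """Return the next sequential ID for this type and date (hypothetical helper)."""
    if kind not in ("LRN", "ERR", "FEAT"):
        raise ValueError(f"unknown entry type: {kind}")
    prefix = f"{kind}-{day.strftime('%Y%m%d')}-"
    # Only purely numeric suffixes take part in the sequence.
    used = [
        int(entry_id[len(prefix):])
        for entry_id in existing
        if entry_id.startswith(prefix) and entry_id[len(prefix):].isdigit()
    ]
    return f"{prefix}{max(used, default=0) + 1:03d}"


existing_ids = ["LRN-20250115-001", "ERR-20250115-A3F"]
print(next_entry_id("LRN", date(2025, 1, 15), existing_ids))
```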
When an issue appears fixed, do not immediately treat it as permanent learning.
1. Change **Status**: pending → **Status**: in_progress
2. Once the fix is in place, set **Status** → resolved
3. If the fix does not hold, return **Status** to pending

Add after Metadata:
### Resolution
- **Resolved**: 2026-03-25T09:00:00Z
- **Verification-Spec**: Output must match schema exactly and contain no hallucinated fields
- **Settlement Verdict**: PASS | FAIL | INDETERMINATE
- **Receipt ID**: sha256:...
- **Notes**: Brief description of what was done
Other status values:
- in_progress - Actively being worked on
- wont_fix - Decided not to address (add reason in Resolution notes)
- promoted - Elevated to CLAUDE.md, AGENTS.md, SOUL.md, TOOLS.md, or .github/copilot-instructions.md after PASS only

When a learning is broadly applicable (not a one-off fix), promote it to permanent project memory.
| Target | What Belongs There |
|---|---|
| CLAUDE.md | Project facts, conventions, gotchas for all Claude interactions |
| AGENTS.md | Agent-specific workflows, tool usage patterns, automation rules |
| .github/copilot-instructions.md | Project context and conventions for GitHub Copilot |
| SOUL.md | Behavioral guidelines, communication style, principles (OpenClaw workspace) |
| TOOLS.md | Tool capabilities, usage patterns, integration gotchas (OpenClaw workspace) |
Promotion is the highest-stakes moment in the workflow because it turns a temporary fix into permanent agent behavior, so it is strictly gated by verification. A learning is promoted to permanent memory only if verification returns PASS. FAIL triggers rollback and logging of counter-evidence; INDETERMINATE holds the learning for review without promotion. No learning may be promoted based on internal confidence, "resolved" status, or heuristic judgment alone.
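The gating rule can be expressed as a small decision function. This is a sketch that follows the verdict rows of the situation table earlier in this document (PASS promotes, FAIL rolls back, INDETERMINATE holds); `gate_promotion` and the action names are hypothetical, and treating any unrecognized verdict as a hold is an assumption.

```python
from typing import Optional

# Verdict -> action, following the situation table in this document.
ACTIONS = {"PASS": "promote", "FAIL": "rollback", "INDETERMINATE": "hold"}


def gate_promotion(verdict: str, receipt_id: Optional[str] = None) -> dict:
    """Decide what to do with a learning after verification (hypothetical helper)."""
    # Assumption: anything that is not a known verdict is held, never promoted.
    action = ACTIONS.get(verdict, "hold")
    decision = {"action": action}
    if action == "promote" and receipt_id:
        # Attach the receipt so the promoted entry can reference it.
        decision["receipt_id"] = receipt_id
    return decision


print(gate_promotion("PASS", "sha256:..."))
print(gate_promotion("INDETERMINATE"))
```

Because the function never reads the entry's Status or any confidence score, a "resolved" learning cannot reach promotion without a PASS verdict.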
- Set **Status** → promoted
- Add **Promoted**: CLAUDE.md, AGENTS.md, SOUL.md, TOOLS.md, or .github/copilot-instructions.md
- Add **Verified**: true
- Add **Receipt ID**: sha256:...

If external verification is used:
Learning (verbose):
Project uses pnpm workspaces. Attempted npm install but failed. Lock file is pnpm-lock.yaml. Must use pnpm install.
In CLAUDE.md (concise):
## Build & Dependencies
- Package manager: pnpm (not npm) - use `pnpm install`
Learning (verbose):
When modifying API endpoints, must regenerate TypeScript client. Forgetting this causes type mismatches at runtime.
In AGENTS.md (actionable):
## After API Changes
1. Regenerate client: `pnpm run generate:api`
2. Check for type errors: `pnpm tsc --noEmit`
If a previously promoted learning later fails verification:
- Log the counter-evidence to .learnings/LEARNINGS.md or .learnings/ERRORS.md

Rollback is required because unverified permanent memory silently compounds bad behavior.
If logging something similar to an existing entry:
1. Search first: grep -r "keyword" .learnings/
2. Link related entries: add **See Also**: ERR-20250110-001 in Metadata

Use this workflow to ingest recurring patterns from the simplify-and-harden skill and turn them into durable prompt guidance.

1. Read simplify_and_harden.learning_loop.candidates from the task summary.
2. Use pattern_key as the stable dedupe key.
3. Search .learnings/LEARNINGS.md for an existing entry with that key:
   grep -n "Pattern-Key: <pattern_key>" .learnings/LEARNINGS.md
4. If an entry exists: increment Recurrence-Count, update Last-Seen, and add See Also links to related entries/tasks.
5. If no entry exists: create a new LRN-... entry with Source: simplify-and-harden, and set Pattern-Key, Recurrence-Count: 1, and First-Seen/Last-Seen.

Promote recurring patterns into agent context/system prompt files when all are true:
- Recurrence-Count >= 3

Promotion targets:
- CLAUDE.md
- AGENTS.md
- .github/copilot-instructions.md
- SOUL.md / TOOLS.md for OpenClaw workspace-level guidance when applicable

Write promoted rules as short prevention rules (what to do before/while coding), not long incident write-ups.
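The dedupe-and-count step of this workflow might look like the sketch below. It keeps entries in an in-memory dict keyed by Pattern-Key rather than parsing LEARNINGS.md, which is a simplification; `record_pattern` and `meets_recurrence_threshold` are hypothetical helpers, and the threshold only covers the Recurrence-Count condition listed in this section.

```python
from datetime import date


def record_pattern(entries: dict, pattern_key: str, seen: date) -> dict:
    """Create or update the entry for a recurring pattern (hypothetical helper)."""
    day = seen.isoformat()
    entry = entries.get(pattern_key)
    if entry is None:
        # First sighting: new entry with the metadata fields from the template.
        entry = {
            "Source": "simplify-and-harden",
            "Pattern-Key": pattern_key,
            "Recurrence-Count": 1,
            "First-Seen": day,
            "Last-Seen": day,
        }
        entries[pattern_key] = entry
    else:
        # Repeat sighting: bump the counter and refresh Last-Seen only.
        entry["Recurrence-Count"] += 1
        entry["Last-Seen"] = day
    return entry


def meets_recurrence_threshold(entry: dict, threshold: int = 3) -> bool:
    """Check the Recurrence-Count condition; verification is still required."""
    return entry["Recurrence-Count"] >= threshold


entries: dict = {}
for day in (date(2025, 1, 15), date(2025, 1, 20), date(2025, 2, 1)):
    latest = record_pattern(entries, "harden.input_validation", day)
print(latest["Recurrence-Count"], meets_recurrence_threshold(latest))
```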
Use this shape when verifying a proposed improvement:
{
  "task_id": "improvement-fix-json-output-001",
  "agent_id": "0x123:capability-evolver",
  "spec": {
    "expected": {
      "schema_valid": true,
      "hallucinated_fields": false
    }
  },
  "output": {
    "schema_valid": true,
    "hallucinated_fields": false
  }
}
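Interpretation:
- PASS: promote and attach receipt_id
- FAIL: roll back and log counter-evidence
- INDETERMINATE: hold for review, do not promote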
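The request shape above could be assembled programmatically, for example as follows. `build_verification_request` is a hypothetical helper; it only builds the payload using the {wallet_address}:capability-evolver convention for agent_id and makes no assumption about how or where the payload is submitted.

```python
import json


def build_verification_request(
    task_id: str, wallet_address: str, expected: dict, output: dict
) -> dict:
    """Assemble a verification payload in the documented shape (hypothetical helper)."""
    return {
        "task_id": task_id,
        # Stable agent_id so verification history stays tied to one identity.
        "agent_id": f"{wallet_address}:capability-evolver",
        "spec": {"expected": expected},
        "output": output,
    }


request = build_verification_request(
    task_id="improvement-fix-json-output-001",
    wallet_address="0x123",
    expected={"schema_valid": True, "hallucinated_fields": False},
    output={"schema_valid": True, "hallucinated_fields": False},
)
print(json.dumps(request, indent=2))
```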
Review .learnings/ at natural breakpoints:
# Count pending items
grep -h "Status\*\*: pending" .learnings/*.md | wc -l
# List high-priority entry headings (-h drops the filename prefix so the heading match works)
grep -h -B5 "Priority\*\*: high" .learnings/*.md | grep "^## \["
# Find learnings for a specific area
grep -l "Area\*\*: backend" .learnings/*.md
Automatically log when you notice:
Corrections (→ learning with correction category):
Feature Requests (→ feature request):
Knowledge Gaps (→ learning with knowledge_gap category):
Errors (→ error entry):
| Priority | When to Use |
|---|---|
| critical | Blocks core functionality, data loss risk, security issue |
| high | Significant impact, affects common workflows, recurring issue |
| medium | Moderate impact, workaround exists |
| low | Minor inconvenience, edge case, nice-to-have |
Use to filter learnings by codebase region:
| Area | Scope |
|---|---|
| frontend | UI, components, client-side code |
| backend | API, services, server-side code |
| infra | CI/CD, deployment, Docker, cloud |
| tests | Test files, testing utilities, coverage |
| docs | Documentation, comments, READMEs |
| config | Configuration files, environment, settings |
Keep learnings local (per-developer):
.learnings/
Track learnings in repo (team-wide): Don't add to .gitignore - learnings become shared knowledge.
Hybrid (track templates, ignore entries):
.learnings/*.md
!.learnings/.gitkeep
Enable automatic reminders through agent hooks. This is opt-in - you must explicitly configure hooks.
Create .claude/settings.json in your project:
{
  "hooks": {
    "UserPromptSubmit": [{
      "matcher": "",
      "hooks": [{
        "type": "command",
        "command": "./skills/verified-capability-evolver/scripts/activator.sh"
      }]
    }]
  }
}
This injects a learning evaluation reminder after each prompt (~50-100 tokens overhead).
{
  "hooks": {
    "UserPromptSubmit": [{
      "matcher": "",
      "hooks": [{
        "type": "command",
        "command": "./skills/verified-capability-evolver/scripts/activator.sh"
      }]
    }],
    "PostToolUse": [{
      "matcher": "Bash",
      "hooks": [{
        "type": "command",
        "command": "./skills/verified-capability-evolver/scripts/error-detector.sh"
      }]
    }]
  }
}
| Script | Hook Type | Purpose |
|---|---|---|
| scripts/activator.sh | UserPromptSubmit | Reminds to evaluate learnings after tasks and verify before promotion |
| scripts/error-detector.sh | PostToolUse (Bash) | Triggers on command errors |
| scripts/extract-skill.sh | manual helper | Extracts reusable skills from learnings |
See references/hooks-setup.md for detailed configuration and troubleshooting.
When a learning is valuable enough to become a reusable skill, extract it using the provided helper.
A learning qualifies for skill extraction when ANY of these apply:
| Criterion | Description |
|---|---|
| Recurring | Has See Also links to 2+ similar issues |
| Verified | Status is resolved with working fix |
| Non-obvious | Required actual debugging/investigation to discover |
| Broadly applicable | Not project-specific; useful across codebases |
| User-flagged | User says "save this as a skill" or similar |
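The criteria table can be read as an any-of check. The sketch below assumes a learning is represented as a dict with hypothetical field names (`see_also`, `status`, `non_obvious`, `broadly_applicable`, `user_flagged`); these names are illustrative and are not defined by the skill.

```python
def extraction_reasons(learning: dict) -> list[str]:
    """Return which extraction criteria a learning meets (hypothetical helper)."""
    reasons = []
    if len(learning.get("see_also", [])) >= 2:
        reasons.append("recurring")  # See Also links to 2+ similar issues
    if learning.get("status") == "resolved":
        reasons.append("verified")  # resolved with a working fix
    for flag in ("non_obvious", "broadly_applicable", "user_flagged"):
        if learning.get(flag):
            reasons.append(flag)
    return reasons


learning = {
    "see_also": ["ERR-20250110-001", "ERR-20250112-002"],
    "status": "pending",
    "user_flagged": True,
}
reasons = extraction_reasons(learning)
# ANY matching criterion qualifies the learning for extraction.
print(bool(reasons), reasons)
```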
./skills/verified-capability-evolver/scripts/extract-skill.sh skill-name --dry-run
./skills/verified-capability-evolver/scripts/extract-skill.sh skill-name
After extraction, update the learning: set Status to promoted_to_skill and add Skill-Path.

If you prefer manual creation:
- Create skills/<skill-name>/SKILL.md
- Use the template at assets/SKILL-TEMPLATE.md
- Fill in name and description

This skill works across different AI coding agents with agent-specific activation.
Activation: Hooks (UserPromptSubmit, PostToolUse)
Setup: .claude/settings.json with hook configuration
Detection: Automatic via hook scripts
Activation: Hooks (same pattern as Claude Code)
Setup: .codex/settings.json with hook configuration
Detection: Automatic via hook scripts
Activation: Manual (no hook support)
Setup: Add to .github/copilot-instructions.md:
## Verified Capability Evolver
After solving non-obvious issues, consider logging to `.learnings/`:
1. Use the format from this skill
2. Link related entries with See Also
3. Define verification specs before promotion
4. Promote only after verification PASS
Ask in chat: "Should I log this as a learning?"
Detection: Manual review at session end
Activation: Workspace injection + inter-agent messaging
Setup: See "OpenClaw Setup" section above
Detection: Via session tools and workspace files
Regardless of agent, apply verified evolution when you:
For Copilot users, add this to your prompts when relevant:
After completing this task, evaluate if any learnings should be logged to .learnings/ and whether any claimed improvement needs verification before promotion.
Or use quick prompts: