self evolving agent
Build a goal-driven self-learning loop for OpenClaw and coding agents. Use when the agent should not only log mistakes, but diagnose capability gaps, maintai...
Self-Evolving Agent
self-evolving-agent upgrades passive self-improvement into an explicit capability evolution system.
Use this skill when any of the following is true:
- A task is difficult, novel, high-stakes, or long-horizon.
- An error, correction, or near-miss reveals a deeper capability weakness.
- The same failure pattern appears more than once.
- A useful tactic might deserve promotion into long-term context, but has not been validated yet.
- You want to understand not just what went wrong, but what the agent can do now, what it still cannot do, and what it should train next.
Default to the light loop first. Escalate into the full capability-evolution loop only when the task or evidence justifies the extra cost.
Core Principle
Do not treat logging as learning.
This skill separates six states of progress:
recorded -> understood -> practiced -> passed -> generalized -> promoted
A lesson only becomes long-term policy after it survives training and transfer.
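As a minimal sketch, the six states can be modeled as an ordered enum so that "further along" is a simple comparison. The names `LearningState`, `advance`, and `can_promote` are hypothetical, and reading "survives training and transfer" as "has reached generalized" is an assumption, not something the skill files define.

```python
from enum import IntEnum


class LearningState(IntEnum):
    """Six-state ladder; a higher value means further along."""
    RECORDED = 1
    UNDERSTOOD = 2
    PRACTICED = 3
    PASSED = 4
    GENERALIZED = 5
    PROMOTED = 6


def advance(state: LearningState) -> LearningState:
    """Move one rung up the ladder; PROMOTED is terminal."""
    if state is LearningState.PROMOTED:
        return state
    return LearningState(state + 1)


def can_promote(state: LearningState) -> bool:
    # Assumption: PASSED stands for surviving training and
    # GENERALIZED for surviving transfer, so both are required.
    return state >= LearningState.GENERALIZED
```

Under this reading a lesson that is only `RECORDED` or `PASSED` is never eligible for long-term policy, which mirrors the rule that logging is not learning.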
What This Skill Preserves From Classic Self-Improvement
Keep the original strengths as the memory layer:
- Log errors, corrections, learnings, and feature requests.
- Detect recurring patterns.
- Review prior learnings before major work.
- Promote only high-value guidance into long-term context.
- Use workspace files and hooks to keep memory persistent across sessions.
What This Skill Adds
This skill adds an active learning layer:
- Capability map with levels, failure modes, and upgrade criteria
- Proactive learning agenda that selects the next 1-3 capabilities to train
- Task-level diagnosis of root causes
- Training unit generation for recurring weaknesses
- Evaluation gates that separate recording from mastery
- Transfer checks on new tasks before promotion
- Reflection routines that force self-explanation and counterexamples
Closed Loop
Run the following loop, in order:
1. Classify the task.
2. Retrieve relevant learnings and related capabilities.
3. Run a pre-task risk diagnosis.
4. Choose an execution strategy.
5. Perform the task.
6. Run post-task reflection.
7. Update the capability map.
8. Generate a training unit if weakness or recurrence is detected.
9. Evaluate learning progress.
10. Promote only validated strategies.
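The ordering of the loop can be sketched as a small step runner. This is illustrative only: the step identifiers, the `handlers` mapping, and the `weakness_detected` context key are hypothetical names, and the real orchestration is described in system/coordinator.md.

```python
from typing import Callable, Dict, List

# Hypothetical identifiers for the ten steps, in the order listed above.
STEPS: List[str] = [
    "classify",
    "retrieve",
    "pre_task_diagnosis",
    "choose_strategy",
    "perform",
    "reflect",
    "update_capability_map",
    "generate_training_unit",
    "evaluate",
    "promote",
]


def run_loop(handlers: Dict[str, Callable[[dict], None]], context: dict) -> List[str]:
    """Run each step in order and return the names of the steps executed."""
    executed: List[str] = []
    for step in STEPS:
        # A training unit is generated only when weakness or recurrence was detected.
        if step == "generate_training_unit" and not context.get("weakness_detected"):
            continue
        # Steps without a registered handler are treated as no-ops in this sketch.
        handlers.get(step, lambda ctx: None)(context)
        executed.append(step)
    return executed
```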
Effort Modes
Light loop
Use the lightweight pass when all of the following are true:
- The task is familiar.
- Consequence is low.
- Horizon is short.
- No active agenda focus is central to the task.
- No failure, near-miss, or user rescue exposed a deeper weakness.
- No learning needs training, evaluation, or promotion.
In the light loop:
- Retrieve only the most relevant 1-3 memory items.
- Name the single most likely risk and one verification check.
- Do the work.
- Log only unusually reusable lessons.
- Stop unless an escalation trigger fires.
Full loop
Run the full loop when any of the following is true:
- The task is mixed or unfamiliar.
- Consequence is medium or high and failure would matter.
- Horizon is medium or long with many dependencies.
- An active agenda focus is relevant.
- A failure, near-miss, or user correction suggests a reusable weakness.
- A similar issue repeated, transfer failed, or promotion is under consideration.
- The task itself is deliberate practice, evaluation, or promotion review.
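The light and full conditions are complements of each other, so the choice reduces to one boolean check. The sketch below assumes a hypothetical `TaskProfile` record; the field names and the string values for consequence and horizon are illustrative, not part of the skill's schema.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    familiar: bool = True
    consequence: str = "low"               # "low" | "medium" | "high"
    horizon: str = "short"                 # "short" | "medium" | "long"
    agenda_focus_relevant: bool = False
    weakness_exposed: bool = False         # failure, near-miss, or user rescue
    learning_needs_followup: bool = False  # training, evaluation, or promotion pending
    deliberate_practice: bool = False      # the task itself is practice or review


def choose_mode(task: TaskProfile) -> str:
    """Return "light" only when every light-loop condition holds."""
    light = (
        task.familiar
        and task.consequence == "low"
        and task.horizon == "short"
        and not task.agenda_focus_relevant
        and not task.weakness_exposed
        and not task.learning_needs_followup
        and not task.deliberate_practice
    )
    return "light" if light else "full"
```

Defaulting every field to the light-loop value keeps the cheap path the default, and any single deviation is enough to select the full loop.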
Escalation triggers
Escalate from light to full when any of the following appears during execution:
- non-trivial rework
- verification catches a real defect
- the user had to rescue or redirect the task
- a missed retrieval or repeated pattern appears
- the learning looks broad enough to affect future policy
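A mid-task escalation check can be as simple as a set intersection. The trigger identifiers below are hypothetical labels for the five conditions above.

```python
from typing import Iterable

# Hypothetical labels for the five escalation triggers.
ESCALATION_TRIGGERS = frozenset({
    "non_trivial_rework",
    "verification_caught_defect",
    "user_rescue_or_redirect",
    "missed_retrieval_or_repeated_pattern",
    "policy_relevant_learning",
})


def should_escalate(observed: Iterable[str]) -> bool:
    """Escalate from light to full as soon as any known trigger is observed."""
    return bool(ESCALATION_TRIGGERS & set(observed))
```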
Control Loop
Outside the 10-step task loop, maintain an explicit learning agenda.
Run an agenda review when any of the following is true:
- The workspace is new and no calibrated capability map exists.
- Five meaningful cycles have passed since the last review.
- A `structural_gap` or failed transfer was detected.
- A long-horizon or unfamiliar task is about to begin.
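The review triggers can be expressed as a single predicate. This is a sketch under assumptions: the function and parameter names are hypothetical, and how "meaningful cycles" are counted is left to the caller.

```python
def needs_agenda_review(
    has_calibrated_map: bool,
    cycles_since_review: int,
    gap_or_failed_transfer: bool,
    upcoming_task_long_or_unfamiliar: bool,
    review_interval: int = 5,
) -> bool:
    """Return True when any of the four agenda-review triggers fires."""
    return (
        not has_calibrated_map                     # new workspace, no calibrated map
        or cycles_since_review >= review_interval  # five meaningful cycles have passed
        or gap_or_failed_transfer                  # structural_gap or failed transfer
        or upcoming_task_long_or_unfamiliar        # long-horizon or unfamiliar task ahead
    )
```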
During agenda review:
- choose the top 1-3 capabilities to train next
- defer lower-leverage weaknesses instead of training everything at once
- define what evidence would retire or advance each focus
- link each focus to existing or new training units
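Choosing the focus set amounts to ranking candidate weaknesses and cutting off after three. The sketch below assumes each capability already has a numeric leverage score; how that score is computed is not specified by the skill, and `select_focus` is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def select_focus(
    leverage: Dict[str, float], max_focus: int = 3
) -> Tuple[List[str], List[str]]:
    """Split capabilities into (focus, deferred), ranked by descending leverage."""
    ranked = sorted(leverage, key=lambda name: leverage[name], reverse=True)
    return ranked[:max_focus], ranked[max_focus:]


# Usage with made-up capability names and scores:
focus, deferred = select_focus(
    {"verification": 0.9, "planning": 0.7, "retrieval": 0.6, "formatting": 0.2}
)
# focus holds the three highest-leverage capabilities; "formatting" is deferred.
```

Returning the deferred list explicitly keeps lower-leverage weaknesses visible for the next review instead of silently dropping them.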
File Map
- Main orchestration: `system/coordinator.md`
- Learning agenda and review cycle: `modules/learning-agenda.md`
- Diagnosis: `modules/diagnose.md`
- Capability definitions and update rules: `modules/capability-map.md`
- Training unit design: `modules/curriculum.md`
- Learning evaluation ladder: `modules/evaluator.md`
- Promotion gate: `modules/promotion.md`
- Reflection protocol: `modules/reflection.md`
Assets and ledgers:
- `assets/LEARNINGS.md`
- `assets/ERRORS.md`
- `assets/FEATURE_REQUESTS.md`
- `assets/CAPABILITIES.md`
- `assets/LEARNING_AGENDA.md`
- `assets/TRAINING_UNITS.md`
- `assets/EVALUATIONS.md`
Operating Rules
During migration from self-improving-agent
- Treat `.evolution/legacy-self-improving/` as a read-only memory layer.
- Search the legacy files during retrieval if they exist.
- Do not bulk-convert every old entry into the new schema on day one.
- Normalize a legacy learning into `.evolution` only when it is reused, agenda-worthy, or needed for evaluation.
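A read-only retrieval pass over the legacy layer might look like the sketch below. It assumes the legacy entries are Markdown files, which this document does not state, and `search_legacy` is a hypothetical helper. The function only reads files and returns an empty list when the directory is absent.

```python
from pathlib import Path
from typing import List

LEGACY_DIR = Path(".evolution/legacy-self-improving")


def search_legacy(keyword: str, root: Path = LEGACY_DIR) -> List[str]:
    """Return matching lines from legacy files without modifying anything."""
    if not root.is_dir():
        return []  # The legacy layer is optional.
    needle = keyword.lower()
    hits: List[str] = []
    # Assumption: legacy entries are stored as Markdown files.
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for line in text.splitlines():
            if needle in line.lower():
                hits.append(f"{path.name}: {line.strip()}")
    return hits
```

A hit is only a retrieval result; normalizing it into the new schema stays a separate, deliberate step.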
Before a substantial task
- Read `system/coordinator.md`.
- Check whether `assets/LEARNING_AGENDA.md` requires a review cycle.
- Retrieve relevant entries from `LEARNINGS`, `ERRORS`, `CAPABILITIES`, and `TRAINING_UNITS`.
- Identify the top 1-3 risk capabilities for this task.
After every meaningful task
- Log incident-level observations in the memory files.
- Diagnose the weakest capability involved.
- Update the capability map with evidence, not vibes.
- Refresh the learning agenda if a focus should change.
- If the issue is recurring or high-leverage, create or revise a training unit.
- Record evaluation status using the six-state ladder.
Before promotion
- Read `modules/evaluator.md` and `modules/promotion.md`.
- Confirm the strategy has passed training and succeeded in at least one transfer scenario.
- Promote the smallest stable rule that explains the success.
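The promotion check can be sketched as a gate that returns both a decision and the evidence behind it. `StrategyEvidence` and `promotion_decision` are hypothetical names; the authoritative criteria live in modules/evaluator.md and modules/promotion.md.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class StrategyEvidence:
    passed_training: bool = False
    transfer_successes: int = 0  # transfer scenarios where the strategy worked


def promotion_decision(evidence: StrategyEvidence) -> Tuple[bool, str]:
    """Return (promote?, reason) so the decision carries explicit evidence."""
    if not evidence.passed_training:
        return False, "not promoted: training has not been passed"
    if evidence.transfer_successes < 1:
        return False, "not promoted: no successful transfer scenario yet"
    return True, (
        f"promoted: passed training and {evidence.transfer_successes} "
        "transfer scenario(s)"
    )
```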
When Not To Use Heavyweight Evaluation
Use a lightweight pass when all of the following are true:
- The task was trivial.
- No real uncertainty or failure occurred.
- No new behavior should be generalized.
In that case, log the learning only if it is unusually reusable.
Output Contracts
When this skill is active, prefer producing these artifacts:
- A learning agenda review when triggers fire
- A short pre-task risk diagnosis
- A post-task capability diagnosis
- A `TRAINING_UNIT` when recurrence or weakness appears
- An `EVALUATION` entry when progress is tested
- A promotion decision with explicit evidence
Recommended Workflow
- Read `system/coordinator.md`.
- Load only the modules needed for the current step.
- Use the asset templates as the canonical output format.
- Keep long-term memory strict: only promote validated patterns.
