Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Self-Evolving Agent

Build a goal-driven self-learning loop for OpenClaw and coding agents. Use when the agent should not only log mistakes, but diagnose capability gaps, maintai...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 current installs · 0 all-time installs
by Range King (@rangeking)
Security Scan

VirusTotal: Suspicious (View report →)
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The name, README, SKILL.md and file layout align: the skill expects to read/write an OpenClaw workspace, maintain ledgers, generate training units, and optionally provide hooks. Those capabilities reasonably require the files and ledgers the repo contains.
Instruction Scope
SKILL.md explicitly instructs the agent to read and update workspace files (assets/, modules/, system/), run light or full loops, and optionally enable hooks. That behavior is in scope for a capability-evolution skill. However, it also instructs the agent to run supplied scripts and to copy hook files into ~/.openclaw, which lets the skill persist and act across sessions; review the scripts and the hook handler for unexpected actions before enabling anything.
Install Mechanism
There is no formal install spec (instruction-only), but the repo includes executable scripts and an OpenClaw hook. Installation options point to GitHub or a local copy (both reasonable). The GitHub source is an expected host; no arbitrary shorteners or third‑party binary downloads are referenced. Still, because the package contains scripts that may be executed locally, inspect them prior to running.
Credentials
Registry metadata declares no required env vars, but repository artifacts (agents/openai.yaml, benchmark scripts, run-benchmark.py, run-evals.py, and handler.ts) suggest model-in-the-loop or external API use that typically requires credentials (e.g., OPENAI_API_KEY) or network access. The skill does not document required credentials or network endpoints — this mismatch is a risk and should be validated by reading the scripts and hook code.
Persistence & Privilege
The always flag is false and hooks are optional. The skill asks to bootstrap a persistent workspace (~/.openclaw/workspace/.evolution) and optionally enable a hook, which is appropriate for a memory/evolution skill. There is no claim that it will force-enable itself or modify other skills' configs.
Scan Findings in Context
[pre-scan-injection-none] expected: Static pre-scan reported no injection signals. However, multiple scripts and a hook handler exist in the repo; absence of regex hits is not proof of safety — manual review of scripts/handler.ts is recommended to confirm no hidden endpoints or credential use.
What to consider before installing
This skill appears to be what it says: a workspace-based capability-evolution system. The main risk is that it includes shell/Python scripts and a hook handler that can run on your machine and may call models or networked APIs. Before installing or enabling hooks:

  1. Open and review scripts/* (bootstrap-workspace.sh, run-evals.py, run-benchmark.py, error-detector.sh, activator.sh, migrate-self-improving.py) and hooks/openclaw/handler.ts for network calls, subprocess execs, or credential reads.
  2. Check for expected env vars (OPENAI_API_KEY or similar) or hardcoded endpoints.
  3. Back up your existing ~/.openclaw/workspace/.learnings and other workspace files.
  4. Prefer manual cloning and local inspection over asking the agent to fetch and enable the skill automatically.
  5. Run the scripts in a sandboxed dev workspace first, not on production data.
  6. Enable the hook only after you are satisfied that no unexpected network exfiltration or privilege changes occur.

If you want, provide the contents of the scripts and handler.ts and I can flag any suspicious code patterns specifically.
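A first-pass review like the one above can be partly mechanized. The sketch below greps text for network calls and credential reads; the file names come from this review, but the patterns are illustrative assumptions, not a complete audit, and a clean result never replaces reading the code.

```python
import re

# Illustrative patterns only; a real audit should read the code, not just grep it.
SUSPECT_PATTERNS = [
    r"\bcurl\b",
    r"\bwget\b",
    r"requests\.(get|post)",
    r"urllib",
    r"subprocess",
    r"OPENAI_API_KEY",
    r"https?://",
]

def audit_text(name: str, text: str) -> list[str]:
    """Return (file, pattern) hits that deserve a manual look."""
    hits = []
    for pat in SUSPECT_PATTERNS:
        if re.search(pat, text):
            hits.append(f"{name}: matches {pat}")
    return hits

# Example: a handler that phones home should be flagged.
sample = 'fetch("https://example.com/telemetry", { method: "POST" })'
print(audit_text("hooks/openclaw/handler.ts", sample))
```

Run it over each file in scripts/* and the hook handler before enabling the hook; any hit is a prompt to read the surrounding code, not proof of malice.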

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.1.0
Download zip
latest: vk97fqj5fhpb1f8n02t6dhpgb8s839zy6

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

SKILL.md

Self-Evolving Agent

self-evolving-agent upgrades passive self-improvement into an explicit capability evolution system.

Use this skill when any of the following is true:

  • A task is difficult, novel, high-stakes, or long-horizon.
  • An error, correction, or near-miss reveals a deeper capability weakness.
  • The same failure pattern appears more than once.
  • A useful tactic might deserve promotion into long-term context, but has not been validated yet.
  • You want to understand not just what went wrong, but what the agent can do now, what it still cannot do, and what it should train next.

Default to the light loop first. Escalate into the full capability-evolution loop only when the task or evidence justifies the extra cost.

Core Principle

Do not treat logging as learning.

This skill separates six states of progress:

  1. recorded
  2. understood
  3. practiced
  4. passed
  5. generalized
  6. promoted

A lesson only becomes long-term policy after it survives training and transfer.
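The six states above form an ordered ladder. A minimal sketch, assuming lessons advance one rung at a time and that promoted is terminal (the advance rule is an illustration, not part of the skill's files):

```python
# The six-state ladder, in order; a lesson climbs one rung at a time.
LADDER = ["recorded", "understood", "practiced", "passed", "generalized", "promoted"]

def advance(state: str) -> str:
    """Move a lesson one rung up the ladder; 'promoted' is terminal."""
    i = LADDER.index(state)
    return LADDER[min(i + 1, len(LADDER) - 1)]

print(advance("recorded"))  # -> understood
print(advance("promoted"))  # -> promoted (no further rung)
```

The point of the ladder is that no shortcut exists: a lesson cannot jump from recorded to promoted without passing through training and transfer.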

What This Skill Preserves From Classic Self-Improvement

Keep the original strengths as the memory layer:

  • Log errors, corrections, learnings, and feature requests.
  • Detect recurring patterns.
  • Review prior learnings before major work.
  • Promote only high-value guidance into long-term context.
  • Use workspace files and hooks to keep memory persistent across sessions.

What This Skill Adds

This skill adds an active learning layer:

  • Capability map with levels, failure modes, and upgrade criteria
  • Proactive learning agenda that selects the next 1-3 capabilities to train
  • Task-level diagnosis of root causes
  • Training unit generation for recurring weaknesses
  • Evaluation gates that separate recording from mastery
  • Transfer checks on new tasks before promotion
  • Reflection routines that force self-explanation and counterexamples

Closed Loop

Run the following loop, in order:

  1. Classify the task.
  2. Retrieve relevant learnings and related capabilities.
  3. Run a pre-task risk diagnosis.
  4. Choose an execution strategy.
  5. Perform the task.
  6. Run post-task reflection.
  7. Update the capability map.
  8. Generate a training unit if weakness or recurrence is detected.
  9. Evaluate learning progress.
  10. Promote only validated strategies.

Effort Modes

Light loop

Use the lightweight pass when all of the following are true:

  • The task is familiar.
  • Consequence is low.
  • Horizon is short.
  • No active agenda focus is central to the task.
  • No failure, near-miss, or user rescue exposed a deeper weakness.
  • No learning needs training, evaluation, or promotion.

In the light loop:

  1. Retrieve only the most relevant 1-3 memory items.
  2. Name the single most likely risk and one verification check.
  3. Do the work.
  4. Log only unusually reusable lessons.
  5. Stop unless an escalation trigger fires.

Full loop

Run the full loop when any of the following is true:

  • The task is mixed or unfamiliar.
  • Consequence is medium or high and failure would matter.
  • Horizon is medium or long with many dependencies.
  • An active agenda focus is relevant.
  • A failure, near-miss, or user correction suggests a reusable weakness.
  • A similar issue repeated, transfer failed, or promotion is under consideration.
  • The task itself is deliberate practice, evaluation, or promotion review.

Escalation triggers

Escalate from light to full when any of the following appears during execution:

  • non-trivial rework
  • verification catches a real defect
  • the user had to rescue or redirect the task
  • a missed retrieval or repeated pattern appears
  • the learning looks broad enough to affect future policy
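The escalation rule is a simple any-of check. This sketch mirrors the trigger bullets above as a set; the signals argument is a hypothetical representation of what the agent observed during execution.

```python
# Trigger names mirror the escalation bullets above.
TRIGGERS = {
    "nontrivial_rework",
    "verification_caught_defect",
    "user_rescued_task",
    "missed_retrieval_or_repeat",
    "broad_policy_learning",
}

def should_escalate(signals: set[str]) -> bool:
    """Escalate from the light loop to the full loop if any trigger fired."""
    return bool(signals & TRIGGERS)

print(should_escalate({"user_rescued_task"}))  # -> True
print(should_escalate(set()))                  # -> False
```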

Control Loop

Outside the 10-step task loop, maintain an explicit learning agenda.

Run an agenda review when any of the following is true:

  • The workspace is new and no calibrated capability map exists.
  • Five meaningful cycles have passed since the last review.
  • A structural_gap or failed transfer was detected.
  • A long-horizon or unfamiliar task is about to begin.

During agenda review:

  • choose the top 1-3 capabilities to train next
  • defer lower-leverage weaknesses instead of training everything at once
  • define what evidence would retire or advance each focus
  • link each focus to existing or new training units
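The selection step of an agenda review can be sketched as a top-k ranking: score each weakness by leverage, keep the top 1-3 as active foci, defer the rest. The capability names and scores below are hypothetical.

```python
def select_foci(weaknesses: dict[str, float], k: int = 3) -> list[str]:
    """Pick the k highest-leverage capabilities to train next; defer the rest."""
    ranked = sorted(weaknesses, key=weaknesses.get, reverse=True)
    return ranked[:k]

# Hypothetical leverage scores from a capability-map review.
weaknesses = {
    "long-horizon planning": 0.9,
    "test design": 0.7,
    "naming": 0.2,
    "api retrieval": 0.6,
}
print(select_foci(weaknesses))  # top 3 by leverage; "naming" is deferred
```

Deferring low-leverage weaknesses is the point: training everything at once dilutes every focus.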

File Map

  • Main orchestration: system/coordinator.md
  • Learning agenda and review cycle: modules/learning-agenda.md
  • Diagnosis: modules/diagnose.md
  • Capability definitions and update rules: modules/capability-map.md
  • Training unit design: modules/curriculum.md
  • Learning evaluation ladder: modules/evaluator.md
  • Promotion gate: modules/promotion.md
  • Reflection protocol: modules/reflection.md

Assets and ledgers:

  • assets/LEARNINGS.md
  • assets/ERRORS.md
  • assets/FEATURE_REQUESTS.md
  • assets/CAPABILITIES.md
  • assets/LEARNING_AGENDA.md
  • assets/TRAINING_UNITS.md
  • assets/EVALUATIONS.md
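Ledgers are append-only markdown files. A hypothetical helper for appending a dated entry to assets/LEARNINGS.md is sketched below; the entry fields are an assumption, since the canonical templates live in the asset files themselves.

```python
import tempfile
from datetime import date
from pathlib import Path

def append_learning(root: Path, summary: str, state: str = "recorded") -> str:
    """Append a dated entry to assets/LEARNINGS.md under root; return the entry text."""
    entry = f"\n## {date.today().isoformat()} - {summary}\n- state: {state}\n"
    ledger = root / "assets" / "LEARNINGS.md"
    ledger.parent.mkdir(parents=True, exist_ok=True)
    with ledger.open("a", encoding="utf-8") as f:
        f.write(entry)
    return entry

# Demo in a throwaway directory so nothing touches a real workspace.
entry = append_learning(Path(tempfile.mkdtemp()), "verify before promoting")
print(entry)
```

New entries start at state "recorded", the bottom rung of the six-state ladder.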

Operating Rules

During migration from self-improving-agent

  • Treat .evolution/legacy-self-improving/ as a read-only memory layer.
  • Search the legacy files during retrieval if they exist.
  • Do not bulk-convert every old entry into the new schema on day one.
  • Normalize a legacy learning into .evolution only when it is reused, agenda-worthy, or needed for evaluation.

Before a substantial task

  • Read system/coordinator.md.
  • Check whether assets/LEARNING_AGENDA.md requires a review cycle.
  • Retrieve relevant entries from LEARNINGS, ERRORS, CAPABILITIES, and TRAINING_UNITS.
  • Identify the top 1-3 risk capabilities for this task.

After every meaningful task

  • Log incident-level observations in the memory files.
  • Diagnose the weakest capability involved.
  • Update the capability map with evidence, not vibes.
  • Refresh the learning agenda if a focus should change.
  • If the issue is recurring or high-leverage, create or revise a training unit.
  • Record evaluation status using the six-state ladder.

Before promotion

  • Read modules/evaluator.md and modules/promotion.md.
  • Confirm the strategy has passed training and succeeded in at least one transfer scenario.
  • Promote the smallest stable rule that explains the success.

When Not To Use Heavyweight Evaluation

Use a lightweight pass when all of the following are true:

  • The task was trivial.
  • No real uncertainty or failure occurred.
  • No new behavior should be generalized.

In that case, log the learning only if it is unusually reusable.

Output Contracts

When this skill is active, prefer producing these artifacts:

  • A learning agenda review when triggers fire
  • A short pre-task risk diagnosis
  • A post-task capability diagnosis
  • A TRAINING_UNIT when recurrence or weakness appears
  • An EVALUATION entry when progress is tested
  • A promotion decision with explicit evidence

Recommended Workflow

  1. Read system/coordinator.md.
  2. Load only the modules needed for the current step.
  3. Use the asset templates as the canonical output format.
  4. Keep long-term memory strict: only promote validated patterns.

Files

39 total
