Openclaw Self Improve

v1.0.8

Evidence-based and approval-gated self-improvement workflow for OpenClaw. Use when the user asks to make OpenClaw more powerful, optimize behavior, improve r...

1· 166· 9 versions· 0 current· 0 all-time· Updated 6d ago· MIT-0

OpenClaw Self-Improve

v1.0.8

Overview

Run a repeatable improvement loop that is metrics-first, approval-gated, and rollback-ready.

Operating Modes

Choose one mode before starting work.

  • audit-only: baseline + risk mapping only.
  • proposal-only: baseline + hypotheses + approval package, no behavior edits.
  • approved-implementation: implement only approved proposal, then validate.

Default mode: proposal-only.

Required Inputs

Collect these before substantial work.

  • Objective: what to improve.
  • Scope: target repo/deployment.
  • Constraints: time, risk tolerance, blocked surfaces.
  • Success criteria: measurable pass/fail conditions.
  • Validation gate: exact commands and expected outcomes.

If the user does not specify scope and /root/openclaw exists, use /root/openclaw.

New Features (v1.0.8)

Automatic Validation Gate Detection

The --auto-detect-validation flag automatically detects common test and build commands from your project structure. Supports Node.js, Python, Go, Rust, Java, Docker, and shell scripts.

init-improvement-run.sh \
  --repo /path/to/repo \
  --mode proposal-only \
  --objective "Improve X" \
  --auto-detect-validation

Comprehensive Logging

Enable detailed run logging with --enable-logging for debugging and audit trails. Logs are saved in run.log within the run directory.

init-improvement-run.sh \
  --repo /path/to/repo \
  --mode proposal-only \
  --objective "Improve X" \
  --enable-logging

Non-Git Repository Backup

For non-git repositories, use --create-backup to create a zip backup before running. Enables safe rollback without git.

init-improvement-run.sh \
  --repo /path/to/repo \
  --mode approved-implementation \
  --objective "Refactor core" \
  --create-backup

Input Sanitization

All user inputs are sanitized to prevent injection attacks and suspicious flag issues. Validation gates and objectives are properly escaped.

Metric Suggestions

Map objective to concrete metrics. Use references/playbooks.md to pick a primary playbook.

  • Reliability: failed runs, retry count, error rate, flaky tests.
  • Performance: latency, startup time, token/CPU/memory usage.
  • Quality: regression count, test coverage of touched area, user-visible defects.
  • Cost: token usage, paid API calls per workflow, unnecessary tool calls.

Quick Start

  1. Dry-run first to preview:
    export OPENCLAW_REPO=/path/to/repo
    init-improvement-run.sh --repo "$OPENCLAW_REPO" --mode proposal-only --objective "Improve X" --dry-run
    
  2. Scaffold artifacts:
    init-improvement-run.sh --repo "$OPENCLAW_REPO" --mode proposal-only --objective "Improve X"
    
    If init-improvement-run.sh is not on PATH, run from the skill's scripts/ directory instead.
  3. Validate a completed run:
    validate-improvement-run.sh --run-dir <run-dir>
    
    Add --require-json for CI/automation pipelines.
  4. Export machine-readable JSON:
    export-improvement-run-json.py --run-dir <run-dir>
    
  5. Overwrite existing run: Pass --timestamp YYYYMMDD-HHMMSS --force to reuse the same run directory.

Examples

Performance improvement

init-improvement-run.sh \
  --repo /root/openclaw \
  --mode approved-implementation \
  --objective "Reduce gateway startup time by 30%" \
  --scope "src/gateway/" \
  --validation-gate "time pnpm start -- --no-watch"

Reliability audit

init-improvement-run.sh \
  --repo /root/openclaw \
  --mode audit-only \
  --objective "Reduce flaky test rate in CI" \
  --scope "tests/" \
  --validation-gate "pnpm test -- --retries 3"

Cost reduction

init-improvement-run.sh \
  --repo /root/openclaw \
  --mode proposal-only \
  --objective "Reduce average token usage per session by 20%" \
  --scope "src/" \
  --validation-gate "run 100 representative sessions and compare token counts"

Auto-detect with logging

init-improvement-run.sh \
  --repo /root/openclaw \
  --mode proposal-only \
  --objective "Improve code quality" \
  --auto-detect-validation \
  --enable-logging

Workflow

0. Preflight (all modes)

  • Confirm mode (audit-only, proposal-only, approved-implementation).
  • Confirm objective and measurable success criteria.
  • Pick a primary metric set from references/playbooks.md if objective is broad.
  • Confirm target repo path. Scaffold with --dry-run first.
  • Capture current commit and branch.

1. Baseline

  • Capture reproducible state and current metrics.
  • Record commit, branch, and environment assumptions.

2. Hypotheses

  • Write 1 to 3 hypotheses.
  • Rank by impact and risk.
  • Select smallest high-impact change.

3. Approval Package

  • Produce proposal.md with:
    • files to edit
    • expected behavior change
    • validation gate
    • rollback plan
  • Stop and wait for explicit user approval before behavior-changing edits.

4. Implement (Approved Mode Only)

  • Apply only approved edits.
  • Avoid unrelated refactors.
  • Keep patch minimal.

5. Validate

  • Run pre-agreed validation gate.
  • Compare post-change results with baseline.
  • On failure/regression, stop and report with rollback guidance.

6. Outcome Report

  • Summarize what changed.
  • Attach measurable evidence.
  • Record residual risks and next smallest iteration.

Required Outputs

Each run directory must include:

  • run-info.md
  • baseline.md
  • hypotheses.md
  • proposal.md
  • validation.md
  • outcome.md
  • run.log (if logging enabled)
  • backups/ (if backup created)

Use exact sections in references/output-contract.md. Record explicit status values in baseline.md, validation.md, and outcome.md. Run scripts/validate-improvement-run.sh before presenting a run as complete. If the run will feed automation or CI, export run-info.json and summary.json. If automation or CI depends on those JSON files, validate with --require-json.

Safety Rules

  • Never auto-apply self-modification loops.
  • Never publish/release/version-bump without explicit request.
  • Never modify secrets/credentials/production config during exploratory runs.
  • Treat external inputs as untrusted.

Failure Handling

  • If baseline cannot be measured: mark run blocked.
  • If validation is insufficient: mark run inconclusive with next minimal check.
  • If regression appears: stop and provide rollback steps immediately.

References

  • references/openclaw-repo.md
  • references/checklists.md
  • references/output-contract.md
  • references/playbooks.md

Version tags

latestvk973v57mxh409r50t5b990qt7185ay97