Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Autooptimise

v0.1.0

Autonomously optimise any OpenClaw skill using a benchmark-driven experiment loop. Scores skill outputs 0-10 across 4 dimensions, identifies the lowest-scori...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for wealthvisionai-source/autooptimise.

Prompt Preview: Install & Setup
Install the skill "Autooptimise" (wealthvisionai-source/autooptimise) from ClawHub.
Skill page: https://clawhub.ai/wealthvisionai-source/autooptimise
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install autooptimise

ClawHub CLI


npx clawhub@latest install autooptimise
Security Scan

VirusTotal: Suspicious

OpenClaw: Suspicious (medium confidence)
Purpose & Capability
Name and description (optimise other skills) match the instructions (read a target SKILL.md, run benchmark tasks, propose diffs). However README/SKILL.md assert "No external dependencies" / "no network calls beyond your existing model provider" while multiple places describe live validation and real tool/API calls (wttr.in, gh CLI). That contradiction between claimed constraints and actual behaviour is unexpected.
Instruction Scope
Runtime instructions explicitly tell the agent to read target skill files, send prompts that 'activate the target skill', run live tool calls where possible, and apply diffs to the skill file (only after approval). Those actions are necessary for an optimiser, but they grant the agent broad capability to exercise the target skill (which itself may read env vars, call network endpoints, or run tools). The docs also reference filesystem paths (e.g. ~/.openclaw/skills/...) despite the skill declaring no required config paths—this implicit file I/O should be made explicit.
Install Mechanism
Instruction-only (no install, no binaries, no extracted archives). This minimizes supply-chain risk since nothing is written by an installer. The only code is runtime instructions and bundled benchmark files.
Credentials
The skill declares no environment variables or credentials (good), but it implicitly relies on access to your OpenClaw installation, installed tools (gh, wttr.in access), and whatever model provider you already have configured. It does not declare required config paths even though it expects to read and (with approval) write other skills' SKILL.md files—this implicit need for filesystem access should be disclosed and considered.
Persistence & Privilege
always is false and autonomous invocation is permitted (the platform default). The skill does not demand permanent inclusion or hidden privileges, and it documents a human approval gate before applying changes. Scheduling/heartbeat suggestions could enable periodic runs if the user configures them, so users should opt into that intentionally.
What to consider before installing
This skill conceptually fits its purpose but has a few red flags you should consider before running it against real skills:

  • The README/SCHEMA claims "no external network calls", but the tool explicitly describes live validation (wttr.in, gh) and running real tool calls; assume the loop may trigger network and CLI activity. If you need offline-only behaviour, don't run it until that is clarified.
  • The agent will read (and, with your approval, write) other skills' SKILL.md files. Inspect target SKILL.md files first for any sensitive content, and avoid running autooptimise on skills that access secrets, credentials, or perform destructive actions.
  • Require explicit human approval for every proposed change (the skill states this, but enforce it operationally). Prefer to run initial experiments in a sandbox or test environment, not against production skills or accounts.
  • If you plan to use the heartbeat/scheduling suggestions, be explicit about limiting scope (which skills may be optimised) and frequency to avoid unexpected automated runs.

If you want to proceed, ask the author to clarify the network claim versus live validation, and confirm the exact filesystem paths the skill will access. Running one dry/manual iteration on a harmless skill first (e.g. a simple local test skill) is recommended to verify behaviour.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🔬 Clawdis
Tags: Agents · Benchmark Driven skill · Optimise skills · latest
102 downloads
0 stars
1 version
Updated 1mo ago
v0.1.0
MIT-0

autooptimise

Autonomous benchmark-driven skill optimisation for OpenClaw. Inspired by Andrej Karpathy's autoresearch — the same modify → test → score → keep/discard loop, applied to agent skill quality instead of GPU training.

Trigger Phrases

  • "optimise my weather skill"
  • "run autooptimise on [skill-name]"
  • "benchmark my [skill-name] skill"
  • "improve my skill overnight"

Key Files

| File | Purpose |
| --- | --- |
| benchmark/tasks.json | Test task suite (prompts + expected qualities) |
| benchmark/scorer.md | LLM judge scoring rubric |
| runner/run_experiment.md | Autonomous loop instructions (load this next) |
| runner/experiment_log.md | Auto-created run log (gitignored) |
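The page does not document the schema of benchmark/tasks.json, but from the description above ("prompts + expected qualities") one entry plausibly looks like the sketch below. All field names here are assumptions for illustration, not the skill's actual schema.

```python
import json

# Hypothetical shape of a single benchmark task: a prompt for the target
# skill plus the qualities the LLM judge should check in the response.
example_task = {
    "id": "weather-basic-01",            # assumed identifier field
    "prompt": "What's the weather in London right now?",
    "expected_qualities": [
        "calls the weather tool exactly once",
        "answer is a single short sentence",
    ],
}

print(json.dumps(example_task, indent=2))
```

Inspect the bundled benchmark/tasks.json before running the loop to see the real field names the scorer expects.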

How to Run

  1. Read runner/run_experiment.md — it contains the full loop instructions
  2. Confirm the target skill with the user if not specified
  3. Execute the loop (max 3 iterations)
  4. Present proposed changes for human approval — never auto-apply
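The four steps above amount to the modify → test → score → keep/discard loop described in the overview. A minimal Python sketch, assuming hypothetical helpers (run_benchmark, propose_change, ask_approval) that are not part of the skill's actual API:

```python
MAX_ITERATIONS = 3  # v0.1 hard limit from the safety rules

def optimise(skill_path, run_benchmark, propose_change, ask_approval):
    """Benchmark-driven keep/discard loop (illustrative sketch only)."""
    best_score = run_benchmark(skill_path)            # baseline, 0-10
    for _ in range(MAX_ITERATIONS):
        diff = propose_change(skill_path)             # targets weakest dimension
        candidate_score = run_benchmark(skill_path, diff=diff)
        if candidate_score <= best_score:
            continue                                  # discard: no improvement
        if ask_approval(diff):                        # never auto-apply
            best_score = candidate_score              # keep the approved change
    return best_score
```

The key safety property is that the approval callback sits between scoring and keeping: an improvement that the human rejects is simply dropped.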

Scoring

Use the best available LLM judge model (prefer a strong reasoning model). Score each task 0–10 on:

  • Accuracy — correct answer / correct tool called
  • Conciseness — no padding, no unnecessary text
  • Tool usage — right tool, right parameters
  • Formatting — output matches expected format

Full rubric: benchmark/scorer.md
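One plausible way to combine the four dimension scores, assuming equal weights and a simple mean (the actual rubric is in benchmark/scorer.md, so treat this as a sketch, not the skill's real formula):

```python
DIMENSIONS = ("accuracy", "conciseness", "tool_usage", "formatting")

def task_score(judge_scores: dict) -> float:
    """Average the four 0-10 dimension scores into one task score."""
    assert set(judge_scores) == set(DIMENSIONS)
    return sum(judge_scores[d] for d in DIMENSIONS) / len(DIMENSIONS)

def lowest_dimension(all_tasks: list) -> str:
    """Find the weakest dimension across the suite, i.e. the one the
    next proposed change should target."""
    totals = {d: sum(t[d] for t in all_tasks) for d in DIMENSIONS}
    return min(totals, key=totals.get)
```

For example, a task judged accuracy 8, conciseness 6, tool usage 10, formatting 8 averages to 8.0, and a suite that is consistently weakest on conciseness would make that the next optimisation target.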

Safety Rules

  • Never auto-apply changes. Always present a diff and wait for explicit human approval.
  • Never modify benchmark/tasks.json or benchmark/scorer.md during a run.
  • Never exceed 3 iterations per run in v0.1.
  • Log every action to runner/experiment_log.md.
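The first safety rule, the human approval gate, can be sketched as follows; the function names and prompt text are illustrative placeholders, not the skill's implementation:

```python
def present_and_apply(diff_text: str, apply_fn, prompt_fn=input) -> bool:
    """Show the proposed diff and apply it only on an explicit 'yes'."""
    print("Proposed change to SKILL.md:")
    print(diff_text)
    answer = prompt_fn("Apply this diff? [yes/no] ").strip().lower()
    if answer == "yes":
        apply_fn(diff_text)   # write happens only after approval
        return True
    return False              # anything other than 'yes' discards the diff
```

Note the default-deny shape: any answer other than an explicit "yes" leaves the target skill untouched.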
