Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Autoresearch

Automatically improve OpenClaw skills, prompts, or articles through iterative mutation-testing loops. Inspired by Karpathy's autoresearch. Use when user says...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 current installs · 0 all-time installs
Security Scan
VirusTotal: Suspicious
OpenClaw: Benign (high confidence)
Purpose & Capability
The skill claims to iteratively improve SKILL.md, prompts, or articles. The provided SKILL.md and the helper script (scripts/run_eval.py) consistently implement that: they read/write SKILL.md, create .snapshots, generate checklists, and score mutations. There are no unrelated environment variables, binaries, or network endpoints required, so requested capabilities are proportional to the described purpose.
Instruction Scope
The runtime instructions explicitly require reading and (for Skill mode) writing the target SKILL.md at ~/.openclaw/skills/<skill-name>/SKILL.md and creating snapshots in a .snapshots directory. That is coherent for a tool that edits skills, but it does mean the skill will modify user files. Prompt/Article modes state they should avoid writing to disk unless needed; the included script supports file I/O only for Skill mode. User confirmation is described in the workflow (pause after batches), but actual enforcement depends on the agent implementation — the user should expect and approve filesystem changes before running.
Install Mechanism
There is no install spec and no external downloads. This is an instruction-only skill with an optional helper Python script included. No archives or network fetches are performed by the provided files.
Credentials
The skill declares no required environment variables, no credentials, and no config paths other than the target skill directory under the user's home (~/.openclaw/skills). There are no indications of requests for unrelated secrets or cloud credentials.
Persistence & Privilege
always:false (normal). The helper will create snapshots in ~/.openclaw/skills/<skill>/.snapshots and will write changes to SKILL.md when operating in Skill mode. This is expected behavior for a mutation-based editor, but it does constitute persistent modification of user skill files — the user should ensure they want this and review snapshots. The skill does not modify other skills' config beyond writing to the target skill directory.
Assessment
This skill appears internally consistent with its purpose, but it will read and (for Skill mode) modify SKILL.md files under ~/.openclaw/skills and create a .snapshots directory there. Before installing or running it:

  1. Inspect the included scripts/run_eval.py yourself (it is the only code file) to confirm you are comfortable with its behavior.
  2. Run the tool on a copy of a skill (not production) to observe changes and scoring behavior.
  3. Ensure the agent asks for and obtains your explicit confirmation before it writes to any skill path or proceeds beyond the first batch.
  4. If you do not want any disk changes, avoid giving it Skill mode access or restrict the agent's filesystem permissions.

No network endpoints or credentials are requested by the skill, which reduces exfiltration risk, but filesystem modification is intrinsic to its function — treat snapshots as your first line of rollback and verify them before accepting changes.

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.0
Download zip
latest · vk9740pe6ygds0t7t93rv0masps8385d5

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

SKILL.md

autoresearch-pro

Overview

Automatically improve any OpenClaw skill, prompt, or article through iterative mutation-testing: small edits → run test cases → score with checklist → keep improvements, discard regressions.

Inspired by Karpathy/autoresearch.

Supports three optimization modes:

| Mode | Input | Output |
| --- | --- | --- |
| Skill | Path to a skill directory | Improved SKILL.md |
| Prompt | A prompt text string | Improved prompt |
| Article | An article/document text | Improved article |

Workflow

Step 1 — Identify Mode and Input

Ask the user to confirm:

  • Mode 1 — Skill: User says "optimize [skill-name]" or provides a skill path
  • Mode 2 — Prompt: User says "optimize this prompt" or pastes a prompt
  • Mode 3 — Article: User says "improve this article" or pastes article text

For Skill mode, resolve the skill path to ~/.openclaw/skills/<skill-name>/SKILL.md. For Prompt/Article mode, keep the text in context (do not write to disk unless needed).

Step 2 — Generate Checklist (10 Questions)

Read the target content first. Then generate 10 diverse, specific yes/no checklist questions relevant to the content type:

For Skill mode:

| # | Dimension | What to Check |
| --- | --- | --- |
| 1 | Description clarity | Is the frontmatter description precise and actionable? |
| 2 | Trigger coverage | Does it cover the main real-world use cases? |
| 3 | Workflow structure | Are steps clearly sequenced and unambiguous? |
| 4 | Error guidance | Does it handle error states and edge cases? |
| 5 | Tool usage accuracy | Are tool names and parameters correct for OpenClaw? |
| 6 | Example quality | Do examples reflect real usage patterns? |
| 7 | Conciseness | Is content free of redundant repetition? |
| 8 | Freedom calibration | Is instruction specificity appropriate? |
| 9 | Reference quality | Are references and links accurate? |
| 10 | Completeness | Are all sections filled with real content? |

For Prompt mode (10 tailored questions):

| # | Dimension | What to Check |
| --- | --- | --- |
| 1 | Goal clarity | Does the prompt state a clear, specific goal? |
| 2 | Role/tone | Is the desired role or tone specified? |
| 3 | Input format | Is the input format clearly described? |
| 4 | Output format | Is the expected output format specified? |
| 5 | Constraints | Are key constraints and boundaries stated? |
| 6 | Context sufficiency | Is enough context provided to avoid hallucination? |
| 7 | Edge cases | Does it handle ambiguous or edge case inputs? |
| 8 | Conciseness | Is it free of redundant or contradictory instructions? |
| 9 | Actionability | Are instructions concrete and actionable vs. vague? |
| 10 | Completeness | Are all necessary elements for the task present? |

For Article mode (10 tailored questions):

| # | Dimension | What to Check |
| --- | --- | --- |
| 1 | Title quality | Does the title clearly convey the main value? |
| 2 | Opening hook | Does the opening grab attention and set expectations? |
| 3 | Logical structure | Are ideas logically organized (not random)? |
| 4 | Argument clarity | Are claims supported with evidence or reasoning? |
| 5 | Conciseness | Is unnecessary padding or repetition removed? |
| 6 | Transition flow | Do paragraphs/sections flow smoothly? |
| 7 | Closing strength | Does the conclusion summarize and inspire action? |
| 8 | Tone consistency | Is the tone consistent throughout? |
| 9 | Readability | Is sentence/paragraph length varied appropriately? |
| 10 | Audience match | Does language match the target audience level? |

Present the 10 questions, numbered 1-10. Ask the user to select which ones to activate (e.g., "use questions 1, 3, 5, 7"). Default: use all 10 if user doesn't specify.
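The selection step can be sketched as a small parser. `parse_question_selection` is a hypothetical helper (not part of the skill's files) that pulls question numbers out of a reply like "use questions 1, 3, 5, 7" and falls back to all 10 when none are given:

```python
import re

def parse_question_selection(reply: str, total: int = 10) -> list[int]:
    """Extract selected checklist question numbers from a free-text reply.

    Returns all question numbers when the reply names none in range.
    """
    picked = {int(n) for n in re.findall(r"\d+", reply) if 1 <= int(n) <= total}
    return sorted(picked) or list(range(1, total + 1))
```

Out-of-range numbers are silently dropped, so "questions 3 and 12" activates only question 3.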

Step 3 — Prepare Test Cases

  • Skill mode: Generate 3-5 realistic prompts a user would send when using the skill
  • Prompt mode: Generate 3-5 test inputs that the prompt would process
  • Article mode: Generate 3-5 ways the article might be read or consumed

Store test cases in context — do not write to disk.

Step 4 — Run Autoresearch Loop

Loop configuration:

  • Rounds per batch: 30
  • Max total rounds: 100
  • Pause: After every 30 rounds, show summary and ask user to continue or stop
  • Stop conditions: User says stop, OR 100 rounds completed

Per-round procedure:

  1. Mutate: Make ONE small edit to the target content:

    • Skill mode: edit SKILL.md
    • Prompt mode: edit the prompt string
    • Article mode: edit the article text
  2. Test: For each test case, simulate what output the content would produce.

  3. Score: Apply each active checklist question (0 or 1 per question). Score = (passed / total) × 100.

  4. Decide: If new score ≥ best score → keep the mutation. If lower → revert.

  5. Log: Round number, mutation type, score, keep/revert decision.
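The per-round procedure above can be sketched as a short Python loop. This is an illustrative skeleton under simplifying assumptions: checklist questions are modeled as plain predicates over the content and its test cases (rather than real agent evaluations), and all names (`score`, `autoresearch_loop`, `ask_continue`) are invented for the sketch:

```python
def score(content, checklist, test_cases) -> float:
    """Score = (passed checklist questions / total) x 100."""
    passed = sum(1 for question in checklist if question(content, test_cases))
    return 100.0 * passed / len(checklist)

def autoresearch_loop(content, mutate, checklist, test_cases,
                      batch_size=30, max_rounds=100, ask_continue=lambda log: True):
    """Mutate-test-score loop: keep improvements (and ties), revert regressions."""
    best, best_score = content, score(content, checklist, test_cases)
    log = []
    for round_no in range(1, max_rounds + 1):
        candidate = mutate(best)                         # ONE small edit per round
        new_score = score(candidate, checklist, test_cases)
        keep = new_score >= best_score                   # >= : keep ties too
        if keep:
            best, best_score = candidate, new_score      # else: revert by keeping old best
        log.append((round_no, new_score, keep))
        if round_no % batch_size == 0 and not ask_continue(log):
            break                                        # user chose to stop at batch boundary
    return best, best_score, log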

Mutation types (pick one per round):

| Type | Description |
| --- | --- |
| A | Add a constraint rule |
| B | Strengthen trigger/coverage |
| C | Add a concrete example |
| D | Tighten vague language |
| E | Improve error/edge case handling |
| F | Remove redundant content |
| G | Improve transitions |
| H | Expand a thin section |
| I | Add cross-reference |
| J | Adjust degree-of-freedom |

Step 5 — Report Results

After each batch (30 rounds):

```
Batch N (rounds X-Y):
  Best score: XX%
  Mutations kept: N  |  Reverted: N
  Most effective types: [list top 2-3]
Accumulated improvements: [summary]
Continue? (yes/stop)
```

After full completion:

  • Original score vs. final score
  • Top 3 most impactful mutations
  • Final improved content (inline or diff)
  • File path (skill mode only)

Mutation Strategy Reference

High-impact, low-risk changes:

  • Adding explicit constraints where the content is vague
  • Expanding coverage to cover edge cases
  • Adding concrete examples to abstract instructions
  • Tightening soft language ("try to" → "must")

Avoid in one round:

  • Large rewrites of entire sections
  • Multiple unrelated changes at once
  • Changing fundamental scope or purpose

See references/mutation_strategies.md for the full strategy guide.


Mode Selection Quick Reference

| User says | Mode |
| --- | --- |
| "optimize [skill]" / "autoresearch [skill]" | Skill |
| "optimize this prompt" / "improve my prompt" | Prompt |
| "polish this article" / "improve this article" | Article |
| "optimize this document" | Article |

Default to Prompt mode if the input is a text string without a skill path.
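A rough keyword heuristic over the quick-reference table might look like the following; `detect_mode` is a hypothetical helper, and a real agent would use richer intent detection than substring matching:

```python
def detect_mode(user_message: str, has_skill_path: bool = False) -> str:
    """Map a user request onto Skill / Prompt / Article mode.

    Defaults to Prompt for plain text without a skill path.
    """
    msg = user_message.lower()
    if has_skill_path or "autoresearch" in msg or "skill" in msg:
        return "Skill"
    if "article" in msg or "document" in msg:
        return "Article"
    return "Prompt"  # default for bare text input
```

Note the Skill branch relies on the explicit path flag because real skill names rarely contain the word "skill".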

Files

3 total
