Auto Improvement Orchestrator Skill
Security checks across malware telemetry and agentic risk
Overview
The skill’s auto-improvement purpose is coherent, but it can execute evaluated code, edit skills, read Claude session logs, store feedback snippets, and use a Claude account, so it needs careful review before use.
Install only if you are comfortable running an auto-improvement tool with access to local skill files and, optionally, Claude session logs. Prefer a virtual environment, a disposable git branch or container, mock mode first, narrow log paths instead of all `~/.claude/projects`, and manual review of generated diffs, receipts, feedback stores, and LLM costs.
VirusTotal
VirusTotal findings are pending for this skill version.
Risk analysis
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
A malicious or buggy skill being evaluated could run code on your machine under your account.
Python's `importlib` loader method `exec_module` runs a module's top-level code at load time. In a skill discriminator/evaluator that handles candidate skills, this executes untrusted candidate code unless the load happens inside a sandbox.
spec.loader.exec_module(self.skill_module)
Evaluate only trusted skills, or run evaluations in a sandbox/container with limited file and network access; document and enforce allowed paths before dynamically loading candidate code.
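The path-allowlist part of this recommendation can be sketched as follows. This is a minimal illustration, not the skill's actual loader: `is_allowed`, `load_candidate`, and `allowed_roots` are hypothetical names. An allowlist only restricts which files may be loaded; `exec_module` still runs the module's top-level code, so a sandbox or container is still needed for untrusted skills.

```python
import importlib.util
from pathlib import Path


def is_allowed(candidate: Path, allowed_roots: list[Path]) -> bool:
    """Return True only if candidate resolves to a path inside an allowed root."""
    resolved = candidate.resolve()  # collapses symlinks and ".." segments
    # Path.is_relative_to requires Python 3.9+
    return any(resolved.is_relative_to(root.resolve()) for root in allowed_roots)


def load_candidate(candidate: Path, allowed_roots: list[Path]):
    """Load a candidate module only after the allowlist check passes."""
    if not is_allowed(candidate, allowed_roots):
        raise PermissionError(f"refusing to load {candidate}: outside allowed roots")
    spec = importlib.util.spec_from_file_location(candidate.stem, candidate)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build an import spec for {candidate}")
    module = importlib.util.module_from_spec(spec)
    # Still executes the candidate's top-level code; the allowlist is not a sandbox.
    spec.loader.exec_module(module)
    return module
```

Resolving paths before comparing them is what stops a symlink or `..` segment from escaping the allowed directory.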
Private prompts, project names, corrections, or accidental secrets from local sessions could be retained and reused to steer future skill changes.
The design reads broad local Claude Code session logs, stores derived user-message snippets persistently, and uses that feedback in later improvement loops.
`~/.claude/projects/**/*.jsonl` ... `Written to feedback-store/feedback.jsonl (append-only)` ... `user_message_snippet`
Do not point it at all of `~/.claude/projects` unless you have reviewed the logs; add redaction, path allowlists, retention limits, and a review step before feedback is reused.
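A redaction step of the kind suggested here might look like the sketch below. The function name `redact`, the patterns in `SECRET_PATTERNS`, and the 200-character cap are assumptions for illustration; none of them come from the skill, and real logs will need patterns tuned to the secrets that actually appear in them.

```python
import re

# Hypothetical patterns; extend them for the secret formats present in your logs.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{16,}"),                      # API-key-like tokens
    re.compile(r"(?i)(password|token|secret)\s*[:=]\s*\S+"),   # key=value style secrets
]


def redact(snippet: str, max_len: int = 200) -> str:
    """Mask secret-like substrings, then truncate the snippet before it is stored."""
    for pattern in SECRET_PATTERNS:
        snippet = pattern.sub("[REDACTED]", snippet)
    return snippet[:max_len]
```

Pattern-based redaction misses anything it was not written for, so it complements a manual review of the feedback store rather than replacing it.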
Approved or low-risk changes can still alter your local skill files and affect later agent behavior.
The skill can automatically apply some local document changes, but the artifacts disclose that behavior and describe a human-review boundary for higher-risk candidates.
Only `low`-risk document candidates auto-execute; everything else enters human review.
Run it in a version-controlled branch, inspect diffs and receipts, and use dry-run or manual review before accepting generated changes.
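Outside of git, a dry-run preview of a proposed change can be produced with the standard library alone. The sketch below is illustrative: `preview_change` is a hypothetical helper, not part of the skill, and it only renders a diff without writing anything to disk.

```python
import difflib


def preview_change(original: str, proposed: str, name: str = "SKILL.md") -> str:
    """Return a unified diff of a proposed edit without touching any file."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        proposed.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(diff)
```

An empty return value means the candidate changes nothing, which is itself worth flagging before accepting it.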
Running non-mock evaluations may send skill/task content to the LLM provider and incur costs.
The skill can use the local authenticated Claude CLI for evaluation, which may consume the user's Claude account quota or paid usage.
`--standalone --mock # remove --mock for real claude -p` ... ``LLM-as-judge (`claude -p`, ~$0.5/eval)``
Use mock mode unless you intend to spend provider credits, confirm which account the CLI is logged into, and set cost or iteration limits.
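One way to enforce cost and iteration limits is a small guard that is charged before each real evaluation. This is a sketch under stated assumptions: `EvalBudget` and `BudgetExceeded` are hypothetical names, and the default of $0.5 per evaluation is taken from the approximate figure quoted above, not from measured billing.

```python
class BudgetExceeded(RuntimeError):
    """Raised when another evaluation would exceed the configured limits."""


class EvalBudget:
    def __init__(self, max_cost_usd: float, max_evals: int,
                 cost_per_eval_usd: float = 0.5):  # ~$0.5/eval is an estimate
        self.max_cost_usd = max_cost_usd
        self.max_evals = max_evals
        self.cost_per_eval_usd = cost_per_eval_usd
        self.evals = 0

    @property
    def spent_usd(self) -> float:
        """Estimated spend so far, based on the per-evaluation estimate."""
        return self.evals * self.cost_per_eval_usd

    def charge(self) -> None:
        """Call before each non-mock evaluation; raises instead of overspending."""
        next_count = self.evals + 1
        if next_count > self.max_evals:
            raise BudgetExceeded(f"evaluation limit of {self.max_evals} reached")
        if next_count * self.cost_per_eval_usd > self.max_cost_usd:
            raise BudgetExceeded(f"cost limit of ${self.max_cost_usd} reached")
        self.evals = next_count
```

Because the guard works from an estimate, the provider's own usage dashboard remains the authoritative record of what was spent.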
You are trusting the external repository contents and current package versions at install time.
The setup path is user-directed, but it relies on an external repository and unpinned package installation while the registry metadata lists no install spec.
git clone https://github.com/lanyasheng/auto-improvement-orchestrator-skill.git && cd auto-improvement-orchestrator-skill && pip install pyyaml pytest
Review the repository, pin dependency versions, and install in a virtual environment before running the scripts.
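As a quick check that dependencies are actually pinned, a requirements file can be scanned for entries without an exact `==` version. The helper below is a hypothetical sketch, not something shipped with the skill. It handles only simple one-requirement-per-line files and treats any line without `==` as unpinned.

```python
def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that lack an exact '==' version pin."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned
```

For the install command quoted above, both `pyyaml` and `pytest` would be reported, since neither carries a version.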
A mistaken improvement loop could propagate low-quality instructions across several local skills.
The skill is explicitly designed for batch and continuous auto-improvement, so a bad scoring signal or accepted change could affect multiple skills if not contained.
Batch-improve multiple skills (autoloop runs continuously)
Limit targets, set low iteration counts at first, keep backups, and review changes before promoting them to shared or production skill directories.
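Capping targets and backing up before a run could be sketched as below. `select_targets` and `backup_skill` are hypothetical helpers, and the default cap of one target is an assumption chosen to keep a first run small.

```python
import shutil
from pathlib import Path


def select_targets(candidates: list[str], approved: set[str],
                   max_targets: int = 1) -> list[str]:
    """Keep only explicitly approved skills, and at most max_targets of them."""
    chosen = [name for name in candidates if name in approved]
    return chosen[:max_targets]


def backup_skill(skill_dir: Path, backup_root: Path, label: str) -> Path:
    """Copy a skill directory aside before the improvement loop modifies it."""
    dest = backup_root / f"{skill_dir.name}-{label}"
    # copytree raises FileExistsError if dest exists, so a backup is never overwritten.
    shutil.copytree(skill_dir, dest)
    return dest
```

Restoring is then a matter of copying the backup directory back, which is simpler to reason about than undoing a series of automated edits.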
