Skill flagged — suspicious patterns detected
ClawHub Security flagged this skill as suspicious. Review the scan results before using.
Karpathy Autoresearch
v1.0.0
Autonomously optimize any OpenClaw skill by running it repeatedly, scoring outputs against binary evals, mutating the prompt, and keeping improvements. Based...
⭐ 0 · 63 · 0 current · 0 all-time
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
OpenClaw
Suspicious
medium confidence

Purpose & Capability
The name/description (autoresearch loop) aligns with the included code (loop.py, evaluate.py) and README. However, the skill metadata declares no required binaries or credentials, while the scripts implicitly require git and a working shell environment, and will realistically need LLM/backtest tooling and possibly API keys to implement evals — a mild incoherence between declared requirements and what is needed in practice.
Instruction Scope
SKILL.md precisely instructs the agent to locate a mutable file, create or use an eval function, initialize git, mutate the file, run evals, and commit/revert. That scope is consistent with the purpose. Caution: these instructions give the agent permission to modify and commit arbitrary files in the target workdir — if the 'mutable file' or working directory is pointed at sensitive configs or system files the loop could change them. The reference loop expects interactive or agent-driven mutations and runs arbitrary eval commands provided by the user.
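The loop these instructions describe (mutate, evaluate, commit on improvement, revert otherwise) can be sketched as follows. This is a minimal illustration under assumptions, not the skill's actual loop.py; the mutate and evaluate hooks are placeholders you would supply:

```python
import subprocess

def run(*args):
    # Helper: run a command (here, git) and fail loudly on error.
    subprocess.run(args, check=True)

def autoresearch_loop(mutate, evaluate, path, iterations=10):
    """Repeatedly mutate `path`, keeping only changes that improve the score."""
    run("git", "init")
    run("git", "add", path)
    run("git", "commit", "-m", "baseline", "--allow-empty")
    best = evaluate()                      # baseline score
    for i in range(iterations):
        mutate(path)                       # rewrite the mutable file
        score = evaluate()                 # run the eval harness
        if score > best:                   # improvement: commit and keep
            best = score
            run("git", "commit", "-am", f"iter {i}: score {score}")
        else:                              # regression: revert to last commit
            run("git", "checkout", "--", path)
    return best
```

Note that this is exactly why the working directory matters: every iteration rewrites and git-commits the file you point it at, with no further confirmation.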
Install Mechanism
No install spec (instruction-only) and included scripts are a reference implementation. This is low install risk — nothing is downloaded from arbitrary URLs. The files will exist on disk as part of the skill bundle, which is expected.
Credentials
The skill declares no required env vars, but realistic use (LLM-judge, backtest data, external scoring harnesses) will likely require API keys, data access, or other credentials that are not declared. The README also asks users to pay $99 USDT to a provided crypto address and DM a Telegram handle to unlock a 'Pro' tier — this is external monetization/contact and not a credential leak, but it is unrelated to skill functionality and could be a red flag for some users.
Persistence & Privilege
always:false (normal). The skill can be invoked autonomously (default). Combined with its ability to modify files and run arbitrary eval commands (shell subprocess with shell=True), autonomous runs could have a wide blast radius if the agent is allowed to operate on sensitive directories. The skill does not request persistent system-level privileges or modify other skills' configs.
What to consider before installing
High-level points to consider before installing or running this skill:
- Functionality is coherent: it implements the mutate→evaluate→keep loop and includes reference scripts (loop.py, evaluate.py). That said, the package metadata omits some practical requirements — verify you have git and a safe working directory available.
- Review and control the 'mutable file' and working directory: the agent will read, write, and git-commit whatever file you point it at. Do NOT point it at system configs, secrets, SSH keys, or any repository containing credentials. Run experiments in an isolated project or sandbox.
- Eval commands run arbitrary subprocesses: loop.py runs whatever eval command you supply (via shell=True) and parses numeric output. Ensure your eval harness is trusted and does not perform unwanted network calls or exfiltration. Treat the eval command as code you must review.
- evaluate.py requires you to implement score_one(); by default it raises NotImplementedError. If you implement LLM-based judging or backtests, you will likely need API keys and data access that the skill does not declare — keep credentials out of the mutable file and out of experiment commits.
- The skill's README includes a crypto payment address and Telegram contact for a paid 'Pro' tier. This is external monetization and unrelated to the skill code — be cautious when sending funds or contacting external handles.
- Best practices: run the skill in an isolated environment, inspect and possibly modify the provided scripts before running, keep a separate git repo or sandbox for experiments, and avoid letting the agent autonomously run experiments on repositories containing sensitive data. If you plan to use LLM judges or external services, create and scope API keys appropriately (principle of least privilege).

Like a lobster shell, security has layers — review code before you run it.
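The shell=True pattern called out above can be illustrated with a short sketch (hypothetical code, not the skill's actual loop.py): the user-supplied command string is handed to the shell verbatim, and the last line of stdout is parsed as the score.

```python
import subprocess

def run_eval(cmd: str, timeout: int = 60) -> float:
    # shell=True hands cmd to the shell verbatim: pipes, globs, and
    # command substitution all work, which is why an untrusted eval
    # command can do anything the invoking user can.
    result = subprocess.run(
        cmd, shell=True, capture_output=True, text=True, timeout=timeout
    )
    # Parse the last non-empty stdout line as a numeric score.
    lines = [ln for ln in result.stdout.splitlines() if ln.strip()]
    return float(lines[-1])
```

Whatever you pass as the eval command (e.g. `run_eval("python evaluate.py")`) runs with your full privileges, so review it as carefully as you would any code.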
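The evaluate.py contract described above (a score_one() that raises NotImplementedError until you implement it) might look roughly like this. The JUDGE_API_KEY name is illustrative; the point is that credentials should come from the environment, never from files the loop mutates or commits:

```python
import os

def score_one(prompt: str, output: str) -> int:
    """Return 1 (pass) or 0 (fail) for a single eval case.

    Implement this with your own judge. If you use an LLM judge,
    read the key from the environment so it never lands in the
    mutable file or in an experiment commit.
    """
    api_key = os.environ.get("JUDGE_API_KEY")  # illustrative name
    if api_key is None:
        raise NotImplementedError("score_one: supply a judge implementation")
    ...  # call your judge here and map its verdict to 0/1

def score_all(cases) -> float:
    """Aggregate binary evals into a fraction passed."""
    results = [score_one(p, o) for p, o in cases]
    return sum(results) / len(results) if results else 0.0
```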
latest: vk9746v9wb1zaee0rxfkt0cpa8h83dpcz
