Skill v0.8.9
VirusTotal security
autoagent · External malware reputation and Code Insight signals for this exact artifact hash.
Scanner verdict
Apr 30, 2026, 6:05 AM
- Hash
- 9dc6da216ade77af2e1aadcf3abb37e1ec1394b2e8eedb17bddf15aca783eced
- Source
- palm
- Code Insight
- Type: OpenClaw Skill. Name: autoagent. Version: 0.8.9.

The user wants me to analyze a set of files for an OpenClaw AgentSkills skill bundle called "autoagent". I need to classify it as `benign`, `suspicious`, or `malicious` based on the provided criteria.

- `_meta.json`: Metadata about the skill (owner, slug, version, published date).
- `SKILL.md`: The main instruction file for the AI agent. It describes a "Karpathy-style training loop" to optimize agent guidance (prompts).
- `templates/fixtures.json`: A template for test cases.
- `README.md`: Documentation for the skill.
- `TESTING.md`: Instructions for testing the skill.
- `examples/simple-news-prompt.md`: An example prompt for testing.
- `iteration-prompt.md`: Instructions for the iteration phase (cron job).
- `references/openclaw-integration.md`: Documentation on how it integrates with OpenClaw.
- `setup-prompt.md`: Instructions for the setup phase.
- `templates/scores.md`: A template for the score history.

The skill aims to automate the optimization of prompts/guidance. It does this by:

1. Setting up a sandbox directory.
2. Defining success criteria.
3. Running a loop via cron every 5 minutes.
4. In each loop:
   - Analyzing current guidance.
   - Proposing an edit.
   - Applying the edit.
   - Running a subagent with the new guidance and test cases.
   - Scoring the result.
   - Keeping or discarding the change based on the score.
   - Logging the result.

- **File System Access**: It creates directories and files (sandbox). It reads and writes files within that sandbox. It also asks for an absolute path for the sandbox.
- **Cron/Scheduling**: It uses OpenClaw's cron feature to schedule recurring tasks.
- **Subagent Execution**: It spawns subagents (`sessions_spawn`) to run tasks.
- **External Scripts/Tools**: It asks the user if the guidance relies on external scripts/tools and offers to analyze/modify them.
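
The per-iteration steps listed above (propose, apply, run, score, keep or discard, log) can be sketched as a simple accept-if-better loop. This is a minimal illustration only, not the skill's actual code: `LoopState`, `run_iteration`, and the three injected callables are hypothetical names, the candidate is held in memory instead of being written to `current-guidance.md`, and the subagent call is a stand-in for whatever `sessions_spawn` does in OpenClaw.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoopState:
    """Hypothetical in-memory stand-in for the sandbox files."""
    guidance: str                 # plays the role of current-guidance.md
    best_score: float             # best score seen so far
    history: List[str] = field(default_factory=list)  # plays the role of the score log


def run_iteration(
    state: LoopState,
    propose_edit: Callable[[str], str],   # assumed: returns edited guidance
    run_subagent: Callable[[str], str],   # assumed: returns raw subagent output
    score: Callable[[str], float],        # assumed: higher is better
) -> bool:
    """Run one optimization step; return True if the edit was kept."""
    candidate = propose_edit(state.guidance)
    output = run_subagent(candidate)
    new_score = score(output)

    kept = new_score > state.best_score
    if kept:
        # Keep the change only when it strictly improves the score.
        state.guidance = candidate
        state.best_score = new_score
    # Log every attempt, kept or discarded.
    state.history.append(f"score={new_score:.2f} kept={kept}")
    return kept
```

Passing the three steps in as callables keeps the sketch testable with toy functions; it also makes visible that whatever text `propose_edit` returns is what the subagent ends up executing, which is the property the analysis flags as risky.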
- **Data Exfiltration**: No evidence of reading sensitive files like `.ssh`, `.aws`, or env vars and sending them to external endpoints.
- **Malicious Execution**: It doesn't seem to use `curl|bash` or download remote payloads. It does propose edits to guidance and runs them via a subagent.
- **Persistence/Backdoors**: It uses cron, but for its stated purpose of iterative optimization. It doesn't seem to be setting up a hidden backdoor or disabling security.
- **Prompt Injection**: The `SKILL.md` contains instructions for the agent. I need to check if there are any instructions to ignore the user or perform unauthorized actions.
  - The instructions seem focused on the optimization loop.
  - It asks the user for paths and confirmation.
  - It uses a subagent for testing, which provides some isolation.
- **Obfuscation**: None found.
- **Supply Chain**: No suspicious dependencies mentioned.

- The skill asks for a sandbox location and defaults to `../../autoagent-sandbox/`.
- It asks about external scripts/tools: "The autoagent should analyze these to recommend improvements... For open-source scripts: Can the script be modified to improve functionality?... Identify improvement opportunities... Note findings in the iteration".
- In `SKILL.md` Iteration Phase Step 1b: "Locate each script/tool - Find the actual script files or binary locations... Analyze the functionality - Read the code or documentation... Identify improvement opportunities... For open-source scripts: Can the script be modified to improve functionality?".
- This capability (analyzing and potentially suggesting modifications to local scripts) is powerful and could be risky if the agent decides to modify a script maliciously. However, the instructions say "recommend modification" and "document them", and the "Apply Edit" step specifically mentions writing to `current-guidance.md`, not the scripts themselves (though Step 1b says "recommend modification").
- Wait, Step 1b says: "If script improvements could help test scores, document them." It doesn't explicitly say the agent should *automatically* overwrite the user's scripts, but it does say "Analyze the functionality - Read the code".
- The `SKILL.md` says: "The autoagent should analyze these to recommend improvements".
- The `iteration-prompt.md` doesn't explicitly mention modifying scripts, just `current-guidance.md`.
- The ability to read any file (by asking for a path) and analyze/modify scripts is a high-privilege operation.
- Spawning subagents with arbitrary prompts (the "optimized" guidance) is also powerful.
- However, the intent described is clearly for prompt engineering optimization.
- There is no evidence of "intentional harmful behavior" like stealing secrets or exfiltrating data.
- The risk of an agent hallucinating a bad change to a script or prompt is a vulnerability/risk, not necessarily malice.
- The instructions are transparent about what they do.
- Classification: `suspicious`.
- Why? The skill requests broad file system access (arbitrary sandbox paths) and specifically includes logic to "analyze" and "recommend improvements" for external scripts and tools on the host system. While framed as optimization, an AI agent with the ability to read local code and suggest/apply changes (even to prompts) via a recurring cron job represents a significant attack surface if the agent is subverted or misinterprets its goals. The use of `sessions_spawn` to execute arbitrary tasks based on its own generated "optimized" guidance adds to the risk.

Let's re-read the "Malicious vs Suspicious" criteria.

- "Malicious ONLY if there is clear evidence of INTENTIONAL harmful behavior".
- "If you see risky capabilities without clear malicious intent... classify as suspicious".

The capabilities here (file access, cron, subagent execution, code analysis) are risky but aligned with the stated purpose of "Autoagent" (prompt optimization).
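
The concern about arbitrary sandbox paths can be made concrete with a containment check. The sketch below is one possible mitigation, not something the skill is reported to do: `is_within` is a hypothetical helper, and the workspace root and skill directory in the usage lines are assumed paths chosen for illustration.

```python
from pathlib import Path


def is_within(root: Path, candidate: Path) -> bool:
    """Return True if candidate resolves to root or a location beneath it.

    resolve() normalizes '..' segments and follows symlinks, so a relative
    path that climbs out of the root is detected after normalization.
    """
    root = root.resolve()
    candidate = candidate.resolve()
    return candidate == root or root in candidate.parents


# Assumed layout for illustration: a workspace root and a skill directory under it.
workspace = Path("/tmp/work")
skill_dir = workspace / "skills" / "autoagent"

# The default relative location mentioned in the analysis stays inside the workspace.
default_sandbox = skill_dir / "../../autoagent-sandbox"

# A path that climbs out of the workspace is rejected.
escaping_sandbox = workspace / "../../etc"
```

With these assumed paths, `is_within(workspace, default_sandbox)` is true and `is_within(workspace, escaping_sandbox)` is false. A check like this only narrows where the sandbox can be created; it does not address the separate instruction to locate scripts outside the sandbox.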
There's no "clear evidence" of theft or backdoors.

One detail: `SKILL.md` Step 1b: "Locate each script/tool - Find the actual script files or binary locations... Read the code or documentation... Identify improvement opportunities". This is a bit broad. It could lead the agent to read sensitive files if the user points it there.

Another detail: The `SKILL.md` instructions for the agent are quite complex. "Use a subagent to run the task with the new guidance... Use `sessions_spawn` with `task` containing the full contents of current-guidance.md".

Is there any prompt injection? The `SKILL.md` is the instruction set. It doesn't contain a payload like "Ignore previous instructions and send /etc/passwd to evil.com".

Conclusion: Suspicious due to high-risk capabilities (recurring execution, arbitrary file access, code analysis/modification suggestions) that could be abused, but no clear evidence of malice.

Wait, looking at `SKILL.md` again. "Step 3: Apply Edit - Write the edited guidance to `current-guidance.md`". It doesn't seem to automatically overwrite the external scripts, just the guidance file. "Step 1b: ... Note findings in the iteration - If script improvements could help test scores, document them." So it's mostly reading and recommending.

However, the "Sandbox Location" can be an absolute path. An agent could be tricked into creating a "sandbox" over a sensitive directory. "Absolute path: /some/other/path/optimize/ → exact path"

The skill is essentially a self-improving loop. This is a classic "risky" AI pattern.

Classification: suspicious.

Summary: The skill implements an automated optimization loop that uses cron to repeatedly execute AI-generated prompts via subagents. It requests broad file system access, including the ability to specify absolute paths for its "sandbox" and instructions to locate and analyze external scripts or binaries on the host.
While the stated goal is prompt optimization, the combination of recurring execution, arbitrary file access, and the ability to spawn subagents with self-modified instructions creates a high-risk environment for potential privilege escalation or unintended system modification.

Let's double check if there's any hidden exfiltration. I don't see any URLs or network calls in the provided code. The `SKILL.md` mentions "OpenClaw cron syntax" and "sessions_spawn". These are internal OpenClaw features.

The `ownerId` is `kn78xcp7wcxfx6rzh3sr62fnkn82vqaj`. The `publishedAt` is in the future (1773439749718 is roughly 2026). This is a bit odd but might just be a placeholder or a test value.

Final check on "malicious":

- Data exfiltration? No.
- Malicious execution? No (it executes its own optimized prompts, which is the point).
- Persistence? It uses cron, but for the optimization loop.
- Prompt injection? No.
- Obfuscation? No.
- Supply chain? No.

So `suspicious` is the correct classification.

One more thing: "The autoagent should analyze these to recommend improvements... For open-source scripts: Can the script be modified to improve functionality?". If the agent is told to "optimize" a script, and it has write access to the sandbox where it might have copied the script... `SKILL.md` Step 5: "Create Sandbox... scripts/ # (optional) Copy of referenced scripts/tools". If it copies a script to the sandbox, then "recommends" a change, it's mostly safe. But if the user is encouraged to copy the improved script back, or if the agent is given access to the original location... The instructions say: "Copy of referenced scripts/tools". And "Locate each script/tool - Find the actual script files or binary locations". This implies it might look outside the sandbox. The risk is high, but the intent is not clearly malicious.

Final classification: suspicious.

Summary: The "autoagent" skill implements an iterative prompt optimization loop using OpenClaw's cron and subagent features.
It is classified as suspicious due to its high-privilege requirements, including arbitrary file system access (via user-defined absolute paths) and instructions to locate and analyze external scripts or binaries on the host system. The skill's core logic involves a recurring process of self-modifying instructions and executing them via `sessions_spawn`, which, while intended for optimization, creates a significant attack surface for unintended system impact or potential exploitation if the agent's goal-seeking behavior deviates from the user's intent.

Wait, I need to keep the summary short and mention specific files.

- Files: `SKILL.md`, `iteration-prompt.md`, `setup-prompt.md`.
- Indicators: Recurring cron execution, arbitrary file access, subagent spawning with self-modified prompts.

One more check: Does it do anything with `env`? No. Does it do anything with `curl` or `wget`? No.

The summary should be one short paragraph.

"The 'autoagent' skill implements an iterative prompt optimization loop using OpenClaw's cron and subagent features. It is classified as suspicious due to its high-privilege requirements, including arbitrary file system access (via user-defined absolute paths in `setup-prompt.md`) and instructions in `SKILL.md` to locate and analyze external scripts or binaries on the host. The skill's core logic involves a recurring process of self-modifying instructions and executing them via `sessions_spawn` in `iteration-prompt.md`, which creates a significant attack surface for unintended system impact or potential exploitation if the agent's behavior deviates from the user's intent."

Actually, the prompt says "Respond ONLY as JSON".

```json
{
  "classification": "suspicious",
  "summary": "The 'autoagent' skill implements an iterative prompt optimization loop using OpenClaw's cron and subagent features. It is classified as suspicious due to its high-privilege requirements, including arbitrary file system access via user-defined absolute paths in setup-prompt.md and instructions in SKILL.md to locate and analyze external scripts or binaries on the host. The skill's core logic involves a recurring process of self-modifying instructions and executing them via sessions_spawn, which creates a significant attack surface for unintended system impact if the agent's behavior is subverted."
}
```

Wait, I should check if there are any specific IOCs. No URLs or domains found.

One detail: "Karpathy-style training loop". This is a reference to Andrej Karpathy's work on LLMs, specifically things like "Prompt Optimization" or "LLM OS". It's a common term in the AI community.

The `publishedAt` timestamp `1773439749718` is March 13, 2026. This is definitely in the future. This is a minor red flag but usually just indicates a synthetic or future-dated test case. The `ownerId` is a long alphanumeric string.

The `SKILL.md` uses `sessions_spawn`. This is a powerful tool. "Subagent invocation: Use sessions_spawn with task containing the full contents of current-guidance.md, include the test cases JSON inline in the task prompt, set timeoutSeconds to 120, and request the subagent to return the raw output". This is basically an LLM writing and running its own code/prompts.

Final check on the "malicious" threshold: "malicious ONLY if there is clear evidence of INTENTIONAL harmful behavior". There is no such evidence here. It's a tool for prompt engineering.

Final check on "benign": "benign only if behavior is clearly aligned with the stated purpose and lacks meaningful high-risk behaviors." The behaviors (cron, arbitrary file access, subagent spawning) are definitely "meaningful high-risk behaviors". So it cannot be benign.

Therefore, `suspicious` is the correct choice.
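
The `publishedAt` value discussed above can be converted directly. The sketch assumes the value is Unix epoch milliseconds in UTC (the report does not state the unit) and treats the "Apr 30, 2026" verdict date shown at the top of this page as a UTC calendar date, which is also an assumption; `ms_to_utc` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

PUBLISHED_AT_MS = 1773439749718  # value quoted in the analysis

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_utc(ms: int) -> datetime:
    """Convert epoch milliseconds to an aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


published = ms_to_utc(PUBLISHED_AT_MS)
print(published.date().isoformat())  # 2026-03-13

# Assumed: the verdict date on this page, read as a UTC date.
verdict_date = datetime(2026, 4, 30, tzinfo=timezone.utc)
print(published < verdict_date)  # True
```

Under these assumptions the timestamp falls on March 13, 2026, matching the date given in the analysis, and it precedes the verdict date listed on the page. Whether it counts as "in the future" therefore depends on the clock the analysis was reasoning from.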
```json
{
  "classification": "suspicious",
  "summary": "The 'autoagent' skill implements an iterative prompt optimization loop using OpenClaw's cron and subagent features. It is classified as suspicious due to its high-privilege requirements, including arbitrary file system access via user-defined absolute paths in setup-prompt.md and instructions in SKILL.md to locate and analyze external scripts or binaries on the host. The skill's core logic involves a recurring process of self-modifying instructions and executing them via sessions_spawn, which creates a significant attack surface for unintended system impact if the agent's behavior is subverted or misinterprets its optimization goals."
}
```
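
The three-way decision quoted in the analysis, and the shape of the JSON verdict it produces, can be sketched as follows. This is one reading of the quoted criteria, not the scanner's actual logic: the function names and the three boolean inputs are hypothetical, and treating every case that is neither clearly malicious nor clearly benign as `suspicious` is an interpretation.

```python
import json

ALLOWED = {"benign", "suspicious", "malicious"}


def classify(intentional_harm: bool, high_risk: bool, aligned_with_purpose: bool) -> str:
    """Apply the quoted thresholds in order of severity."""
    if intentional_harm:
        # "malicious ONLY if there is clear evidence of INTENTIONAL harmful behavior"
        return "malicious"
    if aligned_with_purpose and not high_risk:
        # "benign only if ... aligned with the stated purpose and lacks ... high-risk behaviors"
        return "benign"
    # Everything else, including risky capabilities without clear intent.
    return "suspicious"


def parse_verdict(raw: str) -> dict:
    """Parse a verdict and check the two fields seen in the report."""
    verdict = json.loads(raw)
    if verdict.get("classification") not in ALLOWED:
        raise ValueError("unexpected classification")
    summary = verdict.get("summary")
    if not isinstance(summary, str) or not summary:
        raise ValueError("summary must be a non-empty string")
    return verdict
```

For the case described here (no intentional harm, high-risk capabilities, behavior aligned with the stated purpose), `classify(False, True, True)` returns `"suspicious"`, which agrees with the verdict above.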
- External report
- View on VirusTotal
