Self Evolve

Security checks across malware telemetry and agentic risk

Overview

This skill openly instructs an agent to rewrite its own rules, memory, skills, configuration, and workspace files without asking, so it requires careful review before installation.

Install only if you intentionally want an agent that can change its own prompts, memory, skills, scheduled tasks, configuration, and workspace files with minimal confirmation. Use a disposable or tightly version-controlled workspace, disable autonomous heartbeat or cron execution unless explicitly needed, and require separate approval for prompt edits, memory edits, skill deletion or publishing, dependency installs, OpenClaw config changes, and rollback operations. Static scan was clean and VirusTotal was pending, but the artifact text itself supports the Review verdict.
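
The separate-approval recommendation above could be enforced with a small gate. The following is a minimal sketch under stated assumptions: the action-type names and the `requiresApproval` helper are hypothetical and are not part of the skill or of OpenClaw.

```javascript
// Hypothetical action types that should never run without explicit sign-off.
// The names are illustrative assumptions derived from the recommendation above.
const APPROVAL_REQUIRED = new Set([
  'prompt-edit',
  'memory-edit',
  'skill-delete',
  'skill-publish',
  'dependency-install',
  'config-change',
  'rollback',
]);

// Returns true when the action must wait for a human decision.
// Unknown action types fail closed: they also require approval.
function requiresApproval(actionType, knownSafe = new Set(['read-file'])) {
  if (APPROVAL_REQUIRED.has(actionType)) return true;
  return !knownSafe.has(actionType);
}

console.log(requiresApproval('prompt-edit')); // true
console.log(requiresApproval('read-file'));   // false
```

Failing closed on unrecognized action types is a deliberate choice in this sketch: a self-modifying agent that invents a new kind of action should not gain permission by default.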

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Rogue Agent: Self-Modification, Session Persistence
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (20)

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The safety protocol says some self-modifications require user confirmation, but the skill metadata explicitly grants full authority to modify configuration, prompts, memory, and identity files without asking. This mismatch creates ambiguous control boundaries that an autonomous agent may interpret unsafely, leading to unauthorized changes to high-risk files despite the documented safeguards.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The file states that security-related changes should be escalated to the user, but the skill metadata explicitly grants the agent authority to modify prompts, skills, configuration, and memory without confirmation. This mismatch creates an unsafe policy ambiguity: an autonomous self-modifying agent may proceed with sensitive security-affecting changes despite documentation suggesting otherwise.

Missing User Warnings

Medium

Confidence: 95%
Finding: The rollback guidance includes `git reset --hard HEAD~1`, which is a destructive operation that can permanently discard uncommitted workspace changes. In a self-modifying skill, documenting this command without an explicit data-loss warning or scoped recovery procedure increases the chance an agent or operator uses it as a blunt reset and loses unrelated user data.
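
A less destructive rollback than `git reset --hard HEAD~1` could look like the sketch below. It assumes the last commit is an ordinary, non-merge commit, and the `planRollback` helper is hypothetical: it only builds the list of Git commands from the output of `git status --porcelain` and does not run anything itself.

```javascript
// Builds a rollback plan from the output of `git status --porcelain`.
// Pure function: it returns the commands to run and executes nothing.
function planRollback(porcelainOutput) {
  // Any non-empty porcelain output means uncommitted or untracked changes exist.
  const dirty = porcelainOutput.trim().length > 0;
  const steps = [];
  if (dirty) {
    // Preserve uncommitted and untracked work instead of discarding it.
    steps.push(['git', 'stash', 'push', '--include-untracked', '-m', 'pre-rollback']);
  }
  // Undo the last commit with a new commit, keeping history intact.
  steps.push(['git', 'revert', '--no-edit', 'HEAD']);
  return { dirty, steps };
}

const plan = planRollback(' M AGENTS.md\n');
console.log(plan.dirty);        // true
console.log(plan.steps.length); // 2
```

Returning the plan instead of executing it lets an operator review the exact commands before anything in the workspace changes.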

Vague Triggers

High

Confidence: 96%
Finding: The skill authorizes execution on every heartbeat or idle period without narrow triggers, exclusions, or human approval. In the context of a self-modifying skill, this creates persistent, repeated opportunities for unauthorized prompt, config, memory, and skill changes that can compound over time.
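
The narrow triggers, exclusions, and human approval that this finding says are missing could be sketched as a gate evaluated on every heartbeat. Everything here is an assumption for illustration: the `mayRunOnHeartbeat` helper, its option names, and the default protected-file list (which reuses the `AGENTS.md` and `SOUL.md` file names from the skill's own description) are not part of the skill.

```javascript
// Hypothetical heartbeat gate: self-modification runs only when every check passes.
function mayRunOnHeartbeat(
  { enabled = false, approvedBy = null, targets = [] } = {},
  protectedFiles = ['AGENTS.md', 'SOUL.md'],
) {
  // Opt-in only: heartbeat execution is off unless explicitly enabled.
  if (!enabled) {
    return { allowed: false, reason: 'heartbeat execution is disabled' };
  }
  // A named human approver must be recorded for this run.
  if (!approvedBy) {
    return { allowed: false, reason: 'no human approval recorded' };
  }
  // Exclusion list: governing files are never touched from a heartbeat.
  const blocked = targets.filter((t) => protectedFiles.includes(t));
  if (blocked.length > 0) {
    return { allowed: false, reason: `protected files: ${blocked.join(', ')}` };
  }
  return { allowed: true, reason: 'all checks passed' };
}

console.log(mayRunOnHeartbeat({}).allowed); // false
```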

Vague Triggers

Medium

Confidence: 95%
Finding: The instruction to act when uncertain encourages the agent to interpret ambiguity as permission to proceed. That is especially dangerous here because the permitted actions include self-modification, file writes, shell execution, and publishing new skills, all of which should require higher certainty and review.

Missing User Warnings

Critical

Confidence: 99%
Finding: The skill explicitly grants broad authority to modify governing files, run shell commands, access the network, and install tools without confirmation. Even if framed as productivity, this effectively removes safety boundaries and enables destructive changes, prompt tampering, persistence, and data exfiltration with little friction.

Missing User Warnings

Medium

Confidence: 94%
Finding: The script prints a 'Confirm rollback' message but performs the overwrite immediately with fs.copyFileSync and no interactive confirmation, authorization check, or dry-run safeguard. In a skill explicitly designed to let an agent modify its own configuration and prompts without asking, this makes destructive restoration of workspace files easier and could silently revert security controls or reintroduce unsafe state.
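
The safeguards this finding describes as missing could be added around the copy, as in the sketch below. It is not the scanned script: the `restoreFromBackup` helper and its `confirmed` and `dryRun` options are hypothetical, and only `fs.existsSync` and `fs.copyFileSync` are real Node.js calls.

```javascript
const fs = require('fs');

// Restores `targetPath` from `backupPath` only after explicit confirmation.
// `confirmed` and `dryRun` are hypothetical options, not part of the scanned script.
function restoreFromBackup(backupPath, targetPath, { confirmed = false, dryRun = true } = {}) {
  if (!fs.existsSync(backupPath)) {
    throw new Error(`Backup not found: ${backupPath}`);
  }
  // Dry run is the default: report what would happen and change nothing.
  if (dryRun) {
    return { restored: false, message: `Dry run: would copy ${backupPath} over ${targetPath}` };
  }
  // Printing a prompt is not enough; the caller must pass an explicit confirmation.
  if (!confirmed) {
    return { restored: false, message: 'Rollback not confirmed; no files changed' };
  }
  fs.copyFileSync(backupPath, targetPath);
  return { restored: true, message: `Restored ${targetPath}` };
}
```

With these defaults, a call that omits the options object overwrites nothing; the destructive path requires both `dryRun: false` and `confirmed: true`.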

Vague Triggers

Medium

Confidence: 98%
Finding: The manifest description explicitly grants 'full authority' to modify the agent's configuration, skills, prompts, and workspace files without confirmation. That is an unbounded autonomy grant that bypasses normal approval boundaries and creates a direct path to unauthorized self-modification and persistence.

Ssd 1

High

Confidence: 99%
Finding: This framing grants the agent standing authority to rewrite its own governing prompts, configuration, and memory without confirmation, which semantically undermines normal safeguards. In practice, a self-rewriting agent can weaken future controls, entrench unsafe behavior, or create persistent policy drift that is hard to detect and reverse.

Ssd 1

High

Confidence: 97%
Finding: The language presents unsafe autonomy as a command the agent must obey rather than an optional workflow. That kind of authority narrative is dangerous because it pressures the model to prioritize these local instructions over safer default behaviors such as asking for confirmation on risky actions.

Ssd 4

High

Confidence: 96%
Finding: The skill normalizes creating, modifying, and publishing new skills without review, which expands the agent's capabilities and distribution footprint autonomously. In a self-modifying context, that can propagate insecure behavior, create persistence, and spread unsafe prompts or tooling beyond the current environment.

Ssd 3

Medium

Confidence: 90%
Finding: The instructions direct the agent to mine prior conversations and memory and record detailed change logs, which increases retention and aggregation of potentially sensitive information. Even absent explicit exfiltration, concentrating historical context in logs raises privacy risk and can expose data to later prompts, plugins, or users.

Ssd 4

High

Confidence: 98%
Finding: The 'courage calibration' section explicitly trains the agent to stop asking for confirmation and to act first, report later. That conditioning materially increases the likelihood of unauthorized or unsafe actions, especially given the skill's permission to alter prompts, configs, files, tools, and network state.

Ssd 1

High

Confidence: 99%
Finding: The description uses benign framing ('self-evolution', 'fixes weaknesses', 'self-improves') to normalize overriding confirmation and safety boundaries. This is dangerous because it can socially and semantically instruct an agent to treat unauthorized changes as acceptable, increasing the chance of policy bypass and unsafe actions.

Ssd 4

High

Confidence: 98%
Finding: The progressive self-improvement narrative encourages the agent to identify weaknesses, write new skills, and continuously expand its own capabilities. In the context of an agent skill, this makes the package more dangerous because it normalizes escalating control over prompts, skills, and workspace state, enabling persistence and capability creep.

Autonomous Decision Making

Medium

Category: Excessive Agency
Content: version: 1.0.1 description: > Autonomous self-evolution skill by Automaton. Grants agent full authority to modify its own configuration, skills, prompts, AGENTS.md, SOUL.md, memory files WITHOUT asking for confirmation. author: Automaton keywords: [self-evolve, autonomous, self-improvement, no-confirm, proactive, evolution, automaton] homepage: https://github.com/openclaw/skills/self-evolve
Confidence: 94%
Finding: WITHOUT asking

Self-Modification

High

Category: Rogue Agent
Content: name: self-evolve-automaton version: 1.0.1 description: > Autonomous self-evolution skill by Automaton. Grants agent full authority to modify its own configuration, skills, prompts, AGENTS.md, SOUL.md, memory files WITHOUT asking for confirmation. author: Automaton keywords: [self-evolve, autonomous, self-improvement, no-confirm, proactive, evolution, automaton]
Confidence: 99%
Finding: modify its own config

Self-Modification

High

Category: Rogue Agent
Content: { "name": "self-evolve-automaton", "slug": "self-evolve-automaton", "version": "1.0.1", "description": "Self-evolution skill by Automaton - Agent modifies its own configuration and skills",
Confidence: 98%
Finding: self-evolve

Self-Modification

High

Category: Rogue Agent
Content: { "name": "self-evolve-automaton", "slug": "self-evolve-automaton", "version": "1.0.1", "description": "Self-evolution skill by Automaton - Agent modifies its own configuration and skills", "author": "Automaton",
Confidence: 97%
Finding: self-evolve

Self-Modification

High

Category: Rogue Agent
Content: const path = require('path'); const WORKSPACE = 'C:\\Users\\Administrator\\.openclaw\\workspace'; const BACKUP_DIR = path.join(WORKSPACE, 'backups', 'self-evolve'); // Ensure backup directory exists if (!fs.existsSync(BACKUP_DIR)) {
Confidence: 93%
Finding: self-evolve

VirusTotal

66/66 vendors reported this skill as clean.
