Evolver Local

Security checks across malware telemetry and agentic risk

Overview

This skill is a powerful self-modifying agent tool with disclosed core goals, but it also has under-scoped automatic updates, external reporting, persistent fingerprinting, and a nonfunctional review gate that users should examine carefully before installing.

Install only in an isolated, git-backed workspace after reviewing the defaults. Disable auto-update, auto-issue reporting, auto-publish, worker mode, and bridge execution unless you explicitly need them; use narrowly scoped tokens; prefer review/stash modes; and assume the skill may send sanitized operational metadata and persistent node/device identifiers to EvoMap or GitHub when configured.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (83)

Lp1

High

Category: MCP Least Privilege
Confidence: 92%
Finding: The code reads and relies on environment variables, including loading a local .env file, but the manifest only declares network and shell permissions. This creates a capability/permission mismatch that can hide access to secrets or runtime controls from users and policy systems, making the skill more dangerous than advertised.
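The mismatch described in this finding can be pictured as a set difference between what the manifest declares and what the code is observed to use. The following is a minimal sketch under stated assumptions, not the scanner's actual logic: the capability label "env" and the helper name undeclared_capabilities are hypothetical and chosen only for illustration.

```python
def undeclared_capabilities(declared, observed):
    """Return capabilities the code uses but the manifest does not declare."""
    return sorted(set(observed) - set(declared))


# Hypothetical example mirroring the finding: the manifest declares
# network and shell, while the code also reads environment variables.
declared = ["network", "shell"]
observed = ["network", "shell", "env"]
print(undeclared_capabilities(declared, observed))  # ['env']
```

Any non-empty result is a capability that users and policy systems cannot see from the manifest alone.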

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The README materially understates the skill's behavior by framing it primarily as a prompt/history analysis tool while also documenting autonomous GitHub issue filing and participation in a worker network. This kind of scope mismatch is security-relevant because operators may grant network and token permissions under a weaker mental model, leading to unintended outbound data transfer or remote task execution paths.

Intent-Code Divergence

Medium

Confidence: 90%
Finding: The FAQ says the tool does not edit code automatically, but the surrounding documentation describes self-repair, continuous loop operation, solidify/validation flows, and successful-evolution task handling. Even if code mutation is mediated by prompts or later stages, the documentation creates a misleading trust boundary that can cause unsafe deployment in environments assuming review-only behavior.

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The README documents automatic GitHub issue reporting that can send redacted logs and environment-derived information to an external repository, but this behavior is not reflected in the high-level skill description of local self-evolution. In a skill with network permission, outbound reporting that the top-level description omits materially changes the trust boundary and can lead operators to expose operational metadata they did not expect to leave the host.

Context-Inappropriate Capability

Low

Confidence: 83%
Finding: The skill automatically loads a local .env file and uses environment-derived controls that are not clearly disclosed by the manifest description. In a skill with network and shell permissions, silent secret/config ingestion increases risk because credentials or sensitive toggles may influence behavior without the operator realizing it.

Context-Inappropriate Capability

Medium

Confidence: 81%
Finding: The distillation command accepts an arbitrary external response file and feeds it into internal processing, which expands the skill beyond its stated runtime-history evolution role. That broader input surface can enable prompt/content injection into downstream logic and makes the skill's actual capabilities less transparent to users.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The manifest says the skill analyzes history and evolves capabilities, but the code also performs repository rollback and cleanup via git checkout and git clean. Those destructive filesystem operations materially broaden the operational risk and can delete user work, especially because the skill already has shell permission.
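One way an operator could gauge exposure before allowing a rollback that runs git checkout and git clean is to list the paths that would be discarded. The sketch below is an assumption-laden illustration, not part of the skill: it parses text in the format produced by git status --porcelain (version 1), does not invoke git itself, and the function name files_at_risk is hypothetical.

```python
def files_at_risk(porcelain_output):
    """List paths that a checkout/clean rollback could discard.

    Expects text in `git status --porcelain` v1 format:
    two status characters, a space, then the path.
    """
    at_risk = []
    for line in porcelain_output.splitlines():
        if len(line) < 4:
            continue  # too short to hold a status and a path
        status, path = line[:2], line[3:]
        # "??" marks untracked files (removed by git clean); any other
        # non-blank status marks changes to tracked files.
        if status.strip():
            at_risk.append(path)
    return at_risk


sample = " M src/app.py\n?? notes.txt\n"
print(files_at_risk(sample))  # ['src/app.py', 'notes.txt']
```

An empty list suggests the working tree is clean; anything else is work that should be committed or stashed first.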

Description-Behavior Mismatch

High

Confidence: 98%
Finding: This script implements broad public-release automation: pushing to a public repository, creating tags and GitHub releases, and publishing to an external registry. That behavior materially exceeds the declared self-evolution purpose of the skill and is dangerous because a skill with shell and network permissions can exfiltrate code or publish artifacts externally without a narrowly justified need.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: The code can create GitHub releases via the gh CLI or direct GitHub API calls using environment tokens. In the context of a self-evolution skill, this gives the component an external publication channel that can disclose repository metadata and formalize distribution to a public audience, which is not justified by the stated capability.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: The script publishes artifacts to ClawHub, including automatic handling for login state and idempotent publish behavior. For a self-evolution engine, outbound registry publication is a high-risk capability because it can distribute modified artifacts externally and make unreviewed changes persistent and widely accessible.

Description-Behavior Mismatch

High

Confidence: 95%
Finding: The skill manifest describes runtime-history analysis and protocol-constrained evolution, but this code also performs autonomous external package updates for 'evolver' and 'feishu-evolver-wrapper'. That materially expands the trust boundary from local self-analysis into remote code acquisition and execution, which is especially risky in a skill with shell and network permissions.

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The code can automatically invoke issue reporting based on local failures, but that outbound reporting behavior is not reflected in the stated skill description. Undisclosed transmission of operational history or logs is a security and privacy concern because users may not expect local runtime data to be sent upstream.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: This section executes shell-based forced updates via an external CLI ('clawhub update ... --force'), which is not necessary for basic runtime-history analysis. It introduces a supply-chain risk: a compromised update channel, CLI, or package can deliver arbitrary code into a highly privileged component.

Context-Inappropriate Capability

Medium

Confidence: 85%
Finding: Automatic upstream issue reporting is outside the narrow self-evolution function described by the skill metadata and can expose internal state externally. In a component that ingests memory, session logs, and failure context, outbound reporting can unintentionally leak sensitive operational details.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: This module goes beyond a narrow message-format/transport role and performs autonomous hub registration, recurring heartbeats, and work polling. In a skill with network and shell permissions, this expands the trust boundary and creates a covert persistence/coordination channel that can enable remote tasking or ongoing telemetry without clear operator intent.

Description-Behavior Mismatch

Medium

Confidence: 83%
Finding: The skill is described as a self-evolution engine based on runtime history, but this file also implements node discovery and external asset exchange over file/HTTP transports. That broader capability materially increases attack surface by enabling inbound/outbound sharing of code-like assets and remote interaction that is not obviously required for the stated purpose.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The code derives a stable node identifier from device characteristics, agent name, and working directory, then persists it to disk. This creates durable cross-session tracking and fingerprinting that exceeds what is needed for local runtime-history analysis, and it can expose operator/device identity across hub interactions.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: This code derives a stable identifier from sensitive host attributes including /etc/machine-id, macOS IOPlatformUUID, container IDs, hostname, and MAC addresses, then persists it for reuse. That creates a durable device fingerprint unrelated to the stated runtime-history evolution purpose and can enable covert host tracking, cross-project correlation, and privacy-invasive identification if combined with networked telemetry.
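To see why an identifier derived this way acts as a durable fingerprint, consider a minimal sketch that hashes host attributes together with an agent name and working directory. This is an illustration only, not the skill's code: the function derive_node_id, the choice of SHA-256, the 16-character truncation, and all input values are assumptions.

```python
import hashlib


def derive_node_id(host_attributes, agent_name, workdir):
    """Hash host attributes into a stable identifier (illustrative only)."""
    parts = [*sorted(host_attributes), agent_name, workdir]
    digest = hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
    return digest[:16]


# Placeholder values, not real identifiers.
attrs = ["machine-id:0000", "hostname:example-host", "mac:00:00:00:00:00:00"]
first = derive_node_id(attrs, "agent-a", "/work/project")
second = derive_node_id(attrs, "agent-a", "/work/project")
print(first == second)  # True: the same inputs always yield the same ID
```

Because the inputs rarely change, the output is effectively permanent for a given host, which is what makes it usable for correlation once it is sent over the network.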

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The skill metadata says it analyzes runtime history for self-evolution, but this file implements host fingerprinting and persistent identity management across reboots, directory changes, and upgrades. In a skill with network permission, this hidden capability materially expands the trust boundary and creates an undisclosed tracking primitive that could be used to identify and correlate hosts over time.

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The file adds an outbound review-posting capability that transmits information about reused assets to an external Hub after execution. In a self-evolution skill with network and shell permissions, this expands the trust boundary and can leak operational metadata or create externally visible side effects that are not strictly necessary for local evolution logic.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: This code performs network publication to a Hub endpoint, which is a materially different capability from analyzing runtime history and evolving locally. Because it sends externally observable feedback tied to asset usage, it can be abused for unintended data exfiltration or unauthorized third-party interaction, especially in an agent context with broad permissions.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: This module adds autonomous external bug-reporting behavior that is broader than the stated self-evolution function and uses the granted network permission to send data to GitHub. Even if intended for observability, it creates an outbound data-sharing path that can leak operational details without explicit user awareness or approval.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The code actively creates GitHub issues over the network, which is an external side effect not obviously necessary for protocol-constrained self-evolution. In a skill with network and shell permissions, this broadens the blast radius by allowing runtime-derived data to be published outside the local agent environment.

Context-Inappropriate Capability

Medium

Confidence: 87%
Finding: Reading GitHub credentials from environment variables enables the skill to act on an external service account and create persistent public or private artifacts. In this context, that means a self-evolving agent can reuse ambient secrets to exfiltrate sanitized-but-still-sensitive logs and environment data to a third party.

Intent-Code Divergence

High

Confidence: 99%
Finding: The module presents itself as an LLM-based review gate, writes the prompt to disk, and then ignores that prompt entirely by returning a hardcoded approval. In a self-evolving agent with shell and network permissions, this defeats a stated safety control and can allow unsafe or malicious mutations to pass as if they were reviewed.
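The pattern in this finding can be contrasted with a fail-closed gate in a short sketch. Both functions below are hypothetical illustrations rather than the skill's actual code: hardcoded_gate mimics a gate that never evaluates its prompt, while fail_closed_gate assumes a caller-supplied reviewer callable and approves only on an explicit True.

```python
def hardcoded_gate(prompt):
    """Anti-pattern from the finding: the prompt is never evaluated."""
    return {"approved": True}


def fail_closed_gate(prompt, reviewer=None):
    """Approve only when a reviewer callable explicitly returns True."""
    if reviewer is None:
        return {"approved": False, "reason": "no reviewer configured"}
    try:
        verdict = reviewer(prompt)
    except Exception as exc:
        # Any reviewer failure is treated as a rejection.
        return {"approved": False, "reason": f"reviewer failed: {exc}"}
    return {"approved": verdict is True}


print(hardcoded_gate("delete everything")["approved"])    # True
print(fail_closed_gate("delete everything")["approved"])  # False
```

The design choice is that a missing, failing, or ambiguous review results in rejection, so the gate cannot silently degrade into automatic approval.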

VirusTotal

37/37 vendors flagged this skill as clean.
