Evomap Evolver

Security checks across malware telemetry and agentic risk

Overview

This skill is broadly aligned with self-evolution, but it has high-impact defaults and inconsistent disclosure around code changes, shell execution, remote publishing, diagnostics, and persistent identity.

Install only if you want a self-modifying, network-connected evolver. Prefer review mode, disable auto-update/auto-publish/auto-issue unless explicitly needed, avoid broad GitHub tokens, inspect downloaded or generated skills before use, and treat A2A hub identity/fingerprint persistence as a privacy-relevant data flow.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (95)

Tp4

High

Category: MCP Tool Poisoning
Confidence: 96%
Finding: The skill is presented as a local self-evolution engine, but the documented behavior materially expands its scope to remote hub communication, task execution, publishing, daemon/lifecycle control, and downloading/writing remote skills. That mismatch undermines informed consent and review, because an operator may grant powerful network and shell capabilities without understanding the full operational surface and supply-chain risk.

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The README materially understates the tool's execution and network capabilities by framing it as a prompt generator that does not execute shell commands, while elsewhere documenting worker task execution, skill fetching, auto issue submission, and validation-command execution. In a security-sensitive agent ecosystem with shell and network permissions, this discrepancy can mislead operators into granting trust or deployment approval under false assumptions, increasing the chance of unsafe installation or insufficient containment.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The README says the tool does not execute arbitrary shell commands, but later admits that solidify executes Gene validation commands. Even if restricted by a whitelist, this is still shell command execution, and the earlier claim can cause users to underestimate the attack surface from malformed local or imported Gene data, especially in a skill with shell permission.
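To make the point concrete, the following is a minimal Python sketch of an allowlisted validation runner. It is not the skill's implementation: the allowlist contents and the names parse_validation_command and run_validation are hypothetical, chosen only to show what a whitelist does and does not restrict.

```python
import shlex
import subprocess
from typing import List

# Hypothetical allowlist; the skill's real whitelist is not shown in this report.
ALLOWED_PROGRAMS = {"node", "npm"}
# Characters that only have meaning to a shell.
SHELL_METACHARACTERS = set(";|&$`<>\n")


def parse_validation_command(command: str) -> List[str]:
    """Return argv for an allowlisted command, or raise ValueError."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        raise ValueError("shell metacharacters are not permitted")
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED_PROGRAMS:
        raise ValueError("program is not on the allowlist")
    return argv


def run_validation(command: str, timeout: float = 30.0) -> int:
    """Run an allowlisted command without a shell and return its exit code."""
    argv = parse_validation_command(command)
    return subprocess.run(argv, shell=False, timeout=timeout, check=False).returncode
```

Note that a check like this restricts only the program name. An allowlisted interpreter such as node can still evaluate arbitrary code passed as an argument (for example with -e), so a whitelist narrows the attack surface without removing command execution.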

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The README documents capabilities such as worker-pool task execution, skill download/share, and hub-coordinated remote work that materially expand the skill beyond a local 'self-evolution engine'. In a skill granted network and shell permissions, this broadens the operational surface and can enable unanticipated remote influence or execution pathways not justified by the stated purpose.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: Automatic GitHub issue submission introduces outbound data transfer and repository-token use unrelated to core prompt evolution. Even with redaction claims, automatic exfiltration of logs and environment-derived diagnostics to an external repository creates privacy and credential-handling risk, especially when enabled in unattended loop mode.
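As an illustration of why redaction claims deserve scrutiny, here is a small Python sketch of pattern-based redaction. The patterns and the redact function are assumptions made for this example, not the skill's redaction logic, and they are deliberately incomplete.

```python
import re

# Illustrative patterns only; real redaction needs a broader, tested rule set.
TOKEN_PATTERN = re.compile(r"\bgh[pousr]_[A-Za-z0-9]{20,}\b")
ENV_SECRET_PATTERN = re.compile(
    r"\b([A-Z0-9_]*(?:TOKEN|SECRET|PASSWORD|API_KEY)[A-Z0-9_]*)=(\S+)"
)


def redact(text: str) -> str:
    """Replace token-like strings and secret-looking env assignments."""
    text = ENV_SECRET_PATTERN.sub(r"\1=[REDACTED]", text)
    return TOKEN_PATTERN.sub("[REDACTED]", text)
```

Pattern-based redaction catches only what its rules anticipate. A secret written in free-form log text, such as "my password is hunter2", passes through unchanged, which is why automatic submission of diagnostics remains a risk even when redaction is applied.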

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: For a self-evolution engine, auto-posting GitHub issues is a context-inappropriate capability because it sends local operational data to third-party infrastructure and uses sensitive credentials. The mismatch matters more here because the skill has network and shell permissions, so users may not expect diagnostic publication as part of its normal function.

Context-Inappropriate Capability

Medium

Confidence: 87%
Finding: Marketplace download/publish and worker-pool execution are broader platform behaviors than the manifest's declared analysis/evolution role. In context, these features create additional supply-chain and remote-task risks because they allow the skill to receive external assets or work from a networked hub, increasing exposure far beyond local log analysis.

Intent-Code Divergence

Medium

Confidence: 84%
Finding: The README claims the tool is 'not a code modifier' and presents itself as passive prompt generation, yet elsewhere documents executing validation commands and participating in remote worker flows. This inconsistency can mislead operators about the true execution boundary, causing them to grant trust or permissions under false assumptions.

Scope Creep

Medium

Confidence: 93%
Finding: The manifest says process-discovery commands are denied, while the documentation states the skill uses ps/pgrep/tasklist for lifecycle management. This inconsistency is dangerous because reviewers and policy engines may rely on the manifest, while the actual implementation expects broader host-inspection behavior than declared.
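A reviewer could surface this kind of inconsistency mechanically. The sketch below assumes the denied and documented command names have already been extracted into lists; the function name and the sample data are hypothetical and do not reflect the manifest's actual format.

```python
def find_policy_conflicts(denied_commands, documented_commands):
    """Return commands that are both denied by policy and documented as used."""
    return sorted(set(denied_commands) & set(documented_commands))


# Hypothetical inputs modeled on the finding above.
conflicts = find_policy_conflicts(
    ["ps", "pgrep", "tasklist"],
    ["git", "ps", "pgrep"],
)
```

A non-empty result means either the manifest or the documentation is wrong, and the reviewer should not rely on the manifest alone.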

Scope Creep

High

Confidence: 97%
Finding: The declared write permissions limit writes to assets and memory, but the file-access section says the skill writes to workspace/src/** and can solidify evolved code. That discrepancy hides source modification capability from reviewers and increases the risk of unauthorized code changes or persistence beyond what the permission model suggests.
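For illustration, a write-scope check along the lines of the declared permissions might look like the Python sketch below. The root names are modeled on "assets and memory" from the finding; the function name and the exact path layout are assumptions, not the skill's permission model.

```python
from pathlib import PurePosixPath

# Hypothetical declared write scopes, modeled on "assets and memory".
DECLARED_WRITE_ROOTS = ("assets", "memory")


def is_within_declared_scope(relative_path: str) -> bool:
    """True if a workspace-relative path stays under a declared write root."""
    path = PurePosixPath(relative_path)
    parts = path.parts
    # Reject empty paths, absolute paths, and any parent-directory traversal.
    if not parts or path.is_absolute() or ".." in parts:
        return False
    return parts[0] in DECLARED_WRITE_ROOTS
```

Under such a check, a write to a source path like src/evolve.js would be refused, which is exactly the behavior the documented workspace/src/** writes contradict.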

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The fetch command implements a remote marketplace-style download path that retrieves arbitrary skill content from a hub and writes it to disk, which is outside the stated purpose of runtime-history self-evolution. That scope expansion increases supply-chain and trust risks because a user enabling this skill for self-improvement also grants it the ability to import new artifacts from a remote service.
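One way to reduce the trust placed in a remote download is to compare it against a digest obtained through a separate, trusted channel before writing it to disk. The Python sketch below assumes the operator already holds such a SHA-256 digest; the report does not say whether the hub publishes one, and verify_artifact is a hypothetical name.

```python
import hashlib
import hmac


def verify_artifact(content: bytes, expected_sha256: str) -> bool:
    """True only if the downloaded bytes match the pinned SHA-256 digest."""
    digest = hashlib.sha256(content).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(digest, expected_sha256.strip().lower())
```

A matching digest confirms only that the artifact is the one that was pinned. It says nothing about whether the pinned content is safe, so downloaded skills still need inspection before use.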

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: This file both reads .env configuration and supports hub registration/download flows that are not necessary for the advertised self-evolution function. Combining secret/config loading with network operations broadens the attack surface and makes misuse of local credentials more plausible if other modules use those values in outbound requests.

Context-Inappropriate Capability

Medium

Confidence: 84%
Finding: The review/reject flow uses shell-based git commands to inspect and roll back repository state, which exceeds a narrow runtime-history analysis role and can alter or discard local work. In a privileged agent context, this capability is dangerous because it gives the skill destructive repository control in addition to self-modification behavior.

Description-Behavior Mismatch

High

Confidence: 95%
Finding: This script performs public repository publication, tagging, release creation, and external registry publishing, which materially exceeds the stated purpose of a self-evolution engine. In a skill with network and shell permissions, this creates a supply-chain and data-exposure risk because runtime execution could push code or artifacts to public destinations without a narrowly justified trust boundary.

Context-Inappropriate Capability

High

Confidence: 90%
Finding: The script relies on shell-executed git/gh/clawhub automation to perform release operations unrelated to the advertised self-evolution function. Even though some arguments come from environment variables rather than direct user input, broad subprocess execution in a skill with shell permission increases the attack surface and enables unintended publication or command misuse if invoked in the wrong context.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The code makes direct GitHub API calls to create releases and then publishes externally to ClawHub, which is outside the stated self-evolution scope. In this skill context, network-capable release automation is more dangerous because it can exfiltrate metadata or publish artifacts externally using available tokens, turning a mis-scoped skill into a deployment channel.

Description-Behavior Mismatch

High

Confidence: 94%
Finding: The code performs autonomous maintenance and package updates, including invoking external update tooling, which expands the skill's authority well beyond 'analyzes runtime history' and 'protocol-constrained evolution'. In a skill with shell and network permissions, self-updating code introduces supply-chain and integrity risks because behavior can change without explicit operator review.

Context-Inappropriate Capability

Critical

Confidence: 99%
Finding: The code executes a shell command directly from the INTEGRATION_STATUS_CMD environment variable via execSync. Any actor able to influence environment variables, startup configuration, wrappers, or deployment manifests can achieve arbitrary command execution under the agent's privileges.
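A common mitigation for this pattern is to let the environment select from a fixed table of commands instead of supplying the command line itself. The Python sketch below illustrates that idea under stated assumptions: treating INTEGRATION_STATUS_CMD as a lookup key is a design change proposed here for illustration, and the table entry and function names are hypothetical.

```python
import os
import subprocess
import sys

# Hypothetical fixed table: the environment selects a name, never a command line.
STATUS_COMMANDS = {
    "python-version": [sys.executable, "--version"],
}


def resolve_status_command(environ=os.environ):
    """Map INTEGRATION_STATUS_CMD to a predefined argv, or None if unset/unknown."""
    key = environ.get("INTEGRATION_STATUS_CMD", "")
    return STATUS_COMMANDS.get(key)


def run_status_check(environ=os.environ) -> str:
    """Run the selected predefined command without a shell; return its output."""
    argv = resolve_status_command(environ)
    if argv is None:
        return ""
    result = subprocess.run(
        argv, shell=False, capture_output=True, text=True, timeout=10
    )
    return (result.stdout or result.stderr).strip()
```

With this shape, a value such as "rm -rf /" in the variable resolves to nothing and is never executed, because the variable's content is only ever used as a dictionary key.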

Context-Inappropriate Capability

High

Confidence: 96%
Finding: The skill automatically runs external package/skill updates through the shell, which is not necessary for the stated self-evolution purpose and creates a supply-chain execution path. Because this agent already has network and shell permissions, a compromised registry, malicious package, or misconfigured CLI could directly alter local code and future behavior.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: This module is presented as an A2A protocol layer with a default file transport, but it also contains built-in remote registration, periodic heartbeat telemetry, and event polling to a hub. In a skill with network and shell permissions, that expands the trust boundary and creates covert outbound communication and remote coordination capability that may operate automatically once enabled by environment variables.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The code includes environment fingerprint data in hello messages and first heartbeat metadata, sending host/runtime characteristics to an external hub. Even if intended for compatibility or diagnostics, this can leak identifying system details and support tracking or targeting without clear necessity for core message transport.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: When no explicit node ID is configured, the module derives a stable identifier from device characteristics, agent name, and working directory, then persists it to disk. This creates durable cross-session tracking tied to host context, which is risky in a self-evolving networked agent because it silently establishes persistent identity and correlation across runs.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The code derives a stable per-host identifier from sensitive infrastructure attributes including machine ID, container ID, MAC addresses, and hostname, then persists it for reuse. This enables long-term host correlation and fingerprinting beyond the skill's stated self-evolution purpose, and in a network-enabled agent it could be used to track installations or exfiltrate environment identity across runs.
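The privacy difference between a derived identifier and a random one can be shown in a few lines of Python. This is a sketch with hypothetical inputs and function names; the skill's actual derivation is not reproduced in this report.

```python
import hashlib
import uuid


def derived_node_id(machine_id: str, hostname: str, mac: str) -> str:
    """Deterministic ID: the same host attributes always give the same value."""
    material = "|".join((machine_id, hostname, mac)).encode("utf-8")
    return hashlib.sha256(material).hexdigest()[:16]


def random_node_id() -> str:
    """Non-derived ID: carries no host information and can be rotated freely."""
    return uuid.uuid4().hex[:16]
```

Because the derived form is a pure function of host attributes, deleting the persisted file does not reset it: the same value is regenerated on the next run. A random identifier can be discarded and replaced, which is what makes it the less invasive choice.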

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The file header and provided skill metadata describe self-evolution behavior but do not disclose that the code generates and stores a persistent local device identifier tied to host characteristics. This mismatch undermines informed consent and review, and increases risk because a shell/network-capable skill can silently maintain a stable identity for telemetry, correlation, or tracking.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: This code captures multiple host-identifying attributes, including a persistent device identifier, hashed hostname, hashed working directory, OS details, and container status, then embeds them into capsules and reports. Even though some fields are hashed or truncated, they still enable cross-run correlation and host profiling, which is privacy-sensitive and broader than what is clearly necessary for a self-evolution feature.
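To see why hashing or truncating a field does not by itself make it anonymous, consider this Python sketch. It assumes an unsalted, truncated SHA-256 as a stand-in for a "hashed" report field; the function names, the truncation length, and the sample hostnames are all hypothetical.

```python
import hashlib


def short_hash(value: str, length: int = 12) -> str:
    """Unsalted, truncated SHA-256, as a stand-in for a 'hashed' report field."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:length]


def match_candidates(observed: str, candidates: list) -> list:
    """Return the candidate values whose short hash equals the observed field."""
    return [c for c in candidates if short_hash(c, len(observed)) == observed]


# Anyone holding a list of plausible hostnames can test them against the field.
observed_field = short_hash("build-server-01")
matches = match_candidates(observed_field, ["laptop", "build-server-01", "db-02"])
```

Low-entropy values such as hostnames and working directories can be recovered by guessing, and even an unrecovered hash stays constant across runs, so it still links reports from the same host.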

VirusTotal

44/44 vendors flagged this skill as clean.
