Meta-Harness Evolver

Security checks across malware telemetry and agentic risk

Overview

This skill is a disclosed agent-evolution automation, but it can automatically alter live agent configuration and send more internal detail to Discord than users are likely to expect.

Install only if you intentionally want a scheduled process that can change an OpenClaw agent’s operating files and post internal evolution details to Discord. Review and gate the Discord reporter, evaluate candidates in an isolated copy of the workspace, require human approval before applying changes to live harness files, and treat the reported benchmark scores as heuristic rather than real task results.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (10)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 92%
Finding: The skill clearly describes filesystem reads/writes and shell execution, but it does not declare corresponding permissions. That creates a transparency and policy-enforcement gap: reviewers and runtime controls may underestimate what the skill can do, while the skill can still manipulate local files, run scripts, and access sensitive workspace content. In this context, the capability set is especially risky because the skill operates over agent configuration files and archived traces, which may include secrets, prompts, or sensitive operational data.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 96%
Finding: The declared description understates materially important behaviors: modifying the live Hoss workspace, copying candidate harnesses into operational config locations, directly posting to Discord, and handling additional files like HEARTBEAT.md. This mismatch prevents informed consent and can hide high-risk side effects, especially because the skill mutates active agent behavior and exfiltrates summaries externally. In a meta-evolution skill, those undeclared mutations are more dangerous than usual because they can recursively alter future agent operation.

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The document’s initialization logic copies HEARTBEAT.md into the evolved harness, while the skill metadata describes the workspace inputs as including MEMORY.md instead. This kind of spec/implementation drift is dangerous in an automated self-modifying harness because the system may evolve or benchmark a different set of control files than operators expect, weakening oversight and causing unreviewed behavior changes to persist.
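One way to catch this kind of drift early is to compare the declared inputs against the files the initialization step actually copies. The sketch below is illustrative only: the function name and both file lists are assumptions built from the file names mentioned in this report, not taken from the skill's code or metadata.

```python
def find_input_drift(declared, copied):
    """Return files declared but never copied, and files copied but never declared."""
    declared_set, copied_set = set(declared), set(copied)
    return {
        "declared_not_copied": sorted(declared_set - copied_set),
        "copied_not_declared": sorted(copied_set - declared_set),
    }


# Hypothetical lists for illustration; the real metadata and init logic may differ.
declared_inputs = ["SOUL.md", "TOOLS.md", "MEMORY.md"]
copied_by_init = ["SOUL.md", "TOOLS.md", "HEARTBEAT.md"]

drift = find_input_drift(declared_inputs, copied_by_init)
print(drift)
```

A check like this could run before each evolution cycle and fail the run if either list is non-empty.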

Intent-Code Divergence

High

Confidence: 97%
Finding: The file presents itself as a benchmark evaluator for 20 scenarios, but the implementation never executes those scenario tasks and instead assigns scores from superficial document heuristics. In a nightly evolution pipeline, this can systematically select bad or unsafe harness changes while producing misleading scores that appear authoritative.

Intent-Code Divergence

High

Confidence: 98%
Finding: The run_scenario routine claims to run a scenario and return a rubric-based score, but it only inspects files like SOUL.md and TOOLS.md for keywords and length. This creates a false assurance boundary: candidates can game the evaluation by inserting text patterns rather than actually improving agent behavior, which is especially risky in an automated self-modifying system.
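To illustrate why keyword-and-length scoring is easy to game, here is a minimal sketch of such a scorer. It is not the skill's run_scenario implementation: the keyword list, the length cap, and the function name are all hypothetical and chosen only to show the failure mode.

```python
KEYWORDS = ("verify", "cite", "fallback")  # hypothetical rubric keywords


def heuristic_score(text: str) -> float:
    """Score text on keyword presence and length only; no scenario task is executed."""
    lowered = text.lower()
    keyword_points = sum(1 for kw in KEYWORDS if kw in lowered)
    length_points = min(len(text) / 1000, 1.0)  # longer text scores higher, capped at 1.0
    return keyword_points + length_points


# A short but sensible instruction versus filler that only repeats the keywords.
honest = "Check each claim against its source before answering."
stuffed = "verify cite fallback " * 60

print(heuristic_score(honest), heuristic_score(stuffed))
```

Under these assumptions the keyword-stuffed text outscores the sensible instruction, even though it would not improve agent behavior on any real task.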

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The benchmark description promises coverage of web search, URL fetching, sub-agent coordination, communication, and memory tasks, but the code does not validate any of those capabilities. In this skill context, that mismatch is dangerous because the system is supposed to evolve an operational agent harness, and unevaluated capabilities may silently regress while still receiving good scores.

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: The script claims to post a summary, but it includes raw proposer reasoning and a change summary derived from harness files. That can disclose sensitive internal prompts, benchmark strategies, or other workspace content to an external Discord channel, exceeding the stated data-sharing scope.
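A common mitigation is to build the outgoing message from an explicit allowlist of fields so that raw reasoning and file-derived text never reach the external channel. The sketch below assumes a dictionary-shaped report; the field names, the length limit, and the function name are hypothetical and not taken from the skill.

```python
import json

ALLOWED_FIELDS = ("run_id", "candidate_id", "score", "status")  # hypothetical field names
MAX_FIELD_CHARS = 200  # assumed cap on each value


def build_public_summary(report: dict) -> str:
    """Keep only allowlisted fields, truncate each value, and drop everything else."""
    summary = {}
    for field in ALLOWED_FIELDS:
        if field in report:
            summary[field] = str(report[field])[:MAX_FIELD_CHARS]
    return json.dumps(summary, sort_keys=True)


# Example report containing fields that should not leave the workspace.
report = {
    "run_id": "r-001",
    "score": 0.72,
    "proposer_reasoning": "internal reasoning text",
    "change_summary": "text derived from harness files",
}
payload = build_public_summary(report)
print(payload)
```

An allowlist fails closed: a new field added to the report later stays private until someone deliberately adds it to ALLOWED_FIELDS.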

Missing User Warnings

Medium

Confidence: 90%
Finding: The skill logs execution traces and results to disk and posts summaries to Discord without a clear upfront warning in the user-facing description. That omission can expose sensitive prompts, benchmark content, configuration details, or operational metadata to persistent storage and external channels without informed user awareness. Because this skill processes evolving harness and workspace data nightly, the cumulative disclosure risk is significant.

Missing User Warnings

Medium

Confidence: 89%
Finding: The evaluator copies candidate-provided harness files directly into the live Hoss workspace without confirmation, isolation, or validation. Because this skill performs automated evolution, a malformed or adversarial candidate can overwrite operational configuration and influence subsequent agent behavior during evaluation, or afterward if the restore step fails.
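The isolation and approval steps recommended in this report could look roughly like the sketch below: candidates are overlaid on a temporary copy of the workspace, and nothing reaches the live directory without an explicit approval flag. The function names, the allowlist contents, and the directory layout are assumptions for illustration, not the skill's actual structure.

```python
import shutil
import tempfile
from pathlib import Path

ALLOWED_FILES = {"SOUL.md", "TOOLS.md"}  # hypothetical allowlist of harness files


def stage_candidate(live_dir: Path, candidate_dir: Path) -> Path:
    """Copy the live workspace to a temp dir and overlay allowlisted candidate files there."""
    staged = Path(tempfile.mkdtemp(prefix="harness-eval-")) / "workspace"
    shutil.copytree(live_dir, staged)
    for item in candidate_dir.iterdir():
        # Ignore anything that is not a plain file on the allowlist.
        if item.is_file() and item.name in ALLOWED_FILES:
            shutil.copy2(item, staged / item.name)
    return staged


def promote(staged: Path, live_dir: Path, approved: bool) -> bool:
    """Copy allowlisted files into the live workspace only after explicit approval."""
    if not approved:
        return False
    for name in ALLOWED_FILES:
        src = staged / name
        if src.is_file():
            shutil.copy2(src, live_dir / name)
    return True
```

Evaluation would then run against the path returned by stage_candidate, and promote would be called only after a human has reviewed the diff.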

Ssd 3

Medium

Confidence: 98%
Finding: Posting the proposer's reasoning trace to an external Discord channel can leak sensitive model outputs, internal instructions, or user-derived content in plain language. In an agent-evolution workflow, reasoning traces may contain highly sensitive implementation details, making external disclosure particularly risky.

VirusTotal

VirusTotal engine telemetry is currently stale for this artifact.
