Skill Eval

Security checks across malware telemetry and agentic risk

Overview

This skill’s evaluation behavior is coherent and disclosed, but it spawns evaluation sessions and keeps histories, so use it with test data and safe prompts.

This skill appears appropriate for evaluating OpenClaw skills. Before installing, plan to run it in a test workspace, use non-production prompts and credentials, and review or delete retained eval histories and output folders after use.

VirusTotal

VirusTotal findings are pending for this skill version.

View on VirusTotal

Risk analysis

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

ASI02: Tool Misuse and Exploitation

Low

What this means

An eval can run multiple agent sessions and those sessions may use tools available in the environment.

Why it was flagged

The skill intentionally launches subagent sessions to run eval tasks. This is core to the evaluation purpose, but eval prompts can cause real tool activity depending on the evaluated skill.

Skill content

sessions_spawn(task=task, mode="run", cleanup="keep", label="trigger-eval-{id}")

Recommendation

Use read-only or test eval prompts, especially for skills connected to production services or accounts.

ASI03: Identity and Privilege Abuse

Low

What this means

If the evaluated skill has access to credentials or real services, eval sessions may also be able to use them.

Why it was flagged

Inherited sandboxing is disclosed and helps the eval access registered skills, but it means spawned sessions may share the user’s local environment and available tool permissions.

Skill content

`sandbox="inherit"` — subagents must inherit the skill registration environment

Recommendation

Run evaluations in a test workspace or with limited-scope/test credentials where possible.

ASI06: Memory and Context Poisoning

Medium

What this means

Evaluation output folders and retained sessions may contain private prompts, tool outputs, paths, or service data.

Why it was flagged

The skill stores full evaluation records, including tool calls and results, which may include sensitive information from test sessions.

Skill content

Keep full records — save `full_history.json` (including tool_use + tool_result)

Recommendation

Avoid running evals with secrets or private production data, and clean up retained sessions/output directories when no longer needed.

ASI01: Agent Goal Hijack

Low

What this means

A malicious or adversarial eval transcript could distort grading results or recommendations.

Why it was flagged

Raw transcripts are inserted into grader prompts for analysis. This is expected for grading, but transcript text could contain instructions that try to influence the grader.

Skill content

--- VARIANT A TRANSCRIPT ---\n{variant_a_transcript}\n\n--- VARIANT B TRANSCRIPT ---\n{variant_b_transcript}

Recommendation

Treat grader outputs as evaluative evidence, not authority; consider adding explicit instructions that transcript content is untrusted data.