Ai Agent Evaluator

Pass

Audited by ClawScan on May 15, 2026.

Overview

This instruction-only skill appears aligned with its stated purpose of evaluating AI agents. The main caution is that users may share sensitive logs or conversations during evaluation.

This appears safe to use as an advisory evaluation assistant. Before pasting logs, transcripts, or customer conversations, remove personal data, secrets, credentials, and internal prompts. Also note that the package has limited source metadata, although it contains no executable code.

Findings (2)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Low

#ASI06: Memory and Context Poisoning

What this means

If you paste real logs or conversations, sensitive information could enter the model context during the evaluation.

Why it was flagged

The skill expects users to provide agent logs, failed transcripts, or sample conversations. That is aligned with evaluation work, but those materials can contain customer data, secrets, or adversarial prompt content.

Skill content

"Input": Agent logs, failed task transcripts ... "share 10 sample conversations (anonymized)"

Recommendation

Only share selected, relevant examples; redact secrets, personal data, customer identifiers, and internal system prompts before use.
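As a rough illustration of the redaction step recommended above, the following Python sketch masks a few common kinds of sensitive strings before a log or transcript is pasted into an evaluation. The patterns, the placeholder labels, and the redact helper are assumptions made for this example; they are not part of the skill or of ClawScan, and they will miss many real-world secret and identifier formats. Manual review is still needed.

```python
import re

# Hypothetical patterns for illustration only. Real logs need a broader,
# reviewed set, and regexes alone will not catch every secret or identifier.
REDACTION_PATTERNS = [
    # Email addresses.
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "[EMAIL]"),
    # Long tokens with a common key-like prefix, e.g. "sk-..." (assumed format).
    (re.compile(r"\b(?:sk|pk|api|key|token)[-_][A-Za-z0-9_-]{16,}\b"), "[SECRET]"),
    # key=value or key: value pairs whose key name suggests a credential.
    (
        re.compile(
            r"\b(password|passwd|secret|token|api[_-]?key)\s*[:=]\s*\S+",
            re.IGNORECASE,
        ),
        r"\1=[REDACTED]",
    ),
]


def redact(text: str) -> str:
    """Apply each pattern in order and return the masked text."""
    for pattern, replacement in REDACTION_PATTERNS:
        text = pattern.sub(replacement, text)
    return text


sample = "User alice@example.com logged in with api_key=abc123XYZ"
print(redact(sample))
```

A pass like this is best treated as a first filter. Internal system prompts and customer identifiers usually have no fixed shape, so they still have to be removed by hand.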

Info

#ASI04: Agentic Supply Chain Vulnerabilities

What this means

Users have limited information with which to verify the publisher, source history, or exact packaged version.

Why it was flagged

The registry metadata provides no source or homepage, and the SKILL.md front matter declares version "3.0.0", which differs from the registry metadata version, 1.0.1. This is a provenance/packaging clarity issue, though impact is limited because no code or install steps are present.

Skill content

Source: unknown; Homepage: none; Registry metadata version: 1.0.1

Recommendation

Treat this as a documentation/provenance note; prefer skills with clear source links and consistent version metadata when that matters for your workflow.
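For workflows where version consistency matters, the mismatch described in this finding can be checked mechanically. The Python sketch below is a minimal example under several assumptions: that SKILL.md begins with a front matter block delimited by "---" lines, that it contains a simple "version:" line, and that the registry version is available as a plain string. The front_matter_version helper and the sample file contents (including the name field) are hypothetical; only the two version numbers come from this report.

```python
import re
from typing import Optional


def front_matter_version(skill_md: str) -> Optional[str]:
    """Return the 'version' value from a leading '---' front matter block, if any."""
    match = re.match(r"---\s*\n(.*?)\n---", skill_md, re.DOTALL)
    if not match:
        return None
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "version":
            # Drop surrounding whitespace and optional quotes.
            return value.strip().strip("\"'")
    return None


# Hypothetical SKILL.md contents; only the version numbers are from the report.
sample_skill_md = '---\nname: example-skill\nversion: "3.0.0"\n---\nBody text\n'
registry_version = "1.0.1"

declared = front_matter_version(sample_skill_md)
if declared != registry_version:
    print(f"Version mismatch: front matter {declared}, registry {registry_version}")
```

This uses naive line splitting rather than a YAML parser, so it only handles flat "key: value" front matter.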