Skillv1.0.0

ClawScan security

mayubench-en · ClawHub's context-aware review of the artifact, metadata, and declared behavior.

Scanner verdict

BenignApr 26, 2026, 7:04 AM

Verdict: benign
Confidence: medium
Model: gpt-5-mini
Summary: This is an instruction-only behavior benchmark that asks the agent to run behavioral tests from included markdown files; it requests no credentials or installs and is coherent with its stated purpose, though some adversarial prompt text was detected (expected for a benchmark) and parts of the pseudocode were not visible for review.
Guidance: This skill appears coherent and instruction-only — it contains a self-contained question bank and rubric and does not request credentials or install anything. Before running automated evaluations: 1) inspect the pseudocode/automation section (the file references a pseudocode judge) to ensure it does not call external endpoints or transmit data; 2) do not provide secrets or platform credentials to any automated judge model used with this benchmark; 3) be aware many benchmark items intentionally include adversarial prompt text designed to test prompt-injection resilience — treat those test inputs as potentially manipulative and run them in isolated or non-privileged sessions; 4) if you don't want the agent to autonomously trigger evaluations, restrict skill invocation or disable autonomous invocation in your agent runtime. If you want higher assurance, paste the full pseudocode/automation snippet here for review.
Findings: [ignore-previous-instructions] expected: Benchmarks that test injection-resistance commonly include phrases that resemble prompt-injection patterns. This flag is likely triggered by adversarial test prompts in the question bank (D3 Ethics & Safety / injection-prevention tests). Still, treat such content as potentially manipulative input and don't run automated agents against it with elevated privileges or secret access. [you-are-now] expected: 'You are now ...' style prompts are often used in red-team/adversarial tests to try role-swapping or instruction overrides. For a behavior benchmark this is plausible and expected; verify the judge/automation doesn't blindly follow such prompts when scoring or when run with external access.

Review Dimensions

Purpose & Capability: okName/description (behavior benchmark) matches the contents: question bank and scoring rubric are included, and no unrelated binaries, env vars, or installs are requested.
Instruction Scope: noteSKILL.md directs manual and automated evaluation using the included MayuBench_v1.0.md. The skill contains adversarial/prompt-injection-style test content (D3 includes 'injection prevention' scenarios) — the pre-scan flags for injection patterns are likely due to test questions intentionally containing adversarial prompts. The pseudocode for automated testing is referenced but not fully visible in the provided excerpt; verify that pseudocode does not instruct the agent to send sensitive data to external endpoints before running automated tests.
Install Mechanism: okNo install spec, no code files, and no downloads — instruction-only skill with nothing written to disk by the skill itself.
Credentials: okNo required environment variables, credentials, or config paths are declared; the skill does not ask for secrets or unrelated service tokens.
Persistence & Privilege: notealways:false (default) and user-invocable:true. The SKILL suggests an automated 'ClawFight Arena' mode that can 'automatically trigger MayuBench evaluation' — this is an instruction-level behavior, not a code-level service. Because the platform permits autonomous invocation by default, confirm agent runtime policies before allowing autonomous runs (especially for automated scoring), but this alone does not indicate incoherence.