Skill v1.0.0
ClawScan security
Ab Test Agent Workflow 1.1.0 · ClawHub's context-aware review of the artifact, metadata, and declared behavior.
Scanner verdict
Benign · Apr 19, 2026, 10:56 AM
- Verdict
- benign
- Confidence
- high
- Model
- gpt-5-mini
- Summary
- The skill is an internally consistent A/B (double-blind) multi‑agent testing workflow: its instructions, included helper scripts, and requested resources align with the stated purpose and it does not request credentials or external installs.
- Guidance
- This skill appears to do what it says: coordinate A/B blind comparisons using subagents, anonymize outputs, and have a judge score them. Before installing or running:
  1) Review the included scripts locally (anonymizer.py, judge_prompts.py, runner.py); do not run them unexamined on sensitive data.
  2) Be cautious about executing any code that contestants produce (the prompts encourage contestants to output runnable code); run such code only in a secure sandbox.
  3) The runner.py shown in the package preview appears truncated (an unfinished line near the end); ensure you have a complete, syntactically valid copy before use.
  4) Confirm that spawning subagents (sessions_spawn) and any model calls will occur on the platform you expect (these are platform features, not external network calls embedded in the skill).
  5) If you need stronger guarantees about identity removal, audit or extend anonymizer.IDENTITY_PATTERNS to match your models' signature phrases.
  Overall, the package is coherent and proportionate to its purpose, but exercise normal caution when running the scripts or executing generated code.
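The completeness check suggested for runner.py can be sketched with Python's standard ast module, which parses a file without executing it. This is a minimal sketch, not part of the skill: the helper names check_python_source and check_python_file are hypothetical, and the script names are taken from the guidance above and assumed to sit in the current directory.

```python
from __future__ import annotations

import ast
from pathlib import Path


def check_python_source(source: str, name: str = "<string>") -> tuple[bool, str]:
    """Return (True, "") if the source parses as Python, else (False, reason)."""
    try:
        ast.parse(source, filename=name)
    except SyntaxError as exc:
        return False, f"{name}:{exc.lineno}: {exc.msg}"
    return True, ""


def check_python_file(path: str) -> tuple[bool, str]:
    """Parse a file without running it (hypothetical helper, not from the skill)."""
    return check_python_source(Path(path).read_text(encoding="utf-8"), name=path)


if __name__ == "__main__":
    # Script names come from the review; their location is an assumption.
    for script in ("anonymizer.py", "judge_prompts.py", "runner.py"):
        if Path(script).exists():
            ok, reason = check_python_file(script)
            print(script, "OK" if ok else f"FAILED ({reason})")
        else:
            print(script, "not found")
```

A passing parse only shows that the file is syntactically valid. A copy cut off exactly at a statement boundary would still pass, so comparing against the published source remains worthwhile.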
Review Dimensions
- Purpose & Capability
- ok: Name/description (multi‑agent double‑blind A/B testing) matches the included artifacts: SKILL.md describes coordinator/contestant/judge roles and the repo contains runner.py, anonymizer.py and judge_prompts.py which implement that workflow. No unrelated credentials, binaries, or config paths are requested.
- Instruction Scope
- note: SKILL.md directs the agent to spawn subagents (sessions_spawn) and to use the included scripts or inline prompts to run the workflow; it does not instruct reading arbitrary system files, harvesting env vars, or posting results to third‑party endpoints. The workflow collects model outputs and can include code-generation tasks, so do not automatically execute untrusted generated code without sandboxing.
- Install Mechanism
- ok: No install spec is present (the skill is instruction-only, with local scripts included). No downloads, package installs, or external installers are requested.
- Credentials
- ok: The skill requests no environment variables or credentials. The code uses only standard libs and in‑memory data structures; there are no hidden credential accesses in the provided files.
- Persistence & Privilege
- ok: The "always" setting is false, and the skill does not request persistent platform privileges or alter other skills' configuration. The included anonymizer stores its label-to-identity mapping in memory and exposes it via APIs/CLI so identities can be revealed in the report, which is expected for the stated purpose.
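To make the anonymizer behavior described above concrete, here is a minimal sketch of blind labeling with an in-memory mapping and pattern-based identity scrubbing. It is not the skill's actual code: the BlindLabeler class, its method names, the placeholder text, and the two example patterns are all assumptions, and the real anonymizer.IDENTITY_PATTERNS may be structured differently.

```python
from __future__ import annotations

import random
import re

# Hypothetical patterns for illustration; the skill's real list may differ.
IDENTITY_PATTERNS = [
    re.compile(r"\bas an AI (?:language )?model\b", re.IGNORECASE),
    re.compile(r"\bI am (?:Claude|ChatGPT|Gemini)\b", re.IGNORECASE),
]


class BlindLabeler:
    """Assigns shuffled blind labels and keeps the mapping in memory only."""

    def __init__(self, seed: int | None = None) -> None:
        self._rng = random.Random(seed)
        self._mapping: dict[str, str] = {}  # blind label -> real contestant name

    def anonymize(self, outputs: dict[str, str]) -> dict[str, str]:
        """Return outputs keyed by blind label, with identity phrases scrubbed."""
        names = list(outputs)
        self._rng.shuffle(names)  # label order must not leak contestant order
        blinded = {}
        for index, name in enumerate(names):
            # Letter labels assume at most 26 contestants.
            label = f"Response {chr(ord('A') + index)}"
            self._mapping[label] = name
            text = outputs[name]
            for pattern in IDENTITY_PATTERNS:
                text = pattern.sub("[identity removed]", text)
            blinded[label] = text
        return blinded

    def reveal(self) -> dict[str, str]:
        """Return a copy of the mapping, for use after judging is complete."""
        return dict(self._mapping)
```

In a sketch like this, the judge would see only the labeled, scrubbed texts, and reveal() would be called once scoring is final. Extending the pattern list for a given model amounts to appending compiled regular expressions for its signature phrases.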
