Ab Test Runner
Security checks across static analysis, malware telemetry, and agentic risk
Overview
This is a coherent A/B testing instruction skill. The main things to note are that it can spawn subagents and that it saves experiment results to persistent memory files.
This skill appears safe to install for prompt and content experimentation. Before using it, be aware that it may run several subagent evaluations and save results under memory/experiments, so avoid confidential test content unless you are comfortable with it being stored and reused in future experiment summaries.
Static analysis
No static analysis findings were reported for this release.
VirusTotal
VirusTotal findings are pending for this skill version.
Risk analysis
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Experiment outputs, scores, and conclusions may persist and influence future A/B testing decisions.
The skill stores experiment results and updates a reusable template in persistent memory files, so later experiments may rely on prior stored conclusions.
Aggregate all results into `memory/experiments/auto-ab-results.json` ... `memory/experiments/auto-ab-hypotheses.json` ... `memory/experiments/AB-test-design-template.md`
Review stored experiment files periodically, avoid placing secrets or private content in test prompts, and confirm template updates before relying on them.
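One way to make the periodic review concrete is a small script that flags secret-like strings in the stored experiment files. The following is a minimal sketch, not part of the skill: the function names and the three patterns are hypothetical and far from exhaustive, and the only detail taken from the review is the `memory/experiments` directory.

```python
import re
from pathlib import Path

# Hypothetical, illustrative patterns; a real review would need a broader set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "key_value_secret": re.compile(
        r"\b(?:api[_-]?key|secret|password|token)\b[\"']?\s*[:=]\s*\S+",
        re.IGNORECASE,
    ),
}


def find_secret_like_strings(text):
    """Return the sorted names of the patterns that match anywhere in text."""
    return sorted(
        name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)
    )


def scan_experiment_dir(root="memory/experiments"):
    """Map each file under root to the pattern names it matches.

    Returns an empty dict when the directory does not exist.
    """
    findings = {}
    base = Path(root)
    if not base.is_dir():
        return findings
    for path in sorted(base.rglob("*")):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="ignore")
            hits = find_secret_like_strings(text)
            if hits:
                findings[str(path)] = hits
    return findings


if __name__ == "__main__":
    for file_name, hits in scan_experiment_dir().items():
        print(f"{file_name}: {', '.join(hits)}")
```

A match is only a prompt for manual inspection; pattern-based checks miss many secrets and can flag harmless text.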
Content submitted for an experiment may be processed by multiple agent instances, which matters if the test includes sensitive prompts or confidential outputs.
The workflow shares experiment tasks, rubrics, generated outputs, and anonymized outputs with subagents for generation and blind scoring.
Spawn N subagents (one per group, or one per task) ... then spawn 1 more subagent to do blind scoring
Use non-sensitive test data where possible, keep the stated concurrency limit, and confirm what content will be shared with subagents before running an experiment.
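To see exactly what a blind-scoring subagent would receive, it can help to build the anonymized payload yourself before running an experiment. The sketch below is an assumption about how such blinding might work, not the skill's actual implementation: `anonymize_outputs`, the `candidate-N` labels, and the `seed` parameter are all hypothetical.

```python
import random


def anonymize_outputs(outputs_by_group, seed=None):
    """Shuffle group outputs and relabel them with neutral names.

    Returns (blinded, key). `blinded` maps neutral labels to outputs and is
    what a scorer would see. `key` maps the same labels back to the original
    group names and is meant to stay with the experiment owner.
    """
    rng = random.Random(seed)
    groups = list(outputs_by_group)
    rng.shuffle(groups)  # hide the original ordering of the groups
    blinded, key = {}, {}
    for index, group in enumerate(groups, start=1):
        label = f"candidate-{index}"
        blinded[label] = outputs_by_group[group]
        key[label] = group
    return blinded, key


if __name__ == "__main__":
    blinded, key = anonymize_outputs(
        {"control": "Headline A", "treatment": "Headline B"}
    )
    print(blinded)  # inspect this payload before sharing it with any scorer
```

Relabeling hides group names only. Sensitive text inside the outputs themselves is still shared, so the advice to use non-sensitive test data still applies.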
