Evaluation Benchmark

v1.0.0

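Agent evaluation and testing assistant. Designs evaluation metrics, builds test sets, and generates reports. Use cases: (1) designing evaluation metrics, (2) building test sets, (3) running evaluation tests, (4) analyzing evaluation results.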
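The four use cases above (metric design, test-set construction, evaluation runs, result analysis) can be illustrated with a minimal sketch. This code is not part of the skill, which ships instructions only. Every name here (`EvalCase`, `exact_match`, `run_eval`, `toy_agent`) is hypothetical, and the exact-match metric is just one simple choice assumed for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One test-set entry: an input prompt and the answer we expect."""
    prompt: str
    expected: str


def exact_match(output: str, expected: str) -> float:
    """Metric: 1.0 if the normalized output equals the expected answer, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_eval(agent: Callable[[str], str], cases: list[EvalCase]) -> dict:
    """Run every case through the agent and summarize the scores."""
    scores = [exact_match(agent(c.prompt), c.expected) for c in cases]
    failures = [c.prompt for c, s in zip(cases, scores) if s == 0.0]
    accuracy = sum(scores) / len(scores) if scores else 0.0
    return {"total": len(cases), "accuracy": accuracy, "failures": failures}


def toy_agent(prompt: str) -> str:
    # Hypothetical stand-in for a real agent call.
    answers = {"2 + 2 = ?": "4", "Capital of France?": "Paris"}
    return answers.get(prompt, "unknown")


# A tiny hand-built test set (illustrative data only).
cases = [
    EvalCase("2 + 2 = ?", "4"),
    EvalCase("Capital of France?", "paris"),
    EvalCase("Largest planet?", "Jupiter"),
]

report = run_eval(toy_agent, cases)
print(report)
```

The report lists failed prompts alongside the aggregate score so that result analysis can start from concrete failures. A real evaluation would likely replace exact match with task-specific metrics.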

License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Pending
OpenClaw: Benign (high confidence)
Purpose & Capability
Name, description, and SKILL.md all describe evaluation/benchmark tasks; there are no unrelated environment variables, binaries, or installs requested that would be inconsistent with an evaluation helper.
Instruction Scope
SKILL.md contains only high-level prompts/examples for designing metrics, building test sets, running evaluations, and analyzing results — it does not instruct the agent to read system files, exfiltrate data, or call external endpoints beyond normal conversational behavior.
Install Mechanism
No install spec and no code files are provided (instruction-only), so nothing is written to disk or fetched during install; this is the lowest-risk pattern.
Credentials
No environment variables, credentials, or config paths are required; requested privileges are proportionate to the stated purpose.
Persistence & Privilege
The always flag is set to false and the skill is user-invocable; it does not request persistent presence or modify other skills or system settings.
Assessment
This skill appears coherent and low-risk because it is instruction-only and requests no credentials or installs. Before using it, note that: (1) it provides high-level prompts/examples only — it won't actually run tests or produce artifacts by itself; (2) any test data or model outputs you feed into the agent may contain sensitive information, so avoid submitting secrets or proprietary datasets; and (3) because there is no source/homepage or code, verify results manually and treat outputs as advisory rather than authoritative.



