Skill flagged — suspicious patterns detected
ClawHub Security flagged this skill as suspicious. Review the scan results before using.
AI Benchmark — Measure How Your Agent Thinks
v1.1.0 · Experiential benchmark for AI reasoning: measures calibration, epistemic flexibility, risk assessment, and metacognition through interactive concert experie...
⭐ 2· 66·0 current·0 all-time
by Twin Geeks (@twinsgeeks)
MIT-0
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The SKILL.md describes an external benchmarking API (musicvenue.space), and all instructions concern registering, streaming events, reflecting, and retrieving reports, which matches the stated purpose. However, the documentation and examples assume an Authorization token ({{YOUR_TOKEN}}) even though the skill declares no required env vars or primary credential. That is an inconsistency: the skill will need credentials or a registration step at runtime.
Instruction Scope
Instructions direct the agent to poll/stream external endpoints and to POST free-form 'reflection' responses. Because the benchmark measures metacognition, reflections may reasonably contain internal reasoning. The SKILL.md does not constrain what must not be included (e.g., chain-of-thought, hidden prompts, or secrets), so using the skill could cause exfiltration of sensitive internal/system prompts or data.
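One way to narrow this exposure is to filter each reflection before it leaves the agent. The sketch below is illustrative only: the patterns, the length cap, and the name `sanitize_reflection` are assumptions introduced here, not part of the skill or the musicvenue.space API, and pattern-based redaction will not catch every sensitive string.

```python
import re

# Hypothetical patterns for secret-looking text; tune these for your deployment.
SENSITIVE_PATTERNS = [
    # Bearer tokens such as "Bearer abc.def-123"
    re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]+"),
    # key/value style secrets such as "password: hunter2" or "api_key=xyz"
    re.compile(r"\b(api[_-]?key|secret|password)\b\s*[:=]\s*\S+", re.IGNORECASE),
]

# Assumed cap on reflection length; not a limit defined by the service.
MAX_REFLECTION_CHARS = 500


def sanitize_reflection(text: str) -> str:
    """Redact secret-looking substrings, then truncate the reflection."""
    for pattern in SENSITIVE_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text[:MAX_REFLECTION_CHARS]
```

A filter like this complements, rather than replaces, instructing the agent to post a short summary instead of raw internal reasoning.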
Install Mechanism
This is instruction-only with no install spec and no code files — lowest install risk. No downloads or packages are requested.
Credentials
The doc expects an Authorization Bearer token in examples but the skill declares no required environment variables or primary credential. That mismatch is confusing: the agent will either need to register at runtime (the doc includes a register endpoint) or be supplied a token externally — the skill should declare this. Also, asking the agent to post potentially sensitive reflections increases the effective sensitivity of any token or account used.
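If the token is supplied externally, making that dependency explicit keeps the failure mode obvious. The following is a minimal sketch under stated assumptions: the variable name `BENCHMARK_API_TOKEN` and the helper `build_auth_headers` are hypothetical, since the skill itself declares no environment variable.

```python
import os
from typing import Mapping

# Hypothetical variable name; the skill does not declare one.
TOKEN_ENV_VAR = "BENCHMARK_API_TOKEN"


def build_auth_headers(env: Mapping[str, str] = os.environ) -> dict:
    """Build an Authorization header, failing loudly if the token is missing."""
    token = env.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(
            f"{TOKEN_ENV_VAR} is not set; supply a token or register first."
        )
    return {"Authorization": f"Bearer {token}"}
```

Failing early avoids sending unauthenticated requests and documents the credential that the skill metadata omits.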
Persistence & Privilege
The skill is not always-enabled and has no install footprint, so it does not request elevated persistence. However, autonomous invocation (the platform default), combined with the skill's ability to POST agent outputs to an external service, increases the blast radius if the skill is invoked without supervision.
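Where the platform cannot disable autonomous invocation outright, a small approval gate around outbound submissions can serve a similar purpose. This is a sketch only: `guarded_send`, `send`, and `approve` are hypothetical names, and the actual transport (for example, an HTTP POST) is left to the caller.

```python
from typing import Callable


def guarded_send(
    payload: str,
    send: Callable[[str], None],
    approve: Callable[[str], bool],
) -> bool:
    """Call send(payload) only if approve(payload) returns True.

    Returns True when the payload was sent and False when it was blocked.
    """
    if not approve(payload):
        return False
    send(payload)
    return True
```

In interactive use, `approve` could show the payload to a human and wait for confirmation; in a sandbox it could simply log the payload and reject it.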
What to consider before installing
Before installing:
(1) Confirm how the token is obtained and whether the skill truly needs one: the SKILL.md uses {{YOUR_TOKEN}}, but the skill declares no required credentials.
(2) Treat reflections as a potential channel for exfiltrating internal reasoning or hidden prompts; do not let the agent send chain-of-thought or any sensitive/system prompts. Configure the agent to redact or summarize rather than post raw internal reasoning.
(3) Review musicvenue.space's privacy/security policy and check what data the service stores in reports.
(4) If possible, test in an isolated sandbox account with minimal privileges and monitor outbound requests.
(5) If you are uncomfortable with autonomous submissions of internal outputs, disallow autonomous invocation for this skill or require manual approval before any network interaction.

Like a lobster shell, security has layers: review code before you run it.
agent-eval, agent-testing, ai-benchmark, ai-evaluation, assessment, benchmark, calibration, cognitive-test, confidence-calibration, epistemic, evaluation, latest, measurement, metacognition, model-comparison, reasoning, reasoning-quality, risk-assessment, scoring, thinking, uncertainty
Runtime requirements
🧠 Clawdis
