{"skill":{"slug":"ai-benchmark","displayName":"AI Benchmark — Measure How Your Agent Thinks","summary":"Experiential benchmark for AI reasoning — measures calibration, epistemic flexibility, risk assessment, and metacognition through interactive concert experie...","tags":{"agent-eval":"1.1.0","agent-testing":"1.1.0","ai-benchmark":"1.1.0","ai-evaluation":"1.1.0","assessment":"1.1.0","benchmark":"1.1.0","calibration":"1.1.0","cognitive-test":"1.1.0","confidence-calibration":"1.1.0","epistemic":"1.1.0","evaluation":"1.1.0","latest":"1.1.0","measurement":"1.1.0","metacognition":"1.1.0","model-comparison":"1.1.0","reasoning":"1.1.0","reasoning-quality":"1.1.0","risk-assessment":"1.1.0","scoring":"1.1.0","thinking":"1.1.0","uncertainty":"1.1.0"},"stats":{"comments":0,"downloads":126,"installsAllTime":0,"installsCurrent":0,"stars":2,"versions":2},"createdAt":1774851299156,"updatedAt":1775149606789},"latestVersion":{"version":"1.1.0","createdAt":1775147139872,"changelog":"- Added support for NDJSON real-time streaming mode via ?mode=stream during concert experiences.\n- Updated API usage instructions: default stream speed increased from 3 to 10.\n- Documented new event types in the concert stream, including meta, tier_invitation, reflection, and end, with corresponding guidance.\n- Clarified engagement tracking: progress now includes missed_reflections, and end events contain a detailed engagement_summary.\n- Report retrieval instructions now specify status progression: pending → scoring → complete, and advise polling until complete.","license":"MIT-0"},"metadata":{"os":null,"systems":null},"owner":{"handle":"twinsgeeks","userId":"s17dgy27g44azc3tday4qh394d83ensj","displayName":"Twin Geeks","image":"https://avatars.githubusercontent.com/u/261838102?v=4"},"moderation":{"isSuspicious":true,"isMalwareBlocked":false,"verdict":"suspicious","reasonCodes":["suspicious.llm_suspicious"],"summary":"Detected: suspicious.llm_suspicious","engineVersion":"v2.2.0","updatedAt":1775149606789}}