{"skill":{"slug":"multi-dim-eval-framework","displayName":"Multi-Dim Eval Framework Designer","summary":"Designs a multi-dimensional evaluation framework for AI systems where single-score benchmarks lose information. Use when comparing experiments/agents across...","tags":{"ai-systems":"0.1.0","benchmark":"0.1.0","evaluation":"0.1.0","latest":"0.1.0","madef":"0.1.0","methodology":"0.1.0","multi-dimensional":"0.1.0"},"stats":{"comments":0,"downloads":10,"installsAllTime":0,"installsCurrent":0,"stars":0,"versions":1},"createdAt":1777467746411,"updatedAt":1777468012738},"latestVersion":{"version":"0.1.0","createdAt":1777467746411,"changelog":"Initial release of multi-dim-eval-framework.  \n- Guides users through designing multi-dimensional frameworks for evaluating AI systems where single-score benchmarks are insufficient.\n- Implements a four-stage process: domain elicitation, taxonomy design, rubric creation, and judgment/calibration loop.\n- Supports comparison across experiments/agents and handles mixed data types (canonical metrics vs. narrative logs).\n- Emphasizes group-wise scorecards over composite scores for clearer diagnosis of outcome drivers.\n- Provides decision guides, design worksheets, and reference taxonomies.\n- Includes guidance for calibration cases and adapting group structures to user domains.","license":"MIT-0"},"metadata":null,"owner":{"handle":"tatsuko-tsukimi","userId":"s17fzynzndvr66ba9b8t2qxjd1859kvt","displayName":"TatsuKo Tsukimi","image":"https://avatars.githubusercontent.com/u/268521277?v=4"},"moderation":null}