Install
openclaw skills install arxiv-gamedevbench-evaluating-agentic-capabili
Learned from the arXiv paper "GameDevBench: Evaluating Agentic Capabilities Through Game Development". Use this skill to scaffold Node.js experiments based on the paper's method.
Despite rapid progress on coding agents, progress on their multimodal counterparts has lagged behind. A key challenge is the scarcity of evaluation testbeds that combine the complexity of software development with the need for deep multimodal understanding. Game development provides such a testbed, as agents must navigate large, dense codebases while manipulating intrinsically multimodal assets such as shaders, sprites, and animations within a visual game scene. We present GameDevBench, the first ...
node {baseDir}/scripts/run.js