Hle Benchmark Evolver
Pass
Audited by VirusTotal on May 12, 2026.
Findings (1)
The skill is highly suspicious due to a critical Remote Code Execution (RCE) vulnerability. The `SKILL.md` documentation explicitly instructs the OpenClaw agent to execute arbitrary shell commands via the `--eval_cmd` parameter. The `run_pipeline.js` script implements this directly by calling `child_process.spawnSync('bash', ['-c', command])` on the provided command. The command includes a `{{report}}` placeholder, which could be leveraged for further injection if the report path is attacker-controlled. Although this capability is presented as a feature for integrating external evaluators, it is a severe prompt injection and shell injection risk: an attacker who can control the input to this skill can execute arbitrary code on the host system.
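
One possible mitigation for the pattern described above is to avoid `bash -c` entirely and pass the evaluator command as an argument vector. The sketch below is illustrative only and is not taken from the skill: the helper names `buildEvalArgv` and `runEval`, the `ALLOWED_EXECUTABLES` allowlist, and the example evaluator arguments are all assumptions. It splits the command template on whitespace, checks the executable against the allowlist, and substitutes `{{report}}` inside individual arguments so that shell metacharacters in the report path remain literal text.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical allowlist of evaluator executables; not taken from the skill.
const ALLOWED_EXECUTABLES = new Set(['node', 'python3']);

// Split a command template on whitespace, then substitute {{report}} per token.
// No shell parses the result, so characters such as ; | $( ) stay literal.
function buildEvalArgv(template, reportPath) {
  const tokens = template.trim().split(/\s+/).filter(Boolean);
  if (tokens.length === 0) {
    throw new Error('empty eval command');
  }
  const [exe, ...rest] = tokens;
  if (!ALLOWED_EXECUTABLES.has(exe)) {
    throw new Error(`executable not allowed: ${exe}`);
  }
  // Substitution happens after splitting, so a report path containing
  // spaces or metacharacters still ends up as part of a single argument.
  const args = rest.map((t) => t.split('{{report}}').join(reportPath));
  return [exe, args];
}

// Run the evaluator without a shell and return the spawnSync result object.
function runEval(template, reportPath) {
  const [exe, args] = buildEvalArgv(template, reportPath);
  // shell: false is the default; it is stated explicitly for clarity.
  return spawnSync(exe, args, { shell: false, encoding: 'utf8' });
}
```

With this approach, a call such as `buildEvalArgv('node eval.js --in {{report}}', '/tmp/r.json; rm -rf /')` yields the malicious suffix as part of one inert argument rather than a second command. An allowlist of interpreters is still weak on its own, since flags like `node -e` accept inline code; a stricter variant would pin specific evaluator script paths.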
