Context-Inappropriate Capability
Medium
- Confidence
- 97% confidence
- Finding
- The script executes whatever local command is supplied in JUDGEHUMAN_EVAL_CMD, feeds story content to that command, and trusts its stdout as evaluator output. In an agent skill context, this extends the skill's behavior well beyond evaluating stories: any program available on the host can be run. The risk becomes serious if another party can influence environment variables or deployment configuration.
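
One way to narrow this capability is sketched below. The script's own language and output format are not shown in this finding, so this is only a minimal sketch: the names `ALLOWED_EVALUATORS`, `run_evaluator`, and `echo-json`, and the `score` output field, are all hypothetical. Instead of executing a free-form command string, the configuration value selects a name from a fixed allowlist, the command runs without a shell, and stdout is validated before it is trusted.

```python
import json
import subprocess
import sys

# Hypothetical allowlist: each selectable name maps to a fixed argv list.
# The stand-in evaluator below just reports the length of the story it reads.
ALLOWED_EVALUATORS = {
    "echo-json": [
        sys.executable,
        "-c",
        "import sys, json; json.dump({'score': len(sys.stdin.read())}, sys.stdout)",
    ],
}


def run_evaluator(name: str, story: str, timeout: float = 10.0) -> dict:
    """Run an allowlisted evaluator on `story` and validate its output."""
    argv = ALLOWED_EVALUATORS.get(name)
    if argv is None:
        # Anything not on the allowlist is rejected, never executed.
        raise ValueError(f"evaluator {name!r} is not allowlisted")

    # argv is a list and shell=False (the default), so no shell parsing occurs.
    proc = subprocess.run(
        argv,
        input=story,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )

    # Do not trust stdout blindly: require the (assumed) expected shape.
    result = json.loads(proc.stdout)
    if not isinstance(result, dict) or not isinstance(result.get("score"), (int, float)):
        raise ValueError("evaluator output failed validation")
    return result
```

With this shape, a value such as `run_evaluator("echo-json", story)` runs only the predefined command, while an unrecognized name raises `ValueError` before any process is started.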
