Description-Behavior Mismatch
High
- Confidence: 99%
- Finding: The skill metadata explicitly promises a strict 90% minimum approval gate, but the implementation marks skills as passed at 70% and above. This creates a policy-enforcement mismatch that can allow lower-quality or unsafe skills to be incorrectly approved, undermining downstream trust in the evaluator.
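
To make the gap concrete, here is a minimal sketch of the mismatch described in the finding. The audited code is not reproduced in this report, so every name below (`DOCUMENTED_MIN_APPROVAL`, `IMPLEMENTED_MIN_APPROVAL`, and the three functions) is hypothetical, and scores are assumed to be fractions between 0 and 1. Only the two thresholds, 90% and 70%, come from the finding itself.

```python
# Hypothetical names throughout: the audited skill's real code is not shown here.
# Only the two threshold values are taken from the finding.
DOCUMENTED_MIN_APPROVAL = 0.90   # gate promised in the skill metadata
IMPLEMENTED_MIN_APPROVAL = 0.70  # gate the implementation reportedly applies


def passes_as_implemented(score: float) -> bool:
    """Reproduce the reported behavior: pass at 70% and above."""
    return score >= IMPLEMENTED_MIN_APPROVAL


def passes_as_documented(score: float) -> bool:
    """Apply the gate the metadata promises: pass only at 90% and above."""
    return score >= DOCUMENTED_MIN_APPROVAL


def is_wrongly_approved(score: float) -> bool:
    """True when a score passes in practice but fails the documented gate."""
    return passes_as_implemented(score) and not passes_as_documented(score)


if __name__ == "__main__":
    # Walk a few sample scores across both thresholds.
    for sample in (0.50, 0.70, 0.85, 0.90, 0.95):
        print(f"{sample:.2f} wrongly approved: {is_wrongly_approved(sample)}")
```

Under these assumptions, any score from 70% up to but not including 90% is approved even though the metadata says it should be rejected. One possible remediation, not confirmed by the finding, is to have the code and the metadata read the threshold from a single shared constant so the two cannot drift apart.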
