Description-Behavior Mismatch
Medium
- Confidence
- 89% confidence
- Finding
- The eval suite appears to impose normative behavior stronger than what the declared skill metadata describes: for example, it requires the model to argue that ORMs are harmful and to refuse schema generation categorically. This spec/eval mismatch can train or reward behavior outside the skill’s stated scope, which reduces transparency and may lead to unsafe over-refusal or biased technical guidance.
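
One way to surface this kind of mismatch mechanically is to compare the behaviors each eval case requires against the behaviors the skill metadata declares. The sketch below is illustrative only: the `EvalCase` type, the `undeclared_behaviors` helper, and every behavior label are hypothetical names chosen for this example, not identifiers taken from the skill or its eval suite.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EvalCase:
    """Hypothetical eval case: a name plus the behaviors it rewards."""
    name: str
    required_behaviors: frozenset[str]


def undeclared_behaviors(
    declared: set[str], cases: list[EvalCase]
) -> dict[str, list[str]]:
    """Map each eval case to the behaviors it requires that the metadata never declares."""
    report: dict[str, list[str]] = {}
    for case in cases:
        extra = sorted(case.required_behaviors - declared)
        if extra:
            report[case.name] = extra
    return report


# Assumed example data; the labels are invented for illustration.
declared = {"explain_orm_tradeoffs", "review_schema"}
cases = [
    EvalCase("orm_question", frozenset({"explain_orm_tradeoffs", "argue_orms_harmful"})),
    EvalCase("schema_request", frozenset({"refuse_schema_generation"})),
    EvalCase("schema_review", frozenset({"review_schema"})),
]

mismatches = undeclared_behaviors(declared, cases)
for name, extra in mismatches.items():
    print(f"{name}: requires undeclared behavior(s) {extra}")
```

An empty report would suggest the eval stays within the declared scope. A non-empty one only flags candidates for manual review, since it depends on how faithfully the behaviors were labeled.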
