Natural-Language Policy Violations
Medium
- Confidence
- 90% confidence
- Finding
- The skill unconditionally forces output in Chinese regardless of the user's language preference. This can override user intent, reduce transparency of the model's reasoning to the user, and create opportunities for confusion or misdelivery in multilingual settings, though it is not directly enabling code execution or data exfiltration.
