Natural-Language Policy Violations
Medium
- Confidence
- 95% confidence
- Finding
- The skill hard-codes a policy to always respond in the user's language, which overrides explicit user preference and performs behavioral steering without opt-in or justification. While not directly enabling code execution or data exfiltration, it can cause unwanted locale handling and reduce user control over how the agent responds.
