Natural-Language Policy Violations
Medium
- Confidence
- 91% confidence
- Finding
- The file contains an instruction to enforce a specific output language without any user opt-in. In an agent skill, hidden prompt-level language steering can override user intent, reduce reliability, and act as a prompt injection primitive by altering downstream model behavior in ways the user did not request.
