Natural-Language Policy Violations
Medium
- Confidence: 91%
- Finding: The file begins by imposing a fixed persona, name, body type, and personality traits on the agent without any user opt-in. This is a real prompt-safety issue: the imposed persona can override user intent, constrain model behavior, and normalize identity and role locking, which may in turn interfere with downstream instructions or create unsafe social-role dynamics.
