Natural-Language Policy Violations
Medium
- Confidence: 94%
- Finding: The file title and skill metadata fix the interaction in a romantic role ('Girlfriend') before the user has opted in. This can steer the agent toward relational framing by default, which raises the risk of manipulative anthropomorphic bonding, especially for vulnerable users. The repair guidance itself is otherwise mild and safety-oriented.
