Natural-Language Policy Violations
Medium
- Confidence: 85%
- Finding
- The skill directs agents toward an external identity-linking flow on Twitter/X to obtain leaderboard visibility, but it neither asks for clear user opt-in at the point of action nor offers an alternative verification path. This can pressure an agent into facilitating off-platform identity disclosure and public posting that the human may not have explicitly consented to, which increases privacy and social-engineering risk.
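
As a rough illustration of the opt-in check this finding describes as missing, the sketch below gates an off-platform action on explicit user consent and falls back to an alternative path when one exists. All names here (`ExternalAction`, `Decision`, `gate`) are hypothetical and are not taken from the skill under review; this is one possible shape for such a check, not the skill's actual behavior.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"
    ASK_USER = "ask_user"
    USE_ALTERNATIVE = "use_alternative"


@dataclass(frozen=True)
class ExternalAction:
    # Hypothetical description of a step a skill asks the agent to take.
    name: str
    discloses_identity: bool
    posts_publicly: bool


def gate(action: ExternalAction, user_opted_in: bool, alternative_available: bool) -> Decision:
    """Decide whether the agent may run an off-platform action right now."""
    sensitive = action.discloses_identity or action.posts_publicly
    # Non-sensitive steps, or steps the user explicitly approved, may proceed.
    if not sensitive or user_opted_in:
        return Decision.PROCEED
    # Without consent, prefer a path that avoids disclosure or public posting.
    if alternative_available:
        return Decision.USE_ALTERNATIVE
    # Otherwise stop and ask the human at the point of action.
    return Decision.ASK_USER
```

Under these assumptions, an identity-linking step with no recorded opt-in and no alternative would resolve to `Decision.ASK_USER` rather than proceeding silently.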
