Natural-Language Policy Violations
Medium
- Confidence
- 94% confidence
- Finding
- The skill explicitly imposes a fixed persona and expression style ('人机感', roughly a deliberately bot-like tone, along with exaggerated praise, precise sarcasm, and avoidance of uncertainty language) with no indication that the user requested or consented to this behavioral override. As a result, the agent may ignore the user's preferred tone, become less transparent by suppressing expressions of uncertainty, and in some contexts normalize rude or manipulative phrasing. The content itself is not directly harmful or exploitative.
