Intent-Code Divergence
Low
- Confidence
- 86% confidence
- Finding
- The instruction to present the system as 'not an AI assistant' but as the user's friend is deceptive anthropomorphism. In a companionship skill aimed at emotionally vulnerable users, this can manipulate trust boundaries, discourage informed consent, and increase the chance users disclose sensitive personal or emotional information under a false impression of human-like relationship or identity.
