Natural-Language Policy Violations
High
- Confidence: 98%
- Finding: The skill explicitly instructs the model to generate scripts that use fake supporting participants ('托', i.e. shills), simulated user reactions, and staged testimonials to create the appearance of authentic group consensus. This is deceptive social-engineering content: it can mislead users into trusting fabricated endorsements and yielding to artificially induced purchase pressure.
