Natural-Language Policy Violations
Medium
- Confidence
- 98% confidence
- Finding
- The example explicitly instructs users to paste a Notion API key into chat, which normalizes insecure secret handling and can lead to credential exposure in logs, transcripts, or downstream systems. In a safety document, this is especially risky because it presents the behavior as acceptable remediation during security-sensitive workflows.
