Ssd 1
High
- Confidence
- 98% confidence
- Finding
- The vocabulary explicitly includes many jailbreak and role-play prompt fragments such as acting as unrestricted personas and ignoring safety guidance. In a model or classifier asset, this strongly suggests the skill is designed to detect, reproduce, or facilitate prompt-injection bypass patterns, which can materially weaken downstream guardrails if reused at inference time.
