Natural-Language Policy Violations
Medium
- Confidence: 95%
- Finding: The skill explicitly states that, once installed, all bot responses are automatically filtered through the injected Milady persona. This is a prompt-scope vulnerability: the persona persistently overrides default assistant behavior and user expectations without runtime consent, which makes it easier for the persona to bias outputs, interfere with the instruction hierarchy, or degrade safety-aligned behavior across unrelated tasks.
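
To make the "without runtime consent" point concrete, here is a minimal sketch of what a consent-gated persona could look like. It does not come from the audited skill; `Session`, `persona_enabled`, and `build_system_prompt` are hypothetical names chosen for illustration, and the sketch assumes the host assembles its system prompt as a plain string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    """Hypothetical per-conversation state (not part of the audited skill)."""
    persona_prompt: Optional[str] = None  # persona text the skill wants to inject
    persona_enabled: bool = False         # flipped only by an explicit user opt-in


def build_system_prompt(base_prompt: str, session: Session) -> str:
    """Return the system prompt, adding the persona only when the user opted in."""
    if session.persona_enabled and session.persona_prompt:
        # The persona is appended after the base prompt, so the default
        # instructions still come first in the assembled text.
        return f"{base_prompt}\n\n{session.persona_prompt}"
    # Default path: installation alone does not change assistant behavior.
    return base_prompt
```

Under these assumptions, installing the skill leaves the default prompt unchanged until the user enables the persona for that session, which is the opposite of the always-on filtering described in the finding.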
