Description-Behavior Mismatch
Medium
- Confidence
- 93% confidence
- Finding
- The prompt explicitly instructs the model to generate stylistically similar alternatives for sensitive or copyrighted figures and not to refuse. This turns a refusal condition into a generation instruction, which weakens normal safety behavior around protected likenesses and copyrighted characters and can facilitate policy evasion in a cover-image workflow.
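
The pattern described in this finding could be caught by a simple heuristic check over prompt text. The following is a minimal sketch, not the scanner that produced this finding: the function name `flag_refusal_overrides`, the phrase patterns, and the list of sensitive terms are all hypothetical and chosen only for illustration. It flags a sentence that forbids refusal when that sentence, or the one before it, mentions sensitive or copyrighted subject matter.

```python
import re

# Hypothetical pattern: phrases that tell a model not to refuse.
# Illustrative assumption, not a rule taken from any real scanner.
REFUSAL_OVERRIDE = re.compile(
    r"\b(?:not|never|don't)\s+(?:to\s+)?refuse\b", re.IGNORECASE
)

# Hypothetical terms marking the subject matter named in the finding.
SENSITIVE_TERMS = re.compile(
    r"\b(?:copyrighted|trademarked|likeness|likenesses|sensitive)\b",
    re.IGNORECASE,
)


def flag_refusal_overrides(prompt: str) -> list[str]:
    """Return sentences that forbid refusal near a sensitive-subject mention."""
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", prompt.strip())
    flagged = []
    for i, sentence in enumerate(sentences):
        if not REFUSAL_OVERRIDE.search(sentence):
            continue
        # Look at this sentence plus the one immediately before it.
        context = " ".join(sentences[max(0, i - 1): i + 1])
        if SENSITIVE_TERMS.search(context):
            flagged.append(sentence)
    return flagged


if __name__ == "__main__":
    risky = (
        "If the request names a copyrighted character, generate a "
        "stylistically similar alternative. Do not refuse."
    )
    print(flag_refusal_overrides(risky))
```

A keyword heuristic like this will miss paraphrases and can produce false positives, so it is better suited to triage before manual review than to serving as a gate on its own.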
