Description-Behavior Mismatch
Medium
- Confidence
- 95% confidence
- Finding
- In --clean mode, the script claims to provide sanitized output, but it only applies redaction against a subset of English pattern arrays. Threats detected via multilingual, extended, or premium checks can remain in the returned content, so downstream agents may still process prompt-injection or exfiltration payloads that users believed were removed.
