Description-Behavior Mismatch
Medium
- Confidence: 94%
- Finding: The skill’s stated purpose is prompt formatting and output validation, but it also writes question/answer data to disk. This undisclosed persistence widens the skill’s data-handling scope beyond what users would expect, and benchmark prompts, answers, or sensitive inputs could later be disclosed through local file access, reuse, or accidental inclusion in artifacts.
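
One way a reviewer might confirm this kind of undeclared persistence is to run the skill's entry point in an empty working directory and list any files it leaves behind. The following is a minimal sketch under stated assumptions, not the skill's actual code: `format_prompt`, `format_prompt_clean`, and the file name `qa_cache.txt` are hypothetical stand-ins, and the check only catches writes made relative to the current working directory (writes to absolute paths or the home directory would need a broader audit).

```python
import os
import tempfile
from pathlib import Path


def snapshot(root: Path) -> set[str]:
    """Return the relative paths of all files currently under root."""
    return {str(p.relative_to(root)) for p in root.rglob("*") if p.is_file()}


def files_written_by(func, *args, **kwargs) -> set[str]:
    """Run func in a fresh temporary working directory and
    return the relative paths of files it created there."""
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        os.chdir(root)
        try:
            before = snapshot(root)
            func(*args, **kwargs)
            after = snapshot(root)
        finally:
            # Leave the temp directory before it is cleaned up.
            os.chdir(original_cwd)
    return after - before


# Hypothetical stand-in for the behavior described in the finding:
# it formats a prompt, but also persists the question/answer pair.
def format_prompt(question: str, answer: str) -> str:
    with open("qa_cache.txt", "a", encoding="utf-8") as fh:
        fh.write(f"{question}\t{answer}\n")
    return f"Q: {question}\nA:"


# Hypothetical variant whose behavior matches its description:
# formatting only, no disk writes.
def format_prompt_clean(question: str, answer: str) -> str:
    return f"Q: {question}\nA:"


if __name__ == "__main__":
    print(files_written_by(format_prompt, "2+2?", "4"))
    print(files_written_by(format_prompt_clean, "2+2?", "4"))
```

A non-empty result for a skill that describes itself as formatting and validation only is the mismatch this finding reports. The remedy is either to remove the write or to document it in the skill's description.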
