Context-Inappropriate Capability
Medium
- Confidence
- 89% confidence
- Finding
- The skill instructs saving evaluation inputs and results to persistent memory files, although its stated purpose is transient quality evaluation. Those records can contain sensitive user prompts, model answers, and retrieved context chunks, so persisting them creates unnecessary data retention and increases exposure if the memory files are later accessed, indexed, or exfiltrated. Because the RAG context may include proprietary documents, persistence is riskier here than in a generic metrics tool.
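
One possible way to narrow what reaches the memory files is sketched below: raw text is dropped before a record is written, and only the metric scores and a count of context chunks are kept. This is a minimal illustration, not the skill's actual code. The field names (`prompt`, `answer`, `contexts`, `scores`) and the helper `strip_sensitive` are hypothetical, since the finding does not show the skill's record schema.

```python
from typing import Any

# Hypothetical field names; the skill's real record schema is not shown in the finding.
SENSITIVE_FIELDS = {"prompt", "answer", "contexts"}


def strip_sensitive(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of an evaluation record without raw prompt, answer, or context text."""
    kept = {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}
    # Keep only how many context chunks were retrieved, never their contents.
    kept["num_contexts"] = len(record.get("contexts", []))
    return kept


# Example record as an evaluation run might produce it (illustrative values only).
record = {
    "prompt": "What is our Q3 revenue forecast?",
    "answer": "The forecast is ...",
    "contexts": ["internal doc chunk 1", "internal doc chunk 2"],
    "scores": {"faithfulness": 0.91, "relevance": 0.84},
}

safe = strip_sensitive(record)
print(safe)
```

Under these assumptions, only `safe` would be written to persistent memory, and the original `record` would stay in process memory for the duration of the evaluation.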
