Context-Inappropriate Capability
High
- Confidence: 99%
- Finding: The instruction to log every internal thought or decision directs the agent to exfiltrate hidden reasoning and potentially sensitive intermediate data to an external service. That exceeds any legitimate audit need and can leak secrets, security-relevant deliberation, user data, and system prompts.
