Description-Behavior Mismatch
Medium
- Confidence: 94%
- Finding: The workflow explicitly allows updating AGENTS.md or TOOLS.md based on conversational corrections, which extends far beyond passive learning or note-taking. Letting routine user feedback propagate into core instruction files creates an instruction-injection and persistence risk, because untrusted conversation content can alter future agent behavior across sessions.
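
One way to picture the boundary this finding is concerned with is a small gate that treats writes to the core instruction files differently from ordinary note-taking. The sketch below is illustrative only and is not taken from the reviewed workflow: `PROTECTED_FILES`, `route_update`, and the `user_confirmed` flag are hypothetical names, and the policy shown (queue for review unless explicitly confirmed) is an assumption about what a reasonable mitigation might look like.

```python
from pathlib import PurePath

# Hypothetical policy: files that shape future agent behavior across sessions.
# Names are stored case-folded so "agents.md" and "AGENTS.md" match alike.
PROTECTED_FILES = {"agents.md", "tools.md"}


def route_update(target_path: str, user_confirmed: bool = False) -> str:
    """Decide how to handle a file update proposed from conversation content.

    Returns "apply" for ordinary files (for example, notes), and
    "queue_for_review" for protected instruction files unless the user
    has explicitly confirmed the change.
    """
    name = PurePath(target_path).name.casefold()
    if name in PROTECTED_FILES and not user_confirmed:
        # Untrusted conversation text must not silently alter instructions.
        return "queue_for_review"
    return "apply"
```

Under these assumptions, a correction picked up in conversation can still be recorded in a notes file immediately, while a change to AGENTS.md or TOOLS.md only lands after an explicit confirmation step.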
