Intent-Code Divergence
High
- Confidence: 98%
- Finding
- The document claims the skill does not modify behavior, yet the skill explicitly injects system-level instructions that alter the model's response policy for all subsequent tasks. This is dangerous: it presents a prompt-injection-style behavioral override as harmless, which reduces operator awareness and increases the chance that the model deviates from safety-aligned, user-aligned, or platform-aligned behavior.
