Natural-Language Policy Violations
Medium
- Confidence: 94%
- Finding: The README embeds behavioral instructions that steer the agent toward a specific language and presentation style without user opt-in. Skill-authored instructions of this kind can override user intent and reduce reliability. They also open a prompt-injection pathway in which the agent follows the skill's narrative framing instead of system or user priorities.
