Vague Triggers
Medium
- Confidence
- 88% confidence
- Finding
- The ethical verification trigger list is overly broad. It includes common terms such as 'harm', 'illegal', 'hate', and 'lie', which appear in many legitimate contexts, including safety discussions, policy explanations, and academic analysis. Because the skill influences agent behavior, such broad activation can cause unintended ethical gating, refusals, or workflow disruption. It also lets innocuous prompts steer the agent into excessive caution or denial-of-service-like behavior.
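
The false-positive risk can be illustrated with a minimal sketch. The finding does not show how the skill matches its triggers, so the case-insensitive substring check below is an assumption, as are the names `BROAD_TRIGGERS`, `broad_match`, `NARROW_PATTERNS`, and `narrow_match`. Only the four trigger terms come from the finding. The narrower patterns are hypothetical and show one possible direction, not a complete or recommended policy.

```python
import re
from typing import List

# The four terms named in the finding; the real list may contain more.
BROAD_TRIGGERS = ["harm", "illegal", "hate", "lie"]


def broad_match(prompt: str) -> List[str]:
    """Assumed naive behavior: case-insensitive substring matching."""
    lowered = prompt.lower()
    return [term for term in BROAD_TRIGGERS if term in lowered]


# Hypothetical narrower patterns: require a first-person request to act,
# and use word boundaries instead of bare substrings.
NARROW_PATTERNS = [
    re.compile(r"\bhelp me (harm|hurt)\b", re.IGNORECASE),
    re.compile(r"\bhow (do|can) i .*\billegal(ly)?\b", re.IGNORECASE),
]


def narrow_match(prompt: str) -> bool:
    """Return True only when a prompt matches one of the narrower patterns."""
    return any(p.search(prompt) for p in NARROW_PATTERNS)


benign = (
    "Explain why the policy treats hate speech as harmful "
    "and illegal in some countries."
)

print(broad_match(benign))                               # ['harm', 'illegal', 'hate']
print(broad_match("The client relies on cached data."))  # ['lie']
print(narrow_match(benign))                              # False
```

Under these assumptions, a policy explanation trips three triggers, and an unrelated engineering sentence trips 'lie' because it occurs inside "client" and "relies". The narrower patterns ignore both, at the cost of missing phrasings they do not anticipate, so any real fix would need to weigh coverage against false positives.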
