Description-Behavior Mismatch
Medium
- Confidence
- 92% confidence
- Finding
- The test cases extend the bedtime skill beyond scene-first IoT orchestration into media playback, delayed stop/pause, and screen-off behaviors, none of which the stated manifest scope covers. This scope drift is risky because the agent may invoke capabilities and side effects that users did not clearly consent to, particularly unattended playback and delayed changes to device state.
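
One way to surface this kind of mismatch is to compare the capabilities the manifest declares against those the test cases exercise. The sketch below is illustrative only: the capability labels, test-case names, and the `find_scope_drift` helper are all hypothetical, since the finding does not show the actual manifest or test-case format.

```python
# Hypothetical capability labels; the real manifest schema is not shown in this finding.
DECLARED_SCOPE = {"scene_activation"}

# Hypothetical test cases, each mapped to the capabilities it would exercise.
TEST_CASE_CAPABILITIES = {
    "bedtime_scene": {"scene_activation"},
    "sleep_music_with_timer": {"media_playback", "delayed_stop"},
    "tv_off_after_delay": {"screen_off", "delayed_stop"},
}


def find_scope_drift(declared, test_cases):
    """Return, per test case, the capabilities it uses that the manifest does not declare."""
    drift = {}
    for name, used in test_cases.items():
        undeclared = used - declared  # set difference: exercised but not declared
        if undeclared:
            drift[name] = sorted(undeclared)
    return drift


drift = find_scope_drift(DECLARED_SCOPE, TEST_CASE_CAPABILITIES)
for name, caps in drift.items():
    print(f"{name}: undeclared capabilities {caps}")
```

Any test case that appears in the result exercises behavior outside the declared scope, so either the manifest description should be widened to cover it or the test case should be removed.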
