Description-Behavior Mismatch
Medium
- Confidence
- 94% confidence
- Finding
- The skill is presented as a fully automatic research pipeline, but the supporting scripts it depends on are not bundled with it; instead, the agent is instructed to implement those scripts and set up the environment itself. This expands the skill's effective scope from research orchestration to arbitrary code generation and local system modification, which increases the risk of unsafe file writes, insecure API handling, and execution of unreviewed support code.
