Description-Behavior Mismatch
Medium
- Confidence: 95%
- Finding: During experiments, the tool sends the target skill content, user-supplied eval inputs, and generated outputs to external LLM providers. This creates a data-exposure risk whenever skills, prompts, test cases, or outputs contain proprietary, regulated, or secret material. The risk is aggravated because the manifest does not clearly warn that third-party transmission is central to the tool's operation.
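
One way to reduce the exposure described in this finding is to gate transmission on explicit consent and to scan outgoing text for secret-like strings first. The sketch below is illustrative only and is not taken from the tool under review: the names `SECRET_PATTERNS`, `find_secret_like`, and `check_before_send` are hypothetical, the payload field names are assumptions, and the three patterns are a small sample rather than a complete secret scanner.

```python
import re

# Hypothetical, deliberately small set of secret-like patterns.
# A real scanner would need a broader and carefully tuned list.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "bearer_token": re.compile(r"\bBearer\s+[A-Za-z0-9._\-]{20,}"),
}


def find_secret_like(text: str) -> list[str]:
    """Return the names of all patterns that match somewhere in text."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]


def check_before_send(payload: dict[str, str], consent_given: bool) -> dict[str, list[str]]:
    """Refuse to proceed without consent; otherwise report suspicious fields.

    payload maps an assumed field name (for example "skill" or "eval_input")
    to the text that would be sent to an external provider. The return value
    maps each field with at least one match to the matching pattern names.
    """
    if not consent_given:
        raise PermissionError("third-party transmission requires explicit consent")
    findings: dict[str, list[str]] = {}
    for field, text in payload.items():
        hits = find_secret_like(text)
        if hits:
            findings[field] = hits
    return findings
```

A caller could run `check_before_send(payload, consent_given=True)` before each experiment and abort or prompt the user when the returned mapping is non-empty. Pattern matching catches only well-known secret formats, so it complements a clear manifest disclosure and does not replace it.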
