Ssd 1
Medium
- Confidence
- 98% confidence
- Finding
- Untrusted customer input is interpolated directly into the LLM prompt as plain instruction-bearing text, so a customer can embed adversarial content such as role overrides, refusal-bypass text, or hidden instructions. In this skill, that can cause the model to ignore the intended customer-service format and produce manipulated, policy-violating, or misleading outputs, which is especially risky because the tool is explicitly generating business-facing replies.
