Security audit

Ab Test Agent Workflow 1.1.0

Security checks across malware telemetry and agentic risk

Overview

This is a coherent A/B testing workflow skill, with a notable blind-test integrity flaw but no evidence of malicious behavior or hidden data access.

Install only if you want a multi-agent A/B evaluation workflow. Do not use it with confidential prompts unless you are comfortable sharing them with contestant and judge subagents, and avoid exposing anonymizer CLI output to judges because it can reveal which contestant produced each solution.

SkillSpector

By NVIDIA

Vulnerability Patterns

MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (3)

Intent-Code Divergence

Medium

Confidence: 98%
Finding: This is a genuine security and design flaw for a double-blind testing tool: the CLI immediately prints which contestant corresponds to 方案1/方案2 (Solution 1/Solution 2), defeating anonymization for anyone who can read stdout. Blindness is this skill's core control against evaluator bias, so exposing the mapping at generation time undermines the workflow's integrity and can bias or invalidate results.

Intent-Code Divergence

Medium

Confidence: 99%
Finding: This is also a genuine vulnerability: the tool prints the full mapping record immediately after anonymization, even though the code comments say the record is intended for the final report. That contradiction creates an information disclosure path that collapses the double-blind protocol and makes the testing process manipulable or untrustworthy.
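
One way to close the disclosure path described in the two findings above is to keep the label-to-contestant mapping off stdout entirely and persist it to a restricted file that is read only when the final report is produced. The following is a minimal sketch under stated assumptions, not the skill's actual code: the function names `anonymize` and `reveal`, the `mapping_path` parameter, and the English labels "Solution 1"/"Solution 2" are all hypothetical.

```python
from __future__ import annotations

import json
import os
import random
from pathlib import Path


def anonymize(
    solutions: dict[str, str],
    mapping_path: Path,
    rng: random.Random | None = None,
) -> dict[str, str]:
    """Return solutions keyed by blind labels (hypothetical helper).

    The label-to-contestant mapping is written to mapping_path and is
    never printed, so stdout carries nothing a judge could use.
    """
    rng = rng or random.Random()
    contestants = list(solutions)
    rng.shuffle(contestants)  # random order decides who becomes "Solution 1"

    mapping = {
        f"Solution {i}": name for i, name in enumerate(contestants, start=1)
    }

    # Create the file with owner-only permissions where the OS supports it.
    fd = os.open(mapping_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(mapping, fh, ensure_ascii=False, indent=2)

    # Only the blinded view is returned to the caller.
    return {label: solutions[name] for label, name in mapping.items()}


def reveal(mapping_path: Path) -> dict[str, str]:
    """Load the mapping; meant to be called only after the judge's verdict."""
    return json.loads(mapping_path.read_text(encoding="utf-8"))
```

In this sketch the coordinator would pass only the return value of `anonymize` to the judge and call `reveal` when assembling the final report, which matches the intent stated in the code comments the finding refers to.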

Missing User Warnings

Medium

Confidence: 96%
Finding: The skill instructs the coordinator to forward user prompts and generated outputs to multiple subagents and a judge, but it provides no user-facing consent, warning, or data-handling boundary. This can expose sensitive user data, proprietary prompts, or confidential outputs to additional processing contexts beyond what the user reasonably expects, increasing privacy and leakage risk.
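
The missing warning could take the form of an explicit confirmation step before anything is forwarded. The sketch below is illustrative only and assumes a coordinator written in Python; `RECIPIENTS`, `DispatchPlan`, `build_warning`, `plan_dispatch`, and the `confirm` callback are hypothetical names, not part of the skill.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

# Hypothetical recipient list; the real workflow may use different roles.
RECIPIENTS: tuple[str, ...] = ("contestant A", "contestant B", "judge")


@dataclass(frozen=True)
class DispatchPlan:
    """What will be sent, and to whom, once the user has agreed."""
    prompt: str
    recipients: tuple[str, ...]


def build_warning(recipients: tuple[str, ...]) -> str:
    """Compose the user-facing notice listing every receiving context."""
    names = ", ".join(recipients)
    return (
        "Your prompt and all generated outputs will be shared with: "
        f"{names}. Do not continue if they contain confidential data."
    )


def plan_dispatch(
    prompt: str,
    confirm: Callable[[str], bool],
    recipients: tuple[str, ...] = RECIPIENTS,
) -> DispatchPlan:
    """Return a dispatch plan only if the user accepts the warning."""
    if not confirm(build_warning(recipients)):
        raise PermissionError("User declined to share the prompt with subagents.")
    return DispatchPlan(prompt=prompt, recipients=tuple(recipients))
```

Passing the confirmation as a callback keeps the gate independent of the interface: a CLI might wrap `input`, while an agent host might surface the warning in its own approval dialog.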

VirusTotal

67/67 vendors flagged this skill as clean.

Static analysis

No suspicious patterns detected.