Benchmark Tool

Security checks across malware telemetry and agentic risk

Overview

This benchmark skill mostly matches its stated purpose, but it includes an arbitrary file comparison command and unscoped disk/network actions that deserve review before installation.

Review this before installing. Use it only in a scratch directory and only against network hosts you intend to contact. Avoid using the compare command on sensitive files unless you deliberately want their differences shown to the agent or conversation.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (6)

Lp3

Medium
Category
MCP Least Privilege
Confidence
88% confidence
Finding
The skill exposes shell-based commands in SKILL.md but does not declare any permissions, which undermines informed consent and security review. Even if the benchmark behavior is expected, shell execution can access local files, invoke network operations, and affect system state, so omitting permissions increases the risk of unsafe deployment or misuse.

Tp4

High
Category
MCP Tool Poisoning
Confidence
80% confidence
Finding
The advertised purpose is system benchmarking, but the presence of a compare command for arbitrary files expands the skill beyond that scope. This mismatch can hide additional file-reading capability from users and reviewers, enabling unintended access to sensitive local data or use of the skill as a generic file inspection tool.

Description-Behavior Mismatch

Medium
Confidence
91% confidence
Finding
The `compare` command exposes a generic arbitrary file-diff capability that is unrelated to the stated benchmarking purpose. In an agent/tooling context, this broadens the skill's authority and can be abused to inspect and compare sensitive local files, increasing data exposure risk beyond expected benchmark operations.

Context-Inappropriate Capability

Medium
Confidence
94% confidence
Finding
The code allows `diff $2 $3` on arbitrary user-supplied paths without any scope restriction, which is not justified by the tool's benchmark-only description. In agent environments, unnecessary file access primitives are dangerous because they can be repurposed to read or infer contents of sensitive files through tool outputs.
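One possible mitigation, sketched here under stated assumptions, is to confine `compare` to a single results directory and quote both paths. The names `BENCH_RESULTS_DIR`, `resolve_path`, `in_scope`, and `safe_compare` are hypothetical and do not appear in the scanned skill; the scan does not say how the author would scope the command.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: restrict `compare` to regular files inside one directory.
# BENCH_RESULTS_DIR and all function names are illustrative assumptions.

BENCH_RESULTS_DIR="${BENCH_RESULTS_DIR:-$PWD/bench-results}"

# Print the absolute path of a file, with its parent directory fully resolved.
resolve_path() {
  local dir base
  dir=$(cd -- "$(dirname -- "$1")" >/dev/null 2>&1 && pwd -P) || return 1
  base=$(basename -- "$1")
  printf '%s/%s\n' "$dir" "$base"
}

# Succeed only for regular, non-symlink files located under BENCH_RESULTS_DIR.
in_scope() {
  local root path
  root=$(cd -- "$BENCH_RESULTS_DIR" >/dev/null 2>&1 && pwd -P) || return 1
  [ -f "$1" ] && [ ! -L "$1" ] || return 1
  path=$(resolve_path "$1") || return 1
  case "$path" in
    "$root"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Diff two files, refusing (exit status 2) if either one is out of scope.
safe_compare() {
  local f
  for f in "$1" "$2"; do
    if ! in_scope "$f"; then
      echo "compare: refusing path outside $BENCH_RESULTS_DIR: $f" >&2
      return 2
    fi
  done
  diff -- "$1" "$2"
}
```

With this shape, a call such as `safe_compare results/run1.txt /etc/passwd` is rejected before `diff` runs, so nothing from the out-of-scope file reaches the tool output.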

Missing User Warnings

Medium
Confidence
88% confidence
Finding
The disk benchmark writes a 100MB test file and then deletes it, but the script does not provide any user-facing warning that it will modify the filesystem. In a security-sensitive or production environment, undisclosed write/delete behavior can cause operational surprises, consume space, and interact badly with sensitive or unexpected target directories.
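A minimal sketch of what a disclosed disk test might look like follows. It assumes an opt-in variable `BENCH_CONFIRM` and a function `disk_bench`, both hypothetical; the scanned script's actual command layout is not shown in this report.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: announce the write/delete behavior before touching the disk.
# disk_bench, BENCH_CONFIRM, and the argument order are illustrative assumptions.

disk_bench() {
  local target_dir="${1:-.}" size_mb="${2:-100}" testfile rc

  [ -d "$target_dir" ] || { echo "disk: not a directory: $target_dir" >&2; return 2; }
  case "$size_mb" in
    ''|*[!0-9]*) echo "disk: size must be a whole number of MB" >&2; return 2 ;;
  esac

  # Tell the user exactly what will change on the filesystem.
  echo "disk: will write a ${size_mb}MB temporary file in $target_dir and delete it afterwards" >&2
  if [ "${BENCH_CONFIRM:-}" != "yes" ]; then
    echo "disk: set BENCH_CONFIRM=yes to proceed" >&2
    return 3
  fi

  testfile=$(mktemp "$target_dir/bench.XXXXXX") || return 2
  # Write size_mb blocks of 1 MiB; dd reports its own summary on stderr.
  dd if=/dev/zero of="$testfile" bs=1048576 count="$size_mb"
  rc=$?
  rm -f -- "$testfile"
  return "$rc"
}
```

Requiring an explicit opt-in is one design choice among several; a printed warning alone would also address the finding, at the cost of not stopping an unattended agent run.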

Missing User Warnings

Medium
Confidence
86% confidence
Finding
The network benchmark makes an outbound `curl` request to a user-supplied host without clearly disclosing that it will contact external systems. In restricted environments, this can violate policy, leak metadata such as DNS queries and source IP, or be used to probe internal endpoints under the guise of benchmarking.
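As an illustration only, the network test could disclose the outbound request and check the host against an explicit allowlist first. `net_bench` and `BENCH_ALLOWED_HOSTS` are hypothetical names, and the use of HTTPS and a 10-second timeout are assumptions rather than details taken from the scanned script.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: contact only hosts the user has explicitly allowed.
# net_bench and BENCH_ALLOWED_HOSTS (space-separated hostnames) are assumptions.

net_bench() {
  local host="$1" allowed
  # Intentionally unquoted so the space-separated list splits into words.
  for allowed in ${BENCH_ALLOWED_HOSTS:-}; do
    if [ "$host" = "$allowed" ]; then
      echo "network: sending one HTTPS request to $host (the DNS query and source IP are visible externally)" >&2
      # Discard the body and print only the total request time in seconds.
      curl -o /dev/null -s -w '%{time_total}\n' --max-time 10 "https://$host/"
      return $?
    fi
  done
  echo "network: refusing to contact $host; add it to BENCH_ALLOWED_HOSTS to allow" >&2
  return 2
}
```

For example, with `BENCH_ALLOWED_HOSTS="example.com"`, a request for any other host returns status 2 without running `curl`, which limits the use of the benchmark as a probe for internal endpoints.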

VirusTotal

63/63 vendors flagged this skill as clean.
