Dataset Search

Security checks across malware telemetry and agentic risk

Overview

This skill is a dataset discovery and download helper with disclosed network access and local downloads, and no evidence of hidden or malicious behavior.

Install if you want a dataset search/acquisition helper that contacts public data services. Use --offline for private queries, inspect the dry-run plan before adding --yes, and avoid raw --url downloads or result files from untrusted sources, especially when local Kaggle or Hugging Face credentials are configured.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (5)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
def run_command(command: str) -> int:
    proc = subprocess.run(shlex.split(command), text=True)
    return proc.returncode
Confidence
98% confidence
Finding
proc = subprocess.run(shlex.split(command), text=True)
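
One way to narrow this finding is to check the executable against an allowlist before the command reaches subprocess.run. The sketch below is illustrative only: ALLOWED_EXECUTABLES and parse_allowed are hypothetical names, and the two entries are assumed from the report's mention of Kaggle and Hugging Face, since the executables the skill actually invokes are not listed here.

```python
import shlex
import subprocess
from typing import List

# Hypothetical allowlist: assumed from the report's mention of Kaggle and
# Hugging Face; the skill's real set of CLIs is not given.
ALLOWED_EXECUTABLES = {"kaggle", "huggingface-cli"}


def parse_allowed(command: str) -> List[str]:
    """Split the command and reject it unless argv[0] is on the allowlist."""
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED_EXECUTABLES:
        raise ValueError(f"executable not allowed: {command!r}")
    return argv


def run_command(command: str) -> int:
    """Same shape as the flagged function, with the allowlist check added."""
    argv = parse_allowed(command)
    # No shell=True: arguments are passed as a list, so shell metacharacters
    # in the command string are not interpreted.
    proc = subprocess.run(argv, text=True, check=False)
    return proc.returncode
```

The check only constrains which program runs, not its arguments, so it reduces the finding's scope rather than eliminating it.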

Description-Behavior Mismatch

Medium
Confidence
97% confidence
Finding
The skill goes beyond discovery and comparison by downloading remote content and invoking external CLIs that write into local storage. That expanded capability increases risk because a user or downstream agent may treat it as a harmless search utility while it actually performs state-changing operations on the system and network.

Context-Inappropriate Capability

High
Confidence
99% confidence
Finding
The code accepts arbitrary --url input and will fetch any HTTP(S) resource to local disk without restricting domains, content types, or size. This creates a generic downloader/SSRF primitive that can be abused to access internal services, retrieve non-dataset content, or stage unexpected files on the host under the guise of a dataset tool.
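
The missing restrictions named in this finding (domains and size) could look roughly like the following. This is a minimal sketch under stated assumptions: ALLOWED_HOSTS, MAX_BYTES, and all three function names are hypothetical, and the host list and size cap are placeholders rather than values taken from the skill.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlparse

# Hypothetical values: the report does not list the skill's real hosts or limits.
ALLOWED_HOSTS = {"huggingface.co", "www.kaggle.com"}
MAX_BYTES = 2 * 1024**3  # 2 GiB cap, chosen only for illustration


def is_internal_literal(host: str) -> bool:
    """True if host is an IP literal in a loopback, private, or link-local range."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name, not an IP literal
    return addr.is_loopback or addr.is_private or addr.is_link_local


def is_allowed_url(url: str) -> bool:
    """Accept only HTTPS URLs whose host is on the allowlist."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if parts.scheme != "https" or not host:
        return False
    if is_internal_literal(host):
        return False
    return host in ALLOWED_HOSTS


def size_within_limit(content_length: Optional[str]) -> bool:
    """Check a Content-Length header value; reject if absent or malformed."""
    if content_length is None or not content_length.isdigit():
        return False
    return int(content_length) <= MAX_BYTES
```

A string-level check like this does not cover redirects or DNS names that resolve to internal addresses; a fuller mitigation would also validate the resolved address and every redirect hop.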

Vague Triggers

Medium
Confidence
84% confidence
Finding
The activation text is very broad and can match many generic data, analysis, ML, BI, or research requests, causing the skill to be invoked in contexts where dataset acquisition is unnecessary or inappropriate. Over-broad triggering increases the chance of needless network queries, file outputs, or use of external data sources, which expands exposure and can bypass user intent or surprise higher-level workflows.

Missing User Warnings

Medium
Confidence
95% confidence
Finding
The download command performs network access, local file writes, and subprocess execution, but the tool does not provide a strong explicit warning describing those side effects at execution time. In an agent setting, hidden side effects are risky because users may authorize a search-like action without realizing it can modify the host or contact remote systems.
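
An explicit execution-time warning of the kind this finding asks for might be sketched as below. The function names and message wording are hypothetical; only the --yes flag (modeled here as assume_yes) comes from the report, and how the skill actually wires it up is not shown there.

```python
from typing import Callable, List, Optional


def describe_side_effects(url: Optional[str], dest: str, commands: List[str]) -> str:
    """Build a plain-text summary of what the download step will do."""
    lines = ["This action has side effects:"]
    if url:
        lines.append(f"  network: download {url}")
    lines.append(f"  disk: write files under {dest}")
    for cmd in commands:
        lines.append(f"  subprocess: run `{cmd}`")
    return "\n".join(lines)


def confirm(plan: str, assume_yes: bool = False,
            ask: Callable[[str], str] = input) -> bool:
    """Always print the plan; proceed only on --yes or an explicit 'y'."""
    print(plan)
    if assume_yes:
        return True
    return ask("Proceed? [y/N] ").strip().lower() in {"y", "yes"}
```

Printing the plan even when assume_yes is set keeps the side effects visible in agent logs, where no human answers the prompt.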

VirusTotal

65/65 vendors flagged this skill as clean.
