Dataset Search

Security checks across malware telemetry and agentic risk

Overview

This skill is a dataset discovery and download helper with disclosed network access and local downloads, and no evidence of hidden or malicious behavior.

Install if you want a dataset search/acquisition helper that contacts public data services. Use --offline for private queries, inspect the dry-run plan before adding --yes, and avoid raw --url downloads or result files from untrusted sources, especially when local Kaggle or Hugging Face credentials are configured.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (5)

subprocess module call

Medium

Category: Dangerous Code Execution
Content:
    def run_command(command: str) -> int:
        proc = subprocess.run(shlex.split(command), text=True)
        return proc.returncode
Confidence: 98%
Finding: proc = subprocess.run(shlex.split(command), text=True)
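One common way to narrow a finding like this is to check the executable against an allowlist before running anything. The following is a minimal sketch of that idea, not the skill's actual code: `ALLOWED_EXECUTABLES`, its entries, and `parse_allowed_command` are hypothetical names introduced for illustration. Only `run_command` mirrors the flagged function.

```python
import shlex
import subprocess

# Hypothetical allowlist of CLIs the helper may invoke; the entries are assumptions.
ALLOWED_EXECUTABLES = {"kaggle", "huggingface-cli"}


def parse_allowed_command(command: str, allowed=ALLOWED_EXECUTABLES) -> list[str]:
    """Split a command string and reject it unless argv[0] is allowlisted."""
    argv = shlex.split(command)
    if not argv or argv[0] not in allowed:
        raise ValueError(f"executable not allowed: {argv[:1]}")
    return argv


def run_command(command: str, allowed=ALLOWED_EXECUTABLES) -> int:
    """Run an allowlisted command without a shell and return its exit code."""
    argv = parse_allowed_command(command, allowed)
    # Passing a list (no shell=True) means shell metacharacters are not interpreted.
    proc = subprocess.run(argv, text=True, check=False)
    return proc.returncode
```

The check only constrains which program starts. The arguments still pass through unchanged, so a real deployment would likely validate those as well.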

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: The skill goes beyond discovery and comparison by downloading remote content and invoking external CLIs that write into local storage. That expanded capability increases risk because a user or downstream agent may treat it as a harmless search utility while it actually performs state-changing operations on the system and network.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The code accepts arbitrary --url input and will fetch any HTTP(S) resource to local disk without restricting domains, content types, or size. This creates a generic downloader/SSRF primitive that can be abused to access internal services, retrieve non-dataset content, or stage unexpected files on the host under the guise of a dataset tool.
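The domain restriction this finding says is missing could look roughly like the sketch below. It is an illustration under stated assumptions, not the skill's code: `ALLOWED_HOSTS`, its entries, and both function names are hypothetical. It accepts only HTTPS URLs whose host is an allowlisted domain or a subdomain of one, and rejects IP literals.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical allowlist; which hosts a deployment trusts is an assumption.
ALLOWED_HOSTS = {"huggingface.co", "kaggle.com"}


def is_ip_literal(host: str) -> bool:
    """Return True if host is an IPv4/IPv6 literal rather than a DNS name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def is_allowed_url(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Accept only HTTPS URLs on an allowlisted domain or one of its subdomains."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme != "https" or not host or is_ip_literal(host):
        return False
    # The leading dot prevents "evilhuggingface.co" from matching "huggingface.co".
    return any(host == h or host.endswith("." + h) for h in allowed_hosts)
```

This covers only the URL string. Redirects, DNS answers that resolve to internal addresses, content types, and response size would each need a separate check at fetch time.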

Vague Triggers

Medium

Confidence: 84%
Finding: The activation text is very broad and can match many generic data, analysis, ML, BI, or research requests, causing the skill to be invoked in contexts where dataset acquisition is unnecessary or inappropriate. Over-broad triggering increases the chance of needless network queries, file outputs, or use of external data sources, which expands exposure and can bypass user intent or surprise higher-level workflows.

Missing User Warnings

Medium

Confidence: 95%
Finding: The download command performs network access, local file writes, and subprocess execution, but the tool does not provide a strong explicit warning describing those side effects at execution time. In an agent setting, hidden side effects are risky because users may authorize a search-like action without realizing it can modify the host or contact remote systems.
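An execution-time warning of the kind this finding asks for might be sketched as follows. `DownloadPlan`, `side_effect_warning`, and `confirm` are hypothetical names, and the `assume_yes` parameter only loosely models the `--yes` flag mentioned in the overview. How the skill structures its plan is not shown in this report.

```python
from dataclasses import dataclass


@dataclass
class DownloadPlan:
    """Hypothetical summary of what a download would do."""
    url: str
    dest: str
    command: str


def side_effect_warning(plan: DownloadPlan) -> str:
    """Build an explicit description of the side effects of a download."""
    return (
        "This action has side effects:\n"
        f"  - network access: {plan.url}\n"
        f"  - local file writes: {plan.dest}\n"
        f"  - subprocess execution: {plan.command}"
    )


def confirm(plan: DownloadPlan, assume_yes: bool = False, ask=input, out=print) -> bool:
    """Always show the warning; prompt unless the caller pre-approved the plan."""
    out(side_effect_warning(plan))
    if assume_yes:
        return True
    return ask("Proceed? [y/N] ").strip().lower() in {"y", "yes"}
```

Printing the warning even when `assume_yes` is set leaves a record in agent logs of what was authorized. Injecting `ask` and `out` keeps the function testable without a terminal.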

VirusTotal

65/65 vendors flagged this skill as clean.
