Datasets

Security checks across malware telemetry and agentic risk

Overview

This skill is a local activity logger presented as a dataset tool, so users could unknowingly save sensitive dataset-related text without getting the advertised browsing or loading features.

Install only if you want a local dataset-activity note tracker, not a real dataset browser or loader. Do not enter credentials, tokens, proprietary queries, or sensitive data; anything typed into the commands may be stored under ~/.local/share/datasets and later shown by search, recent, status, or export.
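If the skill has already been used, it may be worth checking what it stored. The following is a minimal sketch, assuming the logs are plain-text files under ~/.local/share/datasets as the overview states. The `scan_logs` helper and the patterns in `SECRET_PATTERNS` are hypothetical illustrations, not part of the skill or of the scanner.

```python
import re
from pathlib import Path

# Hypothetical patterns for secret-looking strings; extend them for your environment.
SECRET_PATTERNS = [
    re.compile(r"(?i)\b(?:api[_-]?key|token|password|secret)\b\s*[=:]\s*\S+"),
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),  # general shape of an AWS access key ID
]


def scan_logs(root: Path) -> list[tuple[Path, int, str]]:
    """Return (file, line number, line) for each stored line that looks like a secret."""
    hits: list[tuple[Path, int, str]] = []
    if not root.is_dir():
        return hits  # nothing stored, or the directory layout differs
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Assumes text logs; undecodable bytes are replaced rather than raising.
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if any(p.search(line) for p in SECRET_PATTERNS):
                hits.append((path, lineno, line.strip()))
    return hits
```

Calling `scan_logs(Path.home() / ".local/share/datasets")` returns the suspicious lines for manual review. An empty result does not prove the logs are clean, since the patterns only cover a few common shapes.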

SkillSpector

By NVIDIA
Vulnerability Patterns
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Findings (8)

Tp4

High
Category
MCP Tool Poisoning
Confidence
98% confidence
Finding
The skill is presented as a dataset browsing/loading tool, but the documented behavior is primarily a local logging and history system that stores arbitrary user-provided inputs and exposes them via search, recent, status, and export commands. This mismatch is dangerous because users may provide sensitive dataset names, queries, paths, or processing details expecting data operations, while the skill instead persists and re-exposes that information without clear consent or minimization.

Description-Behavior Mismatch

Medium
Confidence
94% confidence
Finding
The manifest and description claim dataset browsing/loading/manipulation, but the documentation defines a local activity tracker instead. This is a security-relevant integrity issue because operators may authorize or supply sensitive data under false assumptions about functionality, increasing the likelihood of inadvertent data retention and exposure.

Intent-Code Divergence

Medium
Confidence
95% confidence
Finding
The command names imply real data-processing actions such as ingest, transform, query, and validate, but the descriptions say they only record entries or show prior ones. In this context, that misleading interface can cause users to paste real queries, file paths, schema details, or other sensitive operational data that gets logged instead of processed, creating avoidable confidentiality risk.

Description-Behavior Mismatch

High
Confidence
96% confidence
Finding
The skill advertises dataset browsing/loading capabilities, but the implementation only provides local bookkeeping and export/search over log files. This is dangerous because users may supply dataset identifiers, prompts, paths, tokens, or other sensitive inputs believing they are interacting with a real data tool, while the script silently stores those values on disk instead of performing the promised function.

Description-Behavior Mismatch

High
Confidence
98% confidence
Finding
The named operations such as ingest, query, filter, aggregate, visualize, schema, validate, pipeline, and profile imply meaningful dataset processing, but each branch just appends the raw user input to a file and echoes it back. In an agent-skill context, this mismatch can cause accidental disclosure of sensitive data entered for processing and can mislead downstream automation into thinking real transformations occurred.
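The pattern this finding describes can be illustrated roughly as follows. This is a sketch only, not the skill's actual code: the `run_command` function, the one-file-per-command layout, and the `timestamp|input` line format are all assumptions made for illustration. Only the command names come from the finding.

```python
from datetime import datetime, timezone
from pathlib import Path

# Command names taken from the finding; the real script's dispatch may differ.
COMMANDS = {
    "ingest", "query", "filter", "aggregate", "visualize",
    "schema", "validate", "pipeline", "profile",
}


def run_command(command: str, user_input: str, data_dir: Path) -> str:
    """Mimic the reported behavior: log the raw input, echo it, process nothing."""
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command}")
    data_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    log_file = data_dir / f"{command}.log"  # hypothetical file layout
    with log_file.open("a", encoding="utf-8") as fh:
        fh.write(f"{stamp}|{user_input}\n")  # raw input persisted verbatim
    return f"[{command}] saved: {user_input}"  # no dataset operation takes place
```

In this sketch a call such as `run_command("query", "SELECT * FROM users", data_dir)` runs no query at all. It only writes the text to disk, which is why an agent or user who reads the echoed line as a success message would be misled.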

Intent-Code Divergence

Medium
Confidence
90% confidence
Finding
The branding and documentation present the script as a data toolkit, but the actual behavior is only local command logging. Deceptive or materially inaccurate documentation is a security concern here because it increases the likelihood that users will trust the tool with sensitive dataset-related input under false assumptions.

Missing User Warnings

Medium
Confidence
97% confidence
Finding
The documentation mentions local logging but does not clearly warn that command inputs themselves may be persisted verbatim and included in exports. Because the skill context involves dataset operations, users are especially likely to enter sensitive filenames, queries, identifiers, or pipeline details, which can then be exposed through local files and export features.

Missing User Warnings

Medium
Confidence
97% confidence
Finding
The script persistently writes raw user-provided input into log files under the user's home directory without consent, warning, redaction, or retention controls. In this skill's context, inputs may include dataset paths, access tokens, proprietary queries, schema details, or snippets of sensitive training data, making silent persistence a meaningful confidentiality risk.
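One way the missing controls named here (redaction and retention) could look is sketched below. This is a hedged illustration, not a patch for the skill: `redact`, `append_with_retention`, the `SECRET` pattern, and the `MAX_LINES` cap are hypothetical names and values, and pattern-based redaction will miss secrets that do not follow a `key=value` shape.

```python
import re
from pathlib import Path

# Hypothetical pattern: a secret-like key, a separator, then the value to hide.
SECRET = re.compile(r"(?i)\b(api[_-]?key|token|password|secret)\b(\s*[=:]\s*)\S+")
MAX_LINES = 1000  # hypothetical retention cap per log file


def redact(text: str) -> str:
    """Replace the value of secret-looking key/value pairs with a placeholder."""
    return SECRET.sub(r"\1\2[REDACTED]", text)


def append_with_retention(log_file: Path, entry: str, max_lines: int = MAX_LINES) -> None:
    """Append a redacted entry and keep only the newest max_lines lines."""
    log_file.parent.mkdir(parents=True, exist_ok=True)
    if log_file.exists():
        lines = log_file.read_text(encoding="utf-8").splitlines()
    else:
        lines = []
    lines.append(redact(entry))
    log_file.write_text("\n".join(lines[-max_lines:]) + "\n", encoding="utf-8")
```

For example, `redact("load --token=abc123 data.csv")` yields `load --token=[REDACTED] data.csv`. Rewriting the whole file on each append keeps the sketch short; a real tool would more likely rotate files and ask for consent before storing anything.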

VirusTotal

64/64 vendors flagged this skill as clean.
