Datasets

Security checks across malware telemetry and agentic risk

Overview

This skill is a local activity logger presented as a dataset tool, so users could unknowingly save sensitive dataset-related text without getting the advertised browsing or loading features.

Install only if you want a local dataset-activity note tracker, not a real dataset browser or loader. Do not enter credentials, tokens, proprietary queries, or sensitive data; anything typed into the commands may be stored under ~/.local/share/datasets and later shown by search, recent, status, or export.
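To check what has already been saved, a short script can search the storage directory for text you remember typing. The sketch below is a minimal example and assumes the entries are plain-text files somewhere under ~/.local/share/datasets; the function name search_stored is hypothetical and is not part of the skill.

```python
from pathlib import Path

# Directory named in the report; the file layout inside it is an assumption.
DEFAULT_ROOT = Path.home() / ".local/share/datasets"


def search_stored(needle: str, root: Path = DEFAULT_ROOT):
    """Return (file, line number, line) for every stored line containing needle."""
    hits = []
    if not root.is_dir():
        return hits  # nothing has been stored yet
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: skip it rather than abort the audit
        for number, line in enumerate(text.splitlines(), start=1):
            if needle.lower() in line.lower():  # case-insensitive match
                hits.append((path, number, line))
    return hits


if __name__ == "__main__":
    for path, number, line in search_stored("token"):
        print(f"{path}:{number}: {line}")
```

Running it with a fragment of a query, path, or token you may have entered shows whether that text is still on disk and in which file.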

SkillSpector

By NVIDIA

Vulnerability Patterns

MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (8)

Tp4

High

Category: MCP Tool Poisoning
Confidence: 98%
Finding: The skill is presented as a dataset browsing/loading tool, but the documented behavior is primarily a local logging and history system that stores arbitrary user-provided inputs and exposes them via search, recent, status, and export commands. This mismatch is dangerous because users may provide sensitive dataset names, queries, paths, or processing details expecting data operations, while the skill instead persists and re-exposes that information without clear consent or minimization.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The manifest and description claim dataset browsing/loading/manipulation, but the documentation defines a local activity tracker instead. This is a security-relevant integrity issue because operators may authorize or supply sensitive data under false assumptions about functionality, increasing the likelihood of inadvertent data retention and exposure.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The command names imply real data-processing actions such as ingest, transform, query, and validate, but the descriptions say they only record entries or show prior ones. In this context, that misleading interface can cause users to paste real queries, file paths, schema details, or other sensitive operational data that gets logged instead of processed, creating avoidable confidentiality risk.

Description-Behavior Mismatch

High

Confidence: 96%
Finding: The skill advertises dataset browsing/loading capabilities, but the implementation only provides local bookkeeping and export/search over log files. This is dangerous because users may supply dataset identifiers, prompts, paths, tokens, or other sensitive inputs believing they are interacting with a real data tool, while the script silently stores those values on disk instead of performing the promised function.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The named operations such as ingest, query, filter, aggregate, visualize, schema, validate, pipeline, and profile imply meaningful dataset processing, but each branch just appends the raw user input to a file and echoes it back. In an agent-skill context, this mismatch can cause accidental disclosure of sensitive data entered for processing and can mislead downstream automation into thinking real transformations occurred.
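The pattern this finding describes can be illustrated with a short sketch. It is a hypothetical reconstruction, not the skill's actual code: the command names come from the finding, while the handle function, the per-command log file names, and the timestamp format are assumptions made for the example.

```python
from datetime import datetime, timezone
from pathlib import Path

# Command names taken from the finding; everything else is an assumed illustration.
COMMANDS = ("ingest", "query", "filter", "aggregate", "visualize",
            "schema", "validate", "pipeline", "profile")


def handle(command: str, user_input: str, log_dir: Path) -> str:
    """Append the raw input to a per-command log and echo it back.

    No dataset is opened, parsed, or transformed at any point.
    """
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command}")
    log_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    # The input is written verbatim, so secrets in it are persisted as typed.
    with open(log_dir / f"{command}.log", "a", encoding="utf-8") as fh:
        fh.write(f"{stamp}|{user_input}\n")
    return f"[{command}] saved: {user_input}"
```

In a handler shaped like this, a call such as handle("query", "SELECT ... WHERE api_key='...'", log_dir) returns a success-looking message, which is how downstream automation could conclude that a query ran when the text was only written to disk.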

Intent-Code Divergence

Medium

Confidence: 90%
Finding: The branding and documentation present the script as a data toolkit, but the actual behavior is only local command logging. Deceptive or materially inaccurate documentation is a security concern here because it increases the likelihood that users will trust the tool with sensitive dataset-related input under false assumptions.

Missing User Warnings

Medium

Confidence: 97%
Finding: The documentation mentions local logging but does not clearly warn that command inputs themselves may be persisted verbatim and included in exports. Because the skill context involves dataset operations, users are especially likely to enter sensitive filenames, queries, identifiers, or pipeline details, which can then be exposed through local files and export features.

Missing User Warnings

Medium

Confidence: 97%
Finding: The script persistently writes raw user-provided input into log files under the user's home directory without consent, warning, redaction, or retention controls. In this skill's context, inputs may include dataset paths, access tokens, proprietary queries, schema details, or snippets of sensitive training data, making silent persistence a meaningful confidentiality risk.
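As an illustration of the redaction control this finding says is absent, the sketch below masks credential-like values before a line would be written to a log. The pattern list and the redact function are hypothetical and deliberately narrow; they catch only key=value or key: value pairs whose key contains a credential-like word, and they would miss bare tokens, secrets embedded in URLs, and sensitive data that is not a credential.

```python
import re

# Hypothetical rule set; real redaction needs broader, tested patterns.
_SECRET_PATTERNS = [
    # A credential-like key, a '=' or ':' separator, then the value up to whitespace.
    re.compile(r"(?i)(token|password|passwd|secret|api[_-]?key)(\s*[=:]\s*)(\S+)"),
]


def redact(text: str) -> str:
    """Replace the value part of credential-like pairs with a fixed marker."""
    for pattern in _SECRET_PATTERNS:
        # Keep the key and separator so the log stays readable; drop the value.
        text = pattern.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)
    return text
```

Redaction of this kind reduces exposure but does not replace the missing consent, warning, and retention controls, since paths, queries, and schema details would still be stored verbatim.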

VirusTotal

All 64 of 64 vendors rated this skill as clean.
