Bigdata

Security checks across malware telemetry and agentic risk

Overview

This skill is a local plaintext workflow logger with an overstated big-data description, but no hidden network access, credential use, destructive behavior, or autonomous execution was found.

Install only if you want a local plaintext log of data workflow notes. Avoid entering secrets, personal data, proprietary SQL, sensitive schema details, or confidential dataset contents, and do not rely on this artifact for real big-data processing features such as splitting large files or parallel analysis.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (9)

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The manifest advertises splitting, parallel processing, and batch analysis, while the documented functionality only records user-supplied entries to local logs. Users may disclose sensitive data or rely on the skill for processing tasks it does not perform, leading to unintended data retention and unsafe operational assumptions.

Intent-Code Divergence

Medium

Confidence: 92%
Finding: Commands such as ingest, transform, query, filter, and aggregate are labeled as if they execute real data operations, but the documentation says they merely log those actions. This can cause users to paste real queries, dataset identifiers, or sensitive workflow details into a persistent plaintext store under the false impression that the tool is processing data rather than archiving inputs.

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The help text and advertised functionality describe a bulk-data processing toolkit, but the implementation shown does not provide those capabilities and instead centers on local logging and simple file display/export. This mismatch is dangerous because users may supply sensitive datasets or operational queries under false assumptions about what the tool does, leading to unintended retention and exposure of data.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The core commands only append user input to log files rather than performing ingest, transform, query, filtering, aggregation, or validation. In the context of a skill presented as a big-data utility, this is dangerous because operators may enter proprietary data, credentials, log excerpts, or internal queries that are silently stored and later exposed through other commands.
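The pattern this finding describes can be illustrated with a minimal sketch. It is not the skill's actual code: the function name, the log file naming, and the line format are all assumptions made for illustration. It only shows what a "command" looks like when it records its input instead of processing it.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log_only_command(log_dir: Path, command: str, user_input: str) -> Path:
    """Hypothetical stand-in for a log-only command.

    Appends the raw input to <log_dir>/<command>.log and returns the log path.
    No ingest, transform, query, or validation work is performed.
    """
    log_dir.mkdir(parents=True, exist_ok=True)
    log_file = log_dir / f"{command}.log"
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    # The input is stored verbatim, in plaintext, with a timestamp prefix.
    with log_file.open("a", encoding="utf-8") as fh:
        fh.write(f"{stamp}|{user_input}\n")
    return log_file


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = log_only_command(Path(tmp), "query", "SELECT * FROM customers")
        # The "query" was never executed; it was only written to disk.
        print(path.read_text(encoding="utf-8"), end="")
```

Anything passed as `user_input` ends up on disk unchanged, which is why a real query or credential typed into such a command is retained rather than consumed.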

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The script creates a hidden directory under the user's home directory and writes history there without meaningful consent or necessity tied to the stated purpose. Hidden persistent storage increases the chance that sensitive operational inputs remain on disk unnoticed and are later accessible to other local users, processes, backups, or forensic review.
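One way to gauge the exposure described here is to check whether files in such a directory are readable by other local accounts. The sketch below assumes a POSIX system; the report does not name the directory, so any path passed to `find_exposed_files` (a hypothetical helper) is the reader's own choice.

```python
import stat
from pathlib import Path


def find_exposed_files(directory: Path) -> list[Path]:
    """Return files under `directory` that group or other users can read.

    Relies on POSIX permission bits, so results are not meaningful on Windows.
    """
    exposed = []
    # pathlib's "*" pattern also matches dotfiles and dot-directories.
    for path in sorted(directory.rglob("*")):
        if path.is_file():
            mode = path.stat().st_mode
            if mode & (stat.S_IRGRP | stat.S_IROTH):
                exposed.append(path)
    return exposed
```

For example, `find_exposed_files(Path.home() / ".bigdata")` would list the group-readable or world-readable files under that directory, where `.bigdata` is a placeholder name rather than the path the skill actually uses.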

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The search, recent, and status features are oriented around revealing previously entered content and activity history, not performing the advertised analytics tasks. This expands the exposure surface by making stored sensitive inputs easy to enumerate and review from the command line.

Missing User Warnings

Medium

Confidence: 90%
Finding: The skill description omits an important warning that all provided inputs are persisted in local plaintext log files. In the context of data workflows, users are especially likely to enter sensitive dataset names, SQL queries, schema information, or validation notes, so failing to disclose storage behavior increases the chance of accidental exposure.

Missing User Warnings

Medium

Confidence: 97%
Finding: At this location, user-supplied command input is written directly to a persistent log file, with no indication in the workflow that retention is occurring. This is dangerous because users commonly paste sensitive dataset fragments, file paths, tokens, or business data into such commands, and those values will remain on disk in plaintext.
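Because the input is stored verbatim, a user could screen text for obvious secrets before passing it to any logging command. The following is a rough heuristic sketch, not a complete secret scanner: the pattern set and the `find_likely_secrets` name are assumptions for illustration, and it will miss many real credential formats.

```python
import re

# Illustrative patterns only; real secret scanners use far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "credential_assignment": re.compile(
        r"(?i)\b(password|passwd|secret|token|api[_-]?key)\b\s*[=:]\s*\S+"
    ),
}


def find_likely_secrets(text: str) -> list[str]:
    """Return the names of the patterns that match somewhere in `text`."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]


if __name__ == "__main__":
    print(find_likely_secrets("SELECT count(*) FROM events"))  # []
    print(find_likely_secrets("export API_KEY=abc123"))  # ['credential_assignment']
```

An empty result does not mean the text is safe to log; it only means none of these few patterns matched.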

Ssd 3

High

Confidence: 98%
Finding: The design persistently records user inputs and exposes them later through search, recent, status, and export workflows, creating a straightforward data disclosure channel. In a tool framed as data-processing infrastructure, users are especially likely to enter sensitive operational content, making plaintext retention and broad replay capabilities significantly risky.

VirusTotal

66 of 66 vendors reported this skill as clean.
