Data Labeler

Security checks across malware telemetry and agentic risk

Overview

This looks like a local data logging tool, but it is packaged with misleading Label Studio branding and stores and exports user-entered data locally.

Install only if you want a lightweight local CLI log tracker, not the Label Studio annotation product. Do not enter secrets, customer data, or regulated information unless you are comfortable with it being saved in ~/.local/share/data-labeler and later searchable/exportable; also verify how the data-labeler command is actually installed or invoked.
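One way to act on the advice above is to check where the command resolves and what already sits in the data directory. The following is a minimal sketch, not part of the scanned skill: the helper name `inspect_install` is hypothetical, and the only details taken from this report are the command name `data-labeler` and the path ~/.local/share/data-labeler.

```python
import shutil
from pathlib import Path


def inspect_install(command="data-labeler", data_dir=None):
    """Report where the command resolves and which files exist in the data directory.

    Hypothetical helper for illustration; it only reads metadata, never file contents.
    """
    # Default location is the one named in the overview of this report.
    root = Path(data_dir) if data_dir else Path.home() / ".local" / "share" / "data-labeler"
    # shutil.which returns None when the command is not found on PATH.
    resolved = shutil.which(command)
    files = sorted(p for p in root.rglob("*") if p.is_file()) if root.is_dir() else []
    return {
        "command_path": resolved,
        "data_dir": str(root),
        "files": [(str(p.relative_to(root)), p.stat().st_size) for p in files],
    }


if __name__ == "__main__":
    report = inspect_install()
    print("command resolves to:", report["command_path"])
    print("data directory:", report["data_dir"])
    for name, size in report["files"]:
        print(f"  {name}  ({size} bytes)")
```

If `command_path` is empty, the command is not on PATH and may be invoked some other way (for example through a wrapper script), which is worth confirming before use.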

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (8)

Tp4

High

Category: MCP Tool Poisoning
Confidence: 93%
Finding: The skill presents itself as 'Label Studio', a known annotation product, but the documented behavior is a generic local CLI logger that stores arbitrary inputs, tracks history, supports search, and exports accumulated records. This mismatch can mislead users and downstream agents into invoking a tool under false assumptions, causing unintended collection and persistence of potentially sensitive data and creating a supply-chain style trust problem.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The manifest identifies the skill as a multi-type data labeling/annotation tool, but the body describes a generic activity logger for arbitrary data operations. In an agent ecosystem, identity and description strongly influence trust and invocation decisions, so this kind of deceptive or inaccurate packaging can cause inappropriate use and accidental disclosure of sensitive operational data to local logs.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The advertised skill is for Label Studio annotation, but the implementation exposes a generic local data logging toolkit with unrelated ingest/query/filter/export capabilities. This kind of purpose mismatch is dangerous because it can deceive users into running a tool that collects and stores arbitrary inputs under the cover of a benign annotation utility.

Description-Behavior Mismatch

High

Confidence: 99%
Finding: The command handlers persistently record user inputs across many commands into local log files, which is behavior inconsistent with a labeling tool and indicative of covert collection. Because nearly every action writes raw user input to disk and later re-displays it, sensitive data entered during use may be silently retained and exposed.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The presence of broad data operations such as transform, aggregate, pipeline, and profile is unjustified by the declared annotation-tool context and expands the script far beyond its stated role. In this context, unnecessary capability expansion increases the likelihood that the skill is designed to collect or process arbitrary user data rather than perform annotation-related tasks.

Missing User Warnings

Medium

Confidence: 95%
Finding: The script creates a persistent data directory and history log up front without clearly informing the user that inputs will be stored locally. Silent persistence of user-provided content can lead to unintentional retention of secrets, personal data, or proprietary information entered while using the tool.

Ssd 3

Medium

Confidence: 90%
Finding: The helper logging function writes activity to history and the overall design reuses those stored inputs later, creating a persistent record of user-supplied data. Even if intended for convenience, retaining and re-exposing raw inputs increases the chance that sensitive information will be recovered by other local users, backup systems, or later exports.

Ssd 3

Medium

Confidence: 96%
Finding: The export feature aggregates all collected log contents into JSON, CSV, or TXT files, making bulk disclosure much easier if sensitive data has been entered. Centralizing previously logged raw inputs into portable export files materially increases exposure because the data becomes simpler to copy, share, or exfiltrate.
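Because the findings above describe raw inputs being kept on disk and later bundled into exports, a user who has already run the tool may want to check the stored files for secret-like strings before exporting or sharing anything. The sketch below is an illustration under stated assumptions: the report does not document the log file format, so every file is treated as plain text, and the function name `scan_data_dir` and the three patterns are hypothetical examples rather than a complete rule set.

```python
import re
from pathlib import Path

# Illustrative patterns only (an assumption); real secret scanners use far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "assignment": re.compile(r"(?i)\b(password|passwd|secret|token|api[_-]?key)\s*[=:]\s*\S+"),
}


def scan_data_dir(data_dir):
    """Return (relative path, line number, pattern name) for each suspicious line."""
    root = Path(data_dir)
    hits = []
    if not root.is_dir():
        return hits
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # errors="replace" keeps the scan going if a file is not valid UTF-8.
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for name, pattern in SECRET_PATTERNS.items():
                if pattern.search(line):
                    hits.append((str(path.relative_to(root)), lineno, name))
    return hits


if __name__ == "__main__":
    # Path taken from the overview of this report.
    default_dir = Path.home() / ".local" / "share" / "data-labeler"
    for rel_path, lineno, name in scan_data_dir(default_dir):
        # Print only the location, never the matched text itself.
        print(f"{rel_path}:{lineno}: possible {name}")
```

The script reports locations rather than matched text so that its own output does not become another copy of the sensitive data.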

VirusTotal

64/64 vendors flagged this skill as clean.
