universal-data-analyst-en

Security checks across malware telemetry and agentic risk

Overview

This is a coherent data-analysis skill, but it deserves Review because it can expose dataset-derived prompt contents to external LLMs and run Python or SQL with weak safety controls.

Install only if you are comfortable reviewing generated prompt files and Python scripts before use. Do not use sensitive datasets unless you redact them first, avoid sending raw prompt files to third-party LLMs, use read-only least-privilege database credentials, and run any generated script in an isolated environment or container.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (18)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: if script_path and os.path.exists(script_path): print(f"🚀 执行分析脚本: {script_path}") try: result = subprocess.run( [sys.executable, script_path], capture_output=True, text=True,
Confidence: 97%
Finding: result = subprocess.run( [sys.executable, script_path], capture_output=True, text=True, cwd=str(self.ses

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The cleaning report states that all original rows are retained even though other parts of the validator explicitly recommend DELETE_ROWS actions for duplicates, outliers, and business-rule violations. This creates misleading output that can cause downstream users or automation to make incorrect trust, compliance, or data-governance decisions based on a false understanding of what data was or should be removed.

Intent-Code Divergence

Medium

Confidence: 98%
Finding: The report computes remaining_rows by subtracting estimated deletions, but then hardcodes the retained percentage to 100%, which is objectively incorrect whenever any rows are recommended for deletion. This can mislead operators, audit logs, or automated consumers into believing no reduction occurred, undermining data quality workflows and potentially masking destructive recommendations.
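One way to avoid this mismatch is to derive the retained percentage from the same numbers used for remaining_rows. The sketch below is illustrative only: the function name `retention_summary` and its parameters are hypothetical and do not come from the skill's code.

```python
def retention_summary(total_rows: int, estimated_deletions: int) -> dict:
    """Derive remaining rows and retained percentage from the same inputs.

    Hypothetical helper: the names are assumptions, not the skill's API.
    """
    if total_rows < 0 or estimated_deletions < 0:
        raise ValueError("row counts must be non-negative")
    # Clamp at zero so overlapping deletion estimates cannot go negative.
    remaining_rows = max(total_rows - estimated_deletions, 0)
    # An empty dataset loses nothing, so report it as fully retained.
    if total_rows == 0:
        retained_pct = 100.0
    else:
        retained_pct = 100.0 * remaining_rows / total_rows
    return {
        "total_rows": total_rows,
        "remaining_rows": remaining_rows,
        "retained_pct": round(retained_pct, 2),
    }
```

With this shape, a report that recommends deleting 50 of 1,000 rows would show 95% retained rather than a fixed 100%.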

Intent-Code Divergence

Medium

Confidence: 83%
Finding: The file-level claims promise interruption of downstream steps after critical failures, but run_full_analysis continues through multiple later stages even when intermediate steps fail. This can create unsafe states where later operations act on missing, placeholder, or invalid artifacts, increasing the chance of bad outputs or unintended execution paths.
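A minimal sketch of the promised behavior, assuming each stage can be modeled as a callable with a critical flag. `run_pipeline` and `CriticalStepError` are hypothetical names; the actual structure of run_full_analysis is not shown in this report.

```python
from typing import Callable, List, Tuple


class CriticalStepError(RuntimeError):
    """Raised when a step marked critical fails (hypothetical name)."""


def run_pipeline(steps: List[Tuple[str, Callable[[], object], bool]]) -> List[str]:
    """Run (name, step, critical) tuples in order and return completed names.

    A failing critical step aborts the run so later stages never see
    missing or placeholder artifacts; a failing non-critical step is skipped.
    """
    completed = []
    for name, step, critical in steps:
        try:
            step()
        except Exception as exc:
            if critical:
                raise CriticalStepError(
                    f"critical step {name!r} failed: {exc}"
                ) from exc
            continue  # non-critical failure: skip and move on
        completed.append(name)
    return completed
```

The design choice here is to raise rather than return a status flag, so a caller cannot accidentally ignore a critical failure.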

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The execution step marks the subprocess as successful whenever subprocess.run() returns, but it never checks result.returncode. A failing script can therefore be reported as '执行成功' ("execution succeeded"), causing the orchestrator to trust invalid analysis output and conceal execution errors from the user.
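The check described in this finding can be sketched as follows. The wrapper name `run_script` and the timeout default are assumptions for illustration; only subprocess.run() and its returncode attribute come from the finding.

```python
import subprocess
import sys


def run_script(script_path: str, timeout_s: float = 300.0) -> dict:
    """Run a Python script and report success based on its exit code.

    Hypothetical wrapper: the name and the timeout value are assumptions.
    """
    result = subprocess.run(
        [sys.executable, script_path],
        capture_output=True,
        text=True,
        timeout=timeout_s,  # raises subprocess.TimeoutExpired if exceeded
    )
    return {
        # A non-zero exit code means the script failed, even though
        # subprocess.run() itself returned normally.
        "success": result.returncode == 0,
        "returncode": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }
```

Passing check=True to subprocess.run() is an alternative that raises subprocess.CalledProcessError on a non-zero exit instead of returning a status.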

Missing User Warnings

Medium

Confidence: 95%
Finding: The README explicitly instructs users to send generated prompts and dataset-derived context to an external LLM API, but it does not clearly warn that those prompts may contain sensitive business, personal, or regulated data. Because this skill is specifically designed to ingest arbitrary user data and generate prompts from it, the omission materially increases the risk of unintended data disclosure to third-party model providers.

Vague Triggers

Medium

Confidence: 94%
Finding: The trigger phrases are broad enough to activate on ordinary requests like 'Help me analyze this data' or 'Explore this dataset,' which increases the chance of unintentional invocation. In a skill that can generate and execute Python scripts and create files, accidental triggering materially raises the risk of unexpected code execution or handling of sensitive user data without clear informed consent.

Missing User Warnings

Medium

Confidence: 96%
Finding: The workflow explicitly says the skill will generate executable Python scripts, run them, and produce output artifacts, but the description does not warn users about these actions. This is dangerous because users may not realize the skill is moving from analysis to code execution and filesystem writes, which can create security, integrity, and privacy risks if scripts process untrusted inputs or produce sensitive outputs.

Missing User Warnings

Medium

Confidence: 97%
Finding: Supporting SQL databases via connection string without a privacy and security warning is risky because connection strings may embed secrets and provide access to sensitive production data. In combination with LLM-directed analysis and script generation, this can lead to overbroad queries, unintended data exposure, or accidental persistence of credentials and query results in reports and session artifacts.

Missing User Warnings

Medium

Confidence: 94%
Finding: The example workflow explicitly instructs the user to send generated prompts derived from local data analysis steps to an external LLM, but provides no warning that those prompts may contain sensitive business or personal data. In a data-analysis skill, this omission is meaningful because users may follow the example verbatim and disclose confidential dataset details, schema information, or sampled records to third-party services without realizing the privacy and compliance implications.

Missing User Warnings

Medium

Confidence: 81%
Finding: The SQL path accepts an arbitrary connection string and query, creates a database engine, and executes the supplied SQL without any user-facing disclosure, safety gating, or restriction. In an agent setting, this can enable unintended outbound connections to sensitive databases and exfiltration of queried data, especially if untrusted inputs can reach these parameters.
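As a rough illustration of what safety gating could look like, the sketch below rejects anything other than a single SELECT statement and masks the password before a connection string is logged or stored. Both helper names are hypothetical, and the statement check is deliberately coarse: it is not a substitute for read-only, least-privilege database credentials.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Coarse allow-list: the statement must begin with SELECT.
_SELECT_ONLY = re.compile(r"^\s*select\b", re.IGNORECASE)


def is_single_select(sql: str) -> bool:
    """Return True only for one statement that starts with SELECT.

    Conservative by design: any embedded semicolon is rejected,
    even inside a string literal.
    """
    body = sql.strip().rstrip(";")
    if ";" in body:
        return False
    return bool(_SELECT_ONLY.match(body))


def mask_password(connection_string: str) -> str:
    """Replace the password in a URL-style connection string with ***."""
    parts = urlsplit(connection_string)
    if parts.password is None:
        return connection_string
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username or ''}:***@{host}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
```

Usage, with a made-up connection string: `mask_password("postgresql://analyst:s3cret@db.example.com:5432/sales")` yields a string that is safe to write to session artifacts, and `is_single_select(query)` can be checked before the query is executed.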

Missing User Warnings

Medium

Confidence: 88%
Finding: The skill automatically persists session metadata, ontology results, validation output, and table names to disk without an explicit user consent or warning at the point of use. In a data-analysis context, these artifacts can contain sensitive operational metadata or derived insights that outlive the session and may be accessible to other local users, backups, or later processes.
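Where artifacts must be persisted, one mitigation for exposure to other local users is to create the files with owner-only permissions. This is a minimal sketch for POSIX systems; `write_private_json` is a hypothetical helper, and it does not address backups or the consent question raised above.

```python
import json
import os


def write_private_json(path: str, payload: dict) -> None:
    """Write JSON to a file created with owner-only permissions (0o600).

    Hypothetical helper. The mode applies only when the file is newly
    created; an existing file keeps its current permissions.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    fd = os.open(path, flags, 0o600)
    # os.fdopen takes ownership of the descriptor and closes it on exit.
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, ensure_ascii=False, indent=2)
```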

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill executes an external Python script provided by path without a confirmation step at the point of execution, despite the script likely being LLM-generated or otherwise untrusted. In this context, the absence of an execution-time warning and approval gate materially increases the risk of arbitrary code execution.
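An approval gate of the kind this finding asks for might look like the sketch below. The function name `approve_script` and the prompt wording are assumptions; the `confirm` parameter defaults to the built-in input() and can be swapped out for non-interactive use.

```python
import hashlib
from pathlib import Path
from typing import Callable


def approve_script(script_path: str, confirm: Callable[[str], str] = input) -> bool:
    """Ask for explicit approval before a script is executed.

    Hypothetical gate: shows the size and a SHA-256 prefix so the user
    can tell whether the file changed since they last reviewed it.
    """
    data = Path(script_path).read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    prompt = (
        f"Run {script_path} ({len(data)} bytes, sha256 {digest[:12]})? "
        "Type 'yes' to continue: "
    )
    # Anything other than an explicit 'yes' is treated as a refusal.
    return confirm(prompt).strip().lower() == "yes"
```

The caller would execute the script only when this returns True, which keeps the decision at the point of execution rather than at install time.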

Missing User Warnings

Medium

Confidence: 95%
Finding: The quick-start explicitly tells users to send generated prompt files to an external LLM, but provides no warning that those prompts may embed dataset contents, schema details, samples, or derived sensitive information. In a data-analysis skill, users are likely to operate on proprietary, personal, or regulated data, so this omission can directly lead to unintended third-party data disclosure.

Missing User Warnings

Medium

Confidence: 94%
Finding: The output-file section marks several generated prompt files with a star indicating they should be sent to an LLM, but does not instruct users to inspect the contents for confidential data first. Because these prompts are derived from uploaded datasets and analysis context, they can expose sensitive business, personal, or regulated information when shared with external model providers.
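A pre-send inspection step could be partly automated. The sketch below flags two obvious patterns in a prompt's text; the pattern set and the name `find_sensitive_hints` are illustrative assumptions, and an empty result does not mean the prompt is safe to share, so it complements manual review rather than replacing it.

```python
import re

# Illustrative patterns only; real data will need a broader set.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "long_digit_run": re.compile(r"\b\d{9,}\b"),  # e.g. account or phone numbers
}


def find_sensitive_hints(text: str) -> dict:
    """Count matches for each pattern that appears in the text."""
    hits = {}
    for label, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[label] = len(matches)
    return hits
```

A non-empty result is a signal to redact the prompt file before it goes to any external model provider.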

Vague Triggers

Medium

Confidence: 94%
Finding: The keyword triggers are broad natural-language patterns that can match ordinary conversation and unintentionally auto-invoke this skill. Because the skill supports autonomous analysis planning, script generation, and execution, overbroad triggering increases the attack surface and can cause unexpected handling of user data or unintended code-generation workflows.

Ssd 3

Medium

Confidence: 97%
Finding: The planning prompt includes raw sample rows from user-provided data via df.head(10).to_string() plus detailed per-column characteristics, creating a direct path to send potentially sensitive dataset contents to an external LLM in interactive mode. In a data-analysis skill, user inputs are likely to include proprietary, personal, or regulated data, so prompt construction materially increases disclosure risk.
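One way to keep the structural context while dropping the raw values is to replace each sampled value with a type placeholder before the prompt is built. The sketch below works on plain dictionaries so it stays self-contained; `masked_preview` is a hypothetical helper, and feeding it rows from pandas (for example via `df.head(10).to_dict("records")`) is an assumption about how it would be wired in.

```python
from typing import Any, Dict, List


def masked_preview(rows: List[Dict[str, Any]], limit: int = 10) -> List[Dict[str, str]]:
    """Return up to `limit` rows with every value replaced by its type name.

    Hypothetical helper: column names survive, cell contents do not.
    """
    preview = []
    for row in rows[:limit]:
        preview.append(
            {column: f"<{type(value).__name__}>" for column, value in row.items()}
        )
    return preview
```

Column names can themselves be sensitive, so this reduces exposure rather than eliminating it.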

Ssd 3

Medium

Confidence: 84%
Finding: The ontology prompt generation serializes detailed dataset characteristics, including sample categorical values and structural metadata, for potential transmission to an external LLM. Even without full rows, sample values and field-level summaries can reveal personal, proprietary, or otherwise sensitive contents, especially when combined with table structure and domain hints.

VirusTotal

67/67 vendors flagged this skill as clean.
