Lucid Skill

Security checks across malware telemetry and agentic risk

Overview

This appears to be a legitimate data-analysis skill, but it handles database credentials and caches real data samples in ways users should review carefully.

Install only if you are comfortable letting the skill read connected data sources and cache metadata/sample values locally. Use read-only database accounts, avoid putting real passwords on the command line, set LUCID_DATA_DIR to a controlled directory, and avoid connecting regulated or highly sensitive datasets unless local retention and sample exposure are acceptable.
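One way to act on the LUCID_DATA_DIR advice is to create the directory with owner-only permissions before the skill starts. This is a minimal sketch assuming a POSIX system. The helper name `prepare_data_dir` is hypothetical, and how the skill itself reads LUCID_DATA_DIR is not shown in this report.

```python
import os
import stat
from pathlib import Path


def prepare_data_dir(path: str) -> Path:
    """Create `path` with owner-only permissions and export LUCID_DATA_DIR.

    Hypothetical helper for illustration; not part of the skill.
    """
    data_dir = Path(path).expanduser().resolve()
    data_dir.mkdir(parents=True, exist_ok=True)
    # chmod explicitly, because the mode passed to mkdir is filtered by the umask.
    os.chmod(data_dir, stat.S_IRWXU)  # 0o700: owner read/write/execute only
    os.environ["LUCID_DATA_DIR"] = str(data_dir)
    return data_dir
```

The environment variable set here only affects the current process and its children, so the skill would need to be launched from the same process or shell.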

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (22)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 91%
Finding: The skill declares no permissions, yet its documented behavior clearly includes reading files, writing semantic state under ~/.lucid-skill/, and consuming environment variables such as LUCID_DATA_DIR and embedding settings. This mismatch can bypass least-privilege review and cause users or orchestrators to grant or allow broader filesystem and environment access than is transparently declared, which is especially sensitive for a tool that connects to local files and databases.

Description-Behavior Mismatch

High

Confidence: 99%
Finding: The connector's execute_query method passes arbitrary SQL directly to mysql.connector without enforcing the skill's stated read-only restriction. In this skill context, users are explicitly encouraged to provide database credentials for analysis, so unrestricted SQL can enable destructive or side-effecting statements such as INSERT, UPDATE, DELETE, DROP, ALTER, or execution of dangerous stored procedures against real business databases.

Intent-Code Divergence

Medium

Confidence: 94%
Finding: The comment implies MySQL access is only for direct querying and positions the skill as analysis-oriented, but the implementation elsewhere exposes unrestricted SQL execution. This mismatch is dangerous because operators or users may trust the skill as read-only based on its description while it can still perform side-effecting operations on the connected database.

Description-Behavior Mismatch

High

Confidence: 99%
Finding: The connector exposes a raw execute_query(sql) method that passes attacker- or user-controlled SQL directly to psycopg2 without enforcing the skill's stated read-only restriction. In this skill context, that is especially dangerous because the manifest explicitly promises INSERT/UPDATE/DELETE/DROP are blocked, yet this implementation would allow data modification, destructive statements, privilege abuse, and potentially multi-statement execution depending on driver/server settings.
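A database-side guard can back up any client-side SQL check. The sketch below assumes the connector uses psycopg2, as this finding states, and relies on psycopg2's `connection.set_session(readonly=True)`. The wrapper name `open_read_only` is hypothetical, and `connect` is passed in so the sketch carries no hard dependency.

```python
def open_read_only(connect, **params):
    """Open a connection and switch it to read-only before handing it out.

    `connect` is expected to be psycopg2.connect; `params` are the usual
    connection keywords (host, dbname, user, ...). Hypothetical wrapper.
    """
    conn = connect(**params)
    # With psycopg2, later transactions on this connection are started as
    # READ ONLY, so PostgreSQL rejects writes and DDL issued inside them.
    conn.set_session(readonly=True)
    return conn
```

This is defense in depth rather than a complete boundary, because a caller who can send arbitrary SQL may still try to change the transaction mode. A database account with read-only grants remains the stronger control.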

Description-Behavior Mismatch

High

Confidence: 97%
Finding: The skill promises read-only SQL access, but the engine exposes execute_raw() and run() methods that execute arbitrary SQL without any safety validation. Any caller that reaches these methods can bypass check_sql_safety() and perform INSERT/UPDATE/DELETE/DROP or other side-effecting statements, violating the declared security boundary and potentially altering or destroying connected data.
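One way to close this gap is to route every execution path through a single guard. The sketch below is a deliberately conservative allowlist check and is not the skill's actual `check_sql_safety()` implementation, which this report does not show. The names `is_read_only_sql` and `execute_checked` are hypothetical, and `cursor` is assumed to be any DB-API cursor.

```python
import re

# Statement must begin with SELECT or WITH.
_ALLOWED_START = re.compile(r"^\s*(select|with)\b", re.IGNORECASE)
# Keywords that indicate a write or side effect anywhere in the statement.
_FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant|revoke|call|merge|into)\b",
    re.IGNORECASE,
)


def is_read_only_sql(sql: str) -> bool:
    """Return True only for a single statement that looks read-only.

    Conservative on purpose: a forbidden keyword inside a string literal or
    comment also causes rejection, which avoids comment/quote bypass tricks
    at the cost of some false positives.
    """
    statement = sql.strip().rstrip(";").strip()
    if not statement or ";" in statement:
        return False  # empty input or more than one statement
    if not _ALLOWED_START.match(statement):
        return False
    return _FORBIDDEN.search(statement) is None


def execute_checked(cursor, sql: str):
    """Run `sql` on a DB-API cursor only if it passes the read-only check."""
    if not is_read_only_sql(sql):
        raise PermissionError("only single read-only SELECT statements are allowed")
    cursor.execute(sql)
    return cursor.fetchall()
```

Keyword filtering of this kind cannot catch everything, for example functions with side effects, so it works best alongside a read-only database account.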

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The tool’s stated purpose is schema description and profiling, but it also returns actual table rows by default when an engine is available. In a data-analysis skill that connects to spreadsheets and databases, sample rows can contain sensitive business or personal data, so this creates an unintended data exposure path through a metadata-oriented endpoint.

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The handler returns raw sample rows for every connected table as part of semantic initialization, which exposes actual table contents rather than just metadata needed for schema inference. In a data-analysis skill that may connect to business databases, this can leak sensitive records (PII, financial, operational data) to the host agent without scoping, masking, or explicit user consent.

Context-Inappropriate Capability

Medium

Confidence: 84%
Finding: The update path persists semantic definitions to YAML and updates an index, creating durable filesystem state in a skill described as read-only analysis. While not direct data modification of source systems, it expands the trust boundary and can retain user- or agent-derived metadata longer than expected, potentially including sensitive business descriptions, join logic, or inferred semantics.

Description-Behavior Mismatch

Medium

Confidence: 85%
Finding: The workflow states that previous connections auto-restore, implying persistent session state and retained access to prior data sources. In a data-analysis skill that accepts database credentials and local files, undocumented or weakly scoped persistence can expose previously connected sensitive data to later sessions, users, or prompts beyond the original intent.

Missing User Warnings

Medium

Confidence: 90%
Finding: The store persists `sample_values` from source columns directly into the local catalog database. In a data-analysis skill that connects to CSV/Excel files and production databases, those samples can easily contain sensitive records such as names, emails, account numbers, tokens, or regulated data, creating secondary storage of user data without minimization or explicit disclosure. This is especially concerning because the catalog appears intended as metadata storage, so retaining raw sample content broadens the exposure surface beyond the original datasource.
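Minimizing samples before they reach the catalog would reduce this exposure. The following is a rough sketch of one masking approach. The function names, the regular expressions, and the choice to keep two leading characters are all illustrative assumptions and not the skill's behavior.

```python
import re

_EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
_LONG_DIGITS = re.compile(r"\d{6,}")  # account numbers, card numbers, phone numbers


def mask_sample(value, keep: int = 2) -> str:
    """Mask one sample value, fully redacting obvious identifiers."""
    text = str(value)
    if _EMAIL.fullmatch(text) or _LONG_DIGITS.search(text):
        return "<redacted>"
    if len(text) <= keep:
        return "*" * len(text)
    # Keep a short prefix so the value's shape stays recognizable.
    return text[:keep] + "*" * (len(text) - keep)


def minimize_samples(samples, limit: int = 3):
    """Keep at most `limit` samples and mask each one before storage."""
    return [mask_sample(v) for v in samples[:limit]]
```

Pattern-based masking will miss many sensitive values, so storing no samples at all is the safer default for regulated datasets.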

Missing User Warnings

Medium

Confidence: 93%
Finding: The CLI accepts database usernames and passwords directly as command-line options, which commonly exposes secrets via shell history, process listings, audit logs, and orchestration tooling. In this skill, the risk is heightened because the runtime also auto-restores prior connections, suggesting credentials or connection metadata may persist beyond the current session without any explicit warning to the user.
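A common alternative to a password flag is to read the secret from an environment variable and fall back to an interactive prompt. The sketch below uses only the standard library. The variable name `LUCID_DB_PASSWORD` and the helper `resolve_password` are hypothetical and are not options the skill is known to support.

```python
import getpass
import os


def resolve_password(env_var: str = "LUCID_DB_PASSWORD", prompt=getpass.getpass) -> str:
    """Return the password from the environment, else prompt without echo.

    Neither path places the secret in argv, so it does not appear in shell
    history or process listings.
    """
    value = os.environ.get(env_var)
    if value:
        return value
    return prompt("Database password: ")
```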

Missing User Warnings

Medium

Confidence: 93%
Finding: These raw execution methods allow database-modifying statements with no warning, confirmation, or disclosure, despite the skill description explicitly stating that modification queries are blocked. In a data-analysis skill that may receive database credentials from users, this mismatch increases the risk of accidental or intentional destructive queries against real datasets.

Missing User Warnings

Medium

Confidence: 89%
Finding: When embeddings are enabled, the code sends `searchable_text` derived from table names, descriptions, tags, column semantics, units, enum values, and metric names to `embedder.embed(...)` with no consent gate, redaction, or disclosure. In a data-analysis skill that connects to user-provided business data sources, these semantic fields can contain sensitive business metadata or regulated labels, so transmitting them to an external embedding provider can create an unintended data exfiltration/privacy issue.
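A consent gate in front of the embedder is one way to address this. The sketch below wraps any object that exposes the `embed(text)` method named in the finding. The class names `GatedEmbedder` and `ConsentRequired` are hypothetical, and how consent would be collected from the user is left open.

```python
class ConsentRequired(Exception):
    """Raised when text would be sent to an embedder without user consent."""


class GatedEmbedder:
    """Wrap an embedder so nothing is transmitted until consent is granted."""

    def __init__(self, embedder, consent_given: bool = False):
        self._embedder = embedder
        self._consent_given = consent_given

    def grant_consent(self) -> None:
        # Call this only after the user has explicitly approved transmission.
        self._consent_given = True

    def embed(self, text):
        if not self._consent_given:
            raise ConsentRequired("user has not approved sending metadata for embedding")
        return self._embedder.embed(text)
```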

Missing User Warnings

Medium

Confidence: 92%
Finding: `ensure_embeddings` bulk-processes all indexed semantic text and calls `embedder.embed(text)` for each changed entry, again without any prior warning, approval step, or sensitivity filtering. Because this operates across the entire semantic index, it can amplify the privacy impact by transmitting a large volume of metadata from connected Excel/CSV/database sources to an external service in one operation.

Missing User Warnings

Medium

Confidence: 89%
Finding: The code automatically reconnects to all previously configured data sources on server startup and rebuilds semantic indexes without any user approval, visibility, or scoping checks. In a data-analysis skill that may store database credentials and file paths, this can unintentionally re-establish access to sensitive datasets, trigger background access to external databases, and expose data through restored query capabilities even when the operator did not intend those sources to become active again.
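Restoring only the sources an operator approves would make reconnection explicit. The sketch below is an illustration under assumptions: `restore_sources`, the `connect` and `approve` callables, and the `{"id": ...}` record shape are all hypothetical and do not reflect the skill's startup code.

```python
def restore_sources(saved_sources, connect, approve):
    """Reconnect only the saved sources that `approve` accepts.

    saved_sources: iterable of dicts with at least an "id" key (assumed shape).
    connect: callable that reconnects one source.
    approve: callable returning True if the operator allows this source.
    Returns the ids of the sources that were actually restored.
    """
    restored = []
    for source in saved_sources:
        if approve(source):
            connect(source)
            restored.append(source["id"])
    return restored
```

Here `approve` could prompt the user interactively or consult an allowlist, and sources that are not approved stay disconnected until requested.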

Missing User Warnings

Medium

Confidence: 96%
Finding: The code persists the entire `params` object via `catalog.upsert_source(source_id, connector.source_type, params)`, and for MySQL/PostgreSQL connections that likely includes usernames, passwords, hosts, and other secrets. Storing raw connection parameters in a catalog creates a durable credential exposure risk through logs, backups, admin views, or any future read path that returns catalog data. In this skill context, users are explicitly expected to provide database credentials, which makes the issue more dangerous because sensitive secrets are a normal input to this function.
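Stripping secret fields before the parameters reach the catalog would limit this exposure. The sketch below is minimal, and the key names in `SECRET_KEYS` are guesses, since the report does not list the actual fields in `params`. The helper `strip_secrets` is hypothetical.

```python
# Assumed names of secret-bearing keys; the real params layout is not shown.
SECRET_KEYS = {"password", "passwd", "pwd", "token", "secret", "api_key"}


def strip_secrets(params: dict) -> dict:
    """Return a copy of `params` without secret-bearing keys (case-insensitive)."""
    return {k: v for k, v in params.items() if k.lower() not in SECRET_KEYS}


# Possible use at the call site named in the finding:
#   catalog.upsert_source(source_id, connector.source_type, strip_secrets(params))
```

A catalog without stored passwords cannot silently restore database connections, so the password would have to be supplied again at reconnect time, for example from an environment variable or an OS keyring.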

Missing User Warnings

Medium

Confidence: 94%
Finding: The code executes a raw SELECT * against the requested table and includes the results in the response without any warning, consent checkpoint, or field-level filtering. Because this skill is specifically designed to connect to real Excel/CSV and SQL data sources, even small sample rows may leak secrets, PII, financial records, or other confidential contents to callers who expected only descriptive metadata.

Missing User Warnings

Medium

Confidence: 94%
Finding: Returning sample table data to the host agent without explicit disclosure or consent is a transparency and data-exposure issue. In this skill's context, connected sources may contain confidential business or personal data, so silent transmission of raw samples makes the behavior more dangerous than a schema-only discovery feature.

Missing User Warnings

Medium

Confidence: 88%
Finding: The handler writes semantic definitions to persistent YAML files without telling the user that local files will be created or modified. Even if intended for legitimate indexing, undisclosed persistence can surprise users, create data-retention risk, and store sensitive inferred metadata outside the original data source's controls.

Missing User Warnings

Medium

Confidence: 98%
Finding: The documentation instructs users to pass database passwords directly via a CLI flag, which exposes secrets through shell history, process listings, audit logs, and copied terminal transcripts. In a data-analysis skill that explicitly handles database credentials, this is a real credential-handling weakness because users are likely to follow the example verbatim.

Missing User Warnings

Medium

Confidence: 96%
Finding: The example places the database password directly on the command line, which is commonly exposed through shell history, process listings, logs, screenshots, and terminal recordings. Because this skill is specifically designed to connect to real data sources, encouraging inline credential entry materially increases the risk of credential leakage and subsequent unauthorized database access.

Missing User Warnings

Medium

Confidence: 95%
Finding: This multi-source example again demonstrates passing a live database password inline, creating the same credential exposure risk across a likely more complex and privileged analysis setup. In environments where analysts connect several systems, leaked credentials can expand compromise scope across multiple datasets and services.

VirusTotal

66/66 vendors flagged this skill as clean.
