Ontology Engineer

Security checks across malware telemetry and agentic risk

Overview

The skill appears locally focused and not destructive, but it can broadly scan files, persist sensitive derived data, and passively save conversation facts without clear opt-in controls.

Review before installing. Use it only on folders you intentionally choose, avoid shared/client/private locations unless authorized, prefer dry-run or custom output paths, and do not allow passive runtime conversation writes unless you explicitly want durable chat-derived memory.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (18)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 93%
Finding: The skill clearly directs the agent to read files, write output artifacts, and execute shell/Python commands, yet no permissions are declared. That mismatch weakens reviewability and consent boundaries because a host may treat the skill as less privileged than it actually is. In a skill centered on broad filesystem scanning, undeclared capabilities materially increase the risk of overreach.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 84%
Finding: The documented behavior exceeds and diverges from the high-level description by performing graph writes, file conversion via external tools, and metadata extraction in ways not cleanly disclosed. Behavior-description mismatches are dangerous because users and reviewers may authorize a skill for one purpose while it performs additional actions with privacy and integrity consequences. Here, the gap is especially relevant because the skill scans large data spaces and persists derived records.

Description-Behavior Mismatch

Medium

Confidence: 80%
Finding: The file claims there is no data exfiltration path because processing is local, but it also targets cloud/shared directories and external parties' data spaces. Even without network transmission by the skill itself, scanning shared or synced locations can expose sensitive third-party data and create derived artifacts from it, so the assurance is misleading. Misleading security claims can cause users to apply the tool in contexts they would otherwise treat as sensitive.

Intent-Code Divergence

High

Confidence: 96%
Finding: The skill promises that nothing is analyzed before explicit confirmation, yet Step 1 performs indexing and metadata extraction before the confirmation gate. That is a direct violation of the stated consent model and can reveal filenames, folder structure, and document metadata from sensitive locations before the user approves scope. In a filesystem-scanning skill, pre-confirmation collection materially undermines user control.

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The manifest frames the skill around filesystem/data-model analysis, but the runtime mode silently appends entities from live conversations without user action. This expands the data source from files to conversational memory and changes the privacy model from user-initiated scanning to passive persistence. Such hidden expansion is dangerous because users may not realize their chats are being converted into durable records.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: Persistently capturing conversation-derived entities is not necessary for the advertised core task of ontology extraction from files and schemas. Unjustified data collection increases privacy risk, especially when names, organizations, projects, and decisions are turned into durable graph entries. The capability becomes more dangerous because it is described as passive and automatic.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The workflow explicitly instructs the agent to persist new entities from daily conversations into a runtime knowledge graph, turning a file-analysis skill into ongoing conversational memory. This creates a data-retention and scope-expansion risk because users may disclose sensitive personal or business information in normal chat without realizing it will be stored durably.

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The documentation explicitly allows COM automation of Microsoft Word and execution of LibreOffice in headless mode to process .doc files. Invoking external binaries and desktop automation expands the attack surface substantially, because malformed documents can trigger parser/automation vulnerabilities and the behavior exceeds a narrow ontology-extraction role unless strongly constrained and disclosed.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The scanner persists detailed scan results to graph.jsonl by default, including file paths, titles, sizes, dates, and optional author/title metadata, without any built-in confirmation or approval gate. In a privacy-sensitive filesystem scanning skill, automatic persistence increases the blast radius because sensitive local data is retained beyond the immediate scan and can accumulate across runs.
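
One way to narrow this finding is to make persistence opt-in, so that a scan is a dry run unless the user names an output file. The sketch below is illustrative only: `emit_scan_results` and its `output_path` parameter are hypothetical names, not part of the reviewed skill, and the record fields are assumptions.

```python
import json
from pathlib import Path
from typing import Iterable, Optional


def emit_scan_results(records: Iterable[dict], output_path: Optional[str] = None) -> int:
    """Write records as JSON Lines only when an output path is given explicitly.

    With output_path=None (the default) nothing is written to disk; the
    function only reports how many records would have been persisted.
    """
    records = list(records)
    if output_path is None:
        return len(records)  # dry run: count only, no file is created
    # Append one JSON object per line, mirroring a graph.jsonl-style file.
    with Path(output_path).open("a", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(records)
```

Calling `emit_scan_results(records)` reports a count without creating a file; a caller has to pass a path such as a custom output location before anything is retained.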

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The script accepts any --root path and immediately scans it, with no enforcement of the claimed user-scoped confirmation boundary or ownership checks. In the stated skill context, which explicitly supports scanning external data spaces, this enables broad collection of sensitive third-party or system data if invoked on an inappropriate path.
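
A scope boundary of the kind this finding says is missing could be enforced by checking the requested root against directories the user has already approved. This is a minimal sketch under stated assumptions: `is_within_allowed_roots` and the `allowed_roots` list are hypothetical, and the reviewed script is not known to contain anything like it. It requires Python 3.9 or later for `Path.is_relative_to`.

```python
from pathlib import Path


def is_within_allowed_roots(root: str, allowed_roots: list[str]) -> bool:
    """Return True only if `root` resolves inside one of the approved directories."""
    candidate = Path(root).resolve()
    for allowed in allowed_roots:
        allowed_path = Path(allowed).resolve()
        # Both paths are resolved first, so ".." segments and symlinked
        # parents are normalised before the containment check.
        if candidate.is_relative_to(allowed_path):
            return True
    # An empty allowlist approves nothing.
    return False
```

A scanner could call this before walking `--root` and refuse to proceed when it returns False, which turns the documented confirmation boundary into an enforced one.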

Vague Triggers

Medium

Confidence: 72%
Finding: The invocation text is extremely broad, covering ontology extraction, full file scans, personal knowledge, business systems, and external data spaces across many formats. Overly broad triggers can cause the skill to activate in contexts where the user did not intend large-scale scanning or data extraction, increasing the chance of inappropriate access. In a high-capability skill, broad invocation language raises misuse risk even if not overtly malicious.

Missing User Warnings

Medium

Confidence: 87%
Finding: The workflow mandates broad semantic extraction from Office, PDF, spreadsheet, and presentation files and persists the results into graph artifacts, yet it does not pair this with a clear privacy warning about sensitive-content capture and retention. In a skill designed to scan large personal or enterprise data spaces, omission of an explicit data-impact notice materially increases the chance of over-collection and uninformed consent.

Missing User Warnings

Medium

Confidence: 95%
Finding: The runtime enrichment section directs storage of user-stated facts from normal conversation without an explicit warning that the information will be retained. This is dangerous because users may treat chat as ephemeral and unintentionally create a persistent profile containing names, roles, projects, and strategic details.

Missing User Warnings

Medium

Confidence: 92%
Finding: The file advertises extraction of author/title/subject metadata from .docx and .pdf documents without warning that such metadata may reveal personal identities, internal usernames, organizational details, or document history. In a filesystem-scanning and knowledge-graph context, collecting this metadata can silently aggregate sensitive personal information at scale.
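
Where metadata must be collected at all, identity-bearing fields can be masked before anything is stored. The following is a sketch, not the skill's behavior: `redact_metadata`, the `SENSITIVE_FIELDS` set, and the field names in it are assumptions chosen for illustration.

```python
# Hypothetical field names; real document metadata keys may differ.
SENSITIVE_FIELDS = frozenset({"author", "last_modified_by", "subject"})


def redact_metadata(metadata: dict, sensitive_fields=SENSITIVE_FIELDS) -> dict:
    """Return a copy of `metadata` with sensitive fields replaced by a placeholder."""
    return {
        # Keys are matched case-insensitively; the original key is kept as-is.
        key: "[redacted]" if key.lower() in sensitive_fields else value
        for key, value in metadata.items()
    }
```

The function returns a new dictionary and leaves its input untouched, so a caller can still show the user the original values when asking whether to keep them.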

Missing User Warnings

Medium

Confidence: 92%
Finding: The script serializes extracted table contents from Word and Excel files directly to a JSON file, which can persist sensitive business data, personal data, or confidential schema details to disk without any warning, consent checkpoint, redaction, or output hardening. In this skill's context, the tool is explicitly designed for broad filesystem and external data scanning, so silent persistence materially increases confidentiality risk because users may process client or partner documents and leave recoverable artifacts behind.

Missing User Warnings

Medium

Confidence: 91%
Finding: The scanner and optional metadata extraction operate on local files without an explicit privacy warning or informed-consent notice at runtime. Because the tool targets personal knowledge graphs and external data spaces, users may not realize that author fields, document titles, paths, and other metadata are being collected and then written to disk.

Ssd 3

Medium

Confidence: 95%
Finding: The skill instructs the agent to persist conversation-derived entities to graph.jsonl without explicit per-item user consent. This creates durable records of potentially sensitive personal or organizational information from ordinary chats, exceeding what many users would reasonably expect from an ontology/file-analysis skill. Persistent silent memory is a significant privacy risk, especially when combined with append-only storage and no clear deletion path.
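
Per-item consent of the kind this finding calls for can be modelled as a filter that keeps only the entities a user approves one by one. This is a hedged sketch: `select_entities_to_persist` and the `confirm` callback are hypothetical names, and the entity shape is an assumption rather than the skill's actual schema.

```python
from typing import Callable, Iterable


def select_entities_to_persist(
    candidates: Iterable[dict],
    confirm: Callable[[dict], bool],
) -> list[dict]:
    """Keep only the candidate entities the user explicitly approves."""
    approved = []
    for entity in candidates:
        # `confirm` stands in for an interactive prompt; nothing is kept
        # unless it returns True for that specific entity.
        if confirm(entity):
            approved.append(entity)
    return approved
```

In an agent runtime, `confirm` would typically show the entity to the user and wait for a yes or no; passing `lambda entity: False` disables conversational persistence entirely.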

Ssd 3

Medium

Confidence: 94%
Finding: These instructions tell the agent to take details mentioned in conversation and persist them as entities in a runtime graph. Persisting conversational data creates surveillance-like memory behavior and raises privacy, consent, and data-minimization concerns, especially because the stored content can include professional relationships, project activity, and other sensitive context.

VirusTotal

66/66 vendors flagged this skill as clean.
