Master Data Matching

Security checks across malware telemetry and agentic risk

Overview

This is a self-contained master-data matching skill that handles sensitive business and HR records, but its behavior is disclosed, local, and aligned with its purpose.

Install only if you are authorized to process the vendor, finance, sales, or HR data you will provide. Keep sensitive identifiers out of logs where possible, mask or minimize review and learning payloads, and define retention and access controls before enabling any persistent active-learning store.
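One way to keep sensitive identifiers out of logs is a logging filter that masks long digit runs before a record is emitted. The sketch below is illustrative only: the class name `RedactDigitsFilter` and the "six or more digits" heuristic are assumptions, not part of the skill, and a real deployment would need patterns tuned to its own identifier formats.

```python
import logging
import re

# Assumption: identifiers such as tax IDs or account numbers appear as runs of 6+ digits.
DIGIT_RUN = re.compile(r"\d{6,}")


def _mask(match: re.Match) -> str:
    """Replace all but the last two digits of a match with asterisks."""
    digits = match.group()
    return "*" * (len(digits) - 2) + digits[-2:]


class RedactDigitsFilter(logging.Filter):
    """Hypothetical filter that masks long digit runs in formatted log messages."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so identifiers passed as args are masked too.
        record.msg = DIGIT_RUN.sub(_mask, record.getMessage())
        record.args = None
        return True  # never drop the record, only rewrite it
```

Attaching it with `logger.addFilter(RedactDigitsFilter())` would turn a message such as `tax id 123456789` into `tax id *******89`.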

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (5)

Vague Triggers

Medium

Confidence: 93%
Finding: The trigger list contains broad phrases such as 'human in the loop' and 'active learning' that are not specific to master-data matching and could cause the skill to activate in unrelated contexts. Because this skill processes sensitive procurement, finance, HR, and OCR-derived identity data, unintended activation increases the chance of inappropriate handling or disclosure of personal or financial information.
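A rough way to spot overly broad triggers is to flag any phrase that mentions no domain-specific term. The sketch below assumes a hypothetical `DOMAIN_TERMS` list and a plain list of trigger strings; the skill's actual trigger format is not shown in this report.

```python
from typing import Iterable, List

# Assumption: terms that tie a trigger phrase to master-data matching.
DOMAIN_TERMS = ("master data", "entity resolution", "vendor", "record match", "deduplicat")


def broad_triggers(triggers: Iterable[str], domain_terms: Iterable[str] = DOMAIN_TERMS) -> List[str]:
    """Return trigger phrases that contain none of the domain terms."""
    terms = tuple(term.lower() for term in domain_terms)
    return [t for t in triggers if not any(term in t.lower() for term in terms)]
```

For example, `broad_triggers(["human in the loop", "active learning", "vendor master data matching"])` returns the first two phrases, which are the ones this finding calls out.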

Missing User Warnings

Medium

Confidence: 91%
Finding: The skill explicitly handles employee, vendor, customer, tax, banking, and contact data, but the user-facing documentation does not warn about privacy, sensitivity, or required safeguards. In a skill designed for entity resolution across HR and finance domains, this omission can lead users to provide regulated or high-impact data without minimization, consent checks, or appropriate review.

Missing User Warnings

Medium

Confidence: 88%
Finding: The HITL review payload includes the full OCR entity, suggested matched record, and detailed field-by-field comparisons, which can expose highly sensitive master data such as tax IDs, bank accounts, employee identifiers, and contact details to downstream callers or logs. In this skill's context, that is more dangerous than usual because it operates across procurement, finance, sales, and HR domains, all of which routinely contain regulated or confidential PII and financial data.
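A common mitigation is to mask sensitive fields before a review payload leaves the matcher. The following is a minimal sketch under stated assumptions: the field names in `SENSITIVE_FIELDS` and the nested dict/list payload shape are hypothetical, since the skill's real schema is not reproduced here.

```python
from typing import Any

# Assumption: keys that carry regulated or confidential values in the review payload.
SENSITIVE_FIELDS = {"tax_id", "bank_account", "employee_id", "email", "phone"}


def mask_value(value: Any, visible: int = 4) -> str:
    """Keep only the last `visible` characters; mask everything else."""
    text = str(value)
    if len(text) <= visible:
        return "*" * len(text)
    return "*" * (len(text) - visible) + text[-visible:]


def mask_payload(payload: Any) -> Any:
    """Return a copy of the payload with sensitive fields masked at any depth."""
    if isinstance(payload, dict):
        return {
            key: mask_value(val) if key in SENSITIVE_FIELDS and val is not None
            else mask_payload(val)
            for key, val in payload.items()
        }
    if isinstance(payload, list):
        return [mask_payload(item) for item in payload]
    return payload
```

With this sketch, a reviewer would still see enough of an identifier to confirm a match (for instance `*****6789` for a nine-digit tax ID) without the full value reaching downstream callers or logs.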

Missing User Warnings

Medium

Confidence: 91%
Finding: The active-learning flow stores and returns full OCR entities, matched records, corrected data, and rejection context inside learning payloads and aggregated processing, creating a secondary persistence and exposure path for sensitive data. This increases privacy and compliance risk because training/analytics stores are often broader-access than transactional systems, and this skill handles especially sensitive enterprise records including HR identity numbers and finance/procurement banking data.
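One way to shrink this secondary exposure path is to persist only pseudonymized references, the reviewer's decision, and an expiry date, rather than full entities. The sketch below is an assumption-laden illustration: `build_learning_record`, its field names, and the 90-day default are hypothetical, and the free-text `reason` would still need its own review because it can contain sensitive details.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone
from typing import Optional


def pseudonymize(value: str, key: bytes) -> str:
    """Keyed SHA-256 digest, so references cannot be recomputed without the key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def build_learning_record(
    ocr_id: str,
    matched_id: str,
    decision: str,
    reason: str,
    key: bytes,
    retention_days: int = 90,  # assumption: retention policy is set by the deployer
    now: Optional[datetime] = None,
) -> dict:
    """Build a minimized learning record with no raw identifiers."""
    now = now or datetime.now(timezone.utc)
    return {
        "ocr_ref": pseudonymize(ocr_id, key),
        "matched_ref": pseudonymize(matched_id, key),
        "decision": decision,
        "reason": reason,
        "expires_at": (now + timedelta(days=retention_days)).isoformat(),
    }
```

A purge job could then delete any record whose `expires_at` has passed, which gives the learning store an explicit retention boundary.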

Missing User Warnings

Medium

Confidence: 89%
Finding: The architecture explicitly includes HITL and active-learning payloads containing OCR entities, matched records, corrected data, and reasons across HR, finance, procurement, and sales domains, which can include PII and financial identifiers. Even though this file is documentation, specifying these payloads without privacy, minimization, access-control, retention, or redaction guidance increases the risk that implementers will log, store, or transmit sensitive data unsafely.

VirusTotal

64/64 vendors flagged this skill as clean.
