Semantic Consistency Auditor

Security checks across malware telemetry and agentic risk

Overview

The skill appears to be a local clinical-note semantic scoring tool, but it is packaged with misleading academic-writing framing and weak disclosure around sensitive medical text, model downloads, and dependency risk.

Review before installing. Use it only with de-identified clinical or similarly sensitive text, run it in an isolated environment, fix the syntax error, replace and pin dependencies, and verify model sources. Treat any JSON or console output as sensitive, because it can include the original AI-generated and gold-standard text.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Findings (17)

Lp3

Medium
Category
MCP Least Privilege
Confidence
91% confidence
Finding
The skill documentation advertises executable paths and file-based JSON input/output, which implies file read/write capability without any declared permissions or trust boundary. This is dangerous because agents or reviewers may treat the skill as lower risk than it is, leading to unintended access to local files and output locations.
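One way a wrapper could narrow the file access this finding describes is to confine every input and output path to a single declared working directory. The sketch below is illustrative only: `resolve_inside` and its arguments are hypothetical names, not part of the skill.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir; raise ValueError if it escapes.

    Hypothetical helper, not taken from the skill under review.
    """
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, so the check
    # below also rejects absolute paths outside the working directory.
    candidate = (base / user_path).resolve()
    # relative_to raises ValueError when candidate is not under base.
    candidate.relative_to(base)
    return candidate
```

A caller would pass the result to `open()` instead of the raw user-supplied path, so `../` segments and absolute paths outside the directory are rejected before any read or write.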

Tp4

High
Category
MCP Tool Poisoning
Confidence
98% confidence
Finding
The declared purpose says this is an academic-writing workflow aid, but the body describes a medical semantic evaluation tool that downloads external models and processes clinical-note content. This mismatch is dangerous because it can mislead users and automated policy systems into approving a skill for benign academic use when it actually handles sensitive medical text and network-enabled model execution.

Description-Behavior Mismatch

High
Confidence
97% confidence
Finding
The manifest description frames the skill as an academic-writing auditor, while the content clearly targets clinical-note semantic evaluation. Mislabeling capability and domain is dangerous because it defeats informed consent, misroutes the skill into inappropriate contexts, and increases the chance that sensitive healthcare data will be processed under weaker controls.

Intent-Code Divergence

High
Confidence
96% confidence
Finding
The 'When to Use' guidance reinforces an academic-writing workflow framing, but the rest of the document is for clinical semantic scoring. This is dangerous because operators may invoke the skill on the wrong data types or approve it for environments that are not authorized for medical-data handling or external model downloads.

Description-Behavior Mismatch

Medium
Confidence
84% confidence
Finding
The skill performs runtime model download/load behavior that is not apparent from the stated auditing purpose, which expands the trust boundary and introduces unreviewed network and supply-chain exposure. In a workflow handling sensitive medical text, hidden dependency retrieval can cause privacy, availability, and integrity risks if external resources are compromised or blocked.

Context-Inappropriate Capability

Medium
Confidence
88% confidence
Finding
Runtime downloading of the COMET model is a real supply-chain and operational risk because it pulls executable model artifacts from external infrastructure during normal use. For a tool processing medical content, this is especially concerning when the manifest does not clearly justify or disclose the need for live downloads in potentially regulated environments.
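A common mitigation for runtime artifact downloads is to fetch the model once, record its digest during review, and refuse to load anything that does not match. The following is a minimal sketch under that assumption; the function names and the idea of a reviewer-recorded SHA-256 are illustrative and do not describe how the skill or COMET currently behaves.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large checkpoints need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> None:
    """Raise ValueError unless the file matches the digest recorded at review time."""
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
```

Calling `verify_artifact` before any model-loading step turns a silently swapped or corrupted checkpoint into an explicit failure.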

Missing User Warnings

Medium
Confidence
88% confidence
Finding
The skill encourages evaluation of clinical notes and writing detailed JSON outputs but does not warn against inclusion of real patient identifiers or other sensitive health data. This is dangerous because users may process regulated medical information and persist it to disk without de-identification, retention limits, or access controls.
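If scores must be persisted, one option is to drop the raw note text before writing and to create the file with owner-only permissions. This sketch assumes hypothetical field names (`candidate_text`, `reference_text`), because the skill's real output schema is not shown in this report.

```python
import json
import os

# Hypothetical keys; replace with whatever fields actually carry note text.
SENSITIVE_KEYS = {"candidate_text", "reference_text"}


def redact(record: dict) -> dict:
    """Return a copy of record without the raw-text fields."""
    return {k: v for k, v in record.items() if k not in SENSITIVE_KEYS}


def write_private_json(path: str, record: dict) -> None:
    """Write the redacted record to a new file readable only by its owner."""
    # O_EXCL refuses to overwrite an existing file; 0o600 is owner
    # read/write on POSIX systems (the mode is largely ignored on Windows).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(redact(record), fh)
```

Redaction of this kind complements de-identification of the input rather than replacing it: scores derived from identifiable notes may still be sensitive.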

Missing User Warnings

Low
Confidence
76% confidence
Finding
The installation and performance notes imply downloading large external models, but the documentation does not clearly disclose the network dependency and associated privacy or supply-chain considerations. This is risky because users may run the skill in restricted or sensitive environments without understanding that external services and third-party model artifacts are involved.

Unpinned Dependencies

Low
Category
Supply Chain
Content
bert_score
comet
dataclasses
numpy
Confidence
93% confidence
Finding
bert_score

Unpinned Dependencies

Low
Category
Supply Chain
Content
bert_score
comet
dataclasses
numpy
torch
Confidence
93% confidence
Finding
comet

Unpinned Dependencies

Low
Category
Supply Chain
Content
bert_score
comet
dataclasses
numpy
torch
yaml
Confidence
84% confidence
Finding
dataclasses

Unpinned Dependencies

Low
Category
Supply Chain
Content
bert_score
comet
dataclasses
numpy
torch
yaml
Confidence
97% confidence
Finding
numpy

Unpinned Dependencies

Low
Category
Supply Chain
Content
comet
dataclasses
numpy
torch
yaml
Confidence
98% confidence
Finding
torch

Unpinned Dependencies

Low
Category
Supply Chain
Content
dataclasses
numpy
torch
yaml
Confidence
99% confidence
Finding
yaml
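The unpinned-dependency findings above can be reproduced locally with a small check that flags any requirement line lacking an exact `==` pin. This is a simplified sketch, not the scanner's actual logic: it treats environment markers and URL requirements as unpinned, and the version number in the usage note is purely illustrative.

```python
import re
from typing import List

# Accepts "name==version", optionally with extras such as "name[extra]==1.0".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+$")


def unpinned(requirements: str) -> List[str]:
    """Return requirement lines that are not pinned to an exact version."""
    flagged = []
    for raw in requirements.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if line and not PINNED.match(line):
            flagged.append(line)
    return flagged
```

For example, `unpinned("numpy\ntorch==2.6.0")` reports only `numpy`. Exact pins are stronger when combined with hash checking, for instance pip's `--require-hashes` mode.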

Known Vulnerable Dependency: numpy (10 advisories): CVE-2014-1859 (Numpy arbitrary file write via symlink attack); CVE-2021-41495 (NumPy NULL Pointer Dereference); CVE-2021-33430 (NumPy Buffer Overflow (Disputed)); +7 more

Critical
Category
Supply Chain
Confidence
91% confidence
Finding
numpy

Known Vulnerable Dependency: torch (10 advisories): CVE-2025-2953 (PyTorch susceptible to local Denial of Service); CVE-2022-45907 (PyTorch vulnerable to arbitrary code execution); CVE-2025-32434 (PyTorch: `torch.load` with `weights_only=True` leads to remote code execution); +7 more

Critical
Category
Supply Chain
Confidence
96% confidence
Finding
torch

Possible Typosquatting: 'yaml' resembles popular package 'pyyaml'

High
Category
Supply Chain
Confidence
99% confidence
Finding
yaml
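A frequent cause of this kind of flag is listing a module's import name where the PyPI distribution name belongs: `import yaml` is provided by the `PyYAML` distribution, and `dataclasses` has been part of the standard library since Python 3.7. The sketch below uses a hand-written mapping to translate import names into install names. The `comet` entry assumes the skill means Unbabel's COMET, which this report does not confirm.

```python
from typing import Optional

# Import name -> PyPI distribution name. Hand-maintained for illustration.
IMPORT_TO_DIST = {
    "yaml": "PyYAML",
    "bert_score": "bert-score",
    "comet": "unbabel-comet",  # assumption: Unbabel COMET is what the skill uses
}

# Modules that ship with Python and should not be installed from PyPI.
STDLIB = {"dataclasses"}


def install_name(import_name: str) -> Optional[str]:
    """Return the distribution to install, or None for standard-library modules."""
    if import_name in STDLIB:
        return None
    # Fall back to the import name when the two are the same (e.g. numpy, torch).
    return IMPORT_TO_DIST.get(import_name, import_name)
```

Resolving names this way before pinning avoids installing an unrelated package that happens to share the import name.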

VirusTotal

66/66 vendors flagged this skill as clean.
