Security audit

unisound-surgery-sufficiency-review

Security checks across malware telemetry and agentic risk

Overview

This medical-record review skill is mostly purpose-aligned, but it sends sensitive case text to configurable external services and can save prepared patient text locally despite claiming it does not persist data.

Install only in an environment approved for PHI handling. Use --no-llm or use_llm=false unless the model endpoint is contractually approved. Avoid arbitrary --base values, and point GUIDELINE_API_BASE at a trusted service. Do not use --save-prepared with real patient data unless local retention, access controls, and cleanup are explicitly managed.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (9)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 92%
Finding: The skill declares no permissions, yet its documented behavior includes environment variable access, local file read/write, and network calls. In a medical-record processing context, these capabilities materially increase the risk of unintended PHI exposure, policy bypass, and unsafe execution because operators may trust the skill as lower-privilege than it actually is.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: The description frames the skill as a structured-record sufficiency review, but the documented behavior expands into external guideline retrieval, external LLM inference, broad document ingestion, and optional local saving of prepared medical text. This mismatch is dangerous because it can cause users or orchestrators to send sensitive records under incorrect assumptions about data flow, storage, and third-party disclosure.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The code can persist preprocessed medical record text to disk via `save_prepared`, which may include sensitive patient data. In a medical-review skill, writing full prepared records to local storage increases privacy and compliance risk because the data can remain on disk, be accessed by other users/processes, or be collected by backups and logs beyond the intended review flow.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The `--save-prepared` CLI option exposes a debug path that writes preprocessed patient text to local files, creating an unnecessary data-exposure surface for sensitive medical content. Because this tool handles clinical records, even optional debug persistence is risky in real environments where operators may enable it without understanding retention, access, or compliance consequences.
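
One way to narrow this exposure, if debug persistence must remain available, is to require an explicit opt-in and create the file with owner-only permissions. The sketch below is illustrative only: `save_prepared_text` and `allow_phi_persistence` are hypothetical names, not taken from the skill's code, and the permission bits are honored on POSIX systems but largely ignored on Windows.

```python
import os


def save_prepared_text(path: str, text: str, allow_phi_persistence: bool = False) -> None:
    """Write prepared text to `path` only after an explicit opt-in (hypothetical helper)."""
    if not allow_phi_persistence:
        # Fail closed: persistence of patient text is never the default.
        raise PermissionError("refusing to persist prepared text without explicit opt-in")
    # O_EXCL refuses to overwrite an existing file; 0o600 limits access to the owner.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
```

This does not address retention or cleanup, which still need to be managed by the operator.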

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: The skill is presented as a surgery sufficiency reviewer over supplied case data, but in normal operation it forwards selected medical document content and guideline text to an external LLM for decision-making. Because the transmitted content includes clinical record text and the service is remote, this creates a real confidentiality and data-governance risk, especially for PHI/medical records.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The payload accepts arbitrary 'base' and 'appkey' values and then sends prompt content built from case documents to that user-specified endpoint. This is effectively an unconstrained exfiltration sink for sensitive medical records and can also be abused for SSRF-like outbound access to attacker-controlled or unauthorized internal services.
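
A common mitigation for this class of issue is to validate the endpoint against an allowlist before any prompt is sent. The following is a minimal sketch under stated assumptions: `validate_base`, `APPROVED_LLM_HOSTS`, and the host `llm.internal.example` are hypothetical and do not come from the skill itself.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from reviewed configuration.
APPROVED_LLM_HOSTS = frozenset({"llm.internal.example"})


def validate_base(base: str, approved_hosts=APPROVED_LLM_HOSTS) -> str:
    """Return `base` unchanged if it is HTTPS and targets an approved host, else raise ValueError."""
    parts = urlsplit(base)
    if parts.scheme != "https":
        raise ValueError("base must use https")
    if parts.username or parts.password:
        # Rejects tricks such as https://trusted-host@attacker.example/
        raise ValueError("base must not embed credentials")
    host = (parts.hostname or "").lower()
    if host not in approved_hosts:
        raise ValueError(f"host {host!r} is not on the approved list")
    return base
```

Checking the parsed hostname rather than the raw string avoids prefix or substring matches that an attacker-controlled URL could satisfy. It does not by itself protect against DNS rebinding or redirects followed by the HTTP client.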

Context-Inappropriate Capability

Medium

Confidence: 84%
Finding: The skill depends on an external guideline API configured through environment variables, which introduces undisclosed outbound data flow and trust in remote content for clinical review logic. While the queried data here is surgery code/scope rather than full records, it still expands the attack surface and can affect integrity, availability, and deployment transparency.

Missing User Warnings

Medium

Confidence: 96%
Finding: Sensitive prepared medical text is written to disk without any explicit warning, consent workflow, or visible safeguards around PHI handling. In a healthcare context this is more dangerous than generic debug output because patient records are highly sensitive and unauthorized persistence can trigger privacy, regulatory, and insider-access exposure.

Missing User Warnings

Medium

Confidence: 95%
Finding: The prompt constructed for the LLM includes clinical document excerpts, surgery details, and guideline text, and this content is later posted over HTTP to a remote chat completion endpoint. There is no built-in consent, warning, or explicit disclosure in the code path that sensitive record contents are being transmitted externally, which is dangerous in a medical context.
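
A disclosure gate placed immediately before the outbound request is one way to close this gap. The sketch below assumes a hypothetical `require_transmission_ack` helper and an `acknowledged` flag supplied by the operator; neither exists in the reviewed code.

```python
def require_transmission_ack(endpoint: str, char_count: int, acknowledged: bool) -> str:
    """Return a disclosure notice, or raise if the operator has not acknowledged it."""
    notice = (
        f"About to send {char_count} characters of clinical record text "
        f"to external endpoint {endpoint}."
    )
    if not acknowledged:
        # Fail closed: nothing is transmitted until the disclosure is accepted.
        raise RuntimeError(notice + " Explicit acknowledgement is required to proceed.")
    return notice
```

A caller would invoke this before building the HTTP request and log the returned notice, so each external transmission leaves an auditable record.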

VirusTotal

66/66 vendors flagged this skill as clean.


Static analysis

No suspicious patterns detected.