Medpilot

Security checks across malware telemetry and agentic risk

Overview

This health-tracking skill mostly matches its stated purpose, but it includes examples that cross into medical interpretation and care recommendations despite saying it is not for diagnosis or treatment decisions.

Review this carefully before installing. It may be useful for logging medications, vitals, and doctor orders, but its examples encourage medical interpretation and follow-up advice that should come from a licensed clinician. Avoid using it for diagnosis, test-result interpretation, medication changes, or care decisions unless the skill is revised with stronger safety and privacy boundaries.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (5)

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The README includes an example that interprets medical findings ('TI-RADS 3' malignancy risk, LDL guidance, and specialist referral), which goes beyond the declared scope of medication tracking and enters diagnosis-like clinical interpretation. In a self-use health skill, this can mislead users into relying on the tool for medical judgment despite the manifest explicitly disallowing diagnosis and treatment decisions.

Intent-Code Divergence

High

Confidence: 99%
Finding: The example directly contradicts the skill metadata's prohibition on diagnosis and treatment decisions by providing risk interpretation, lifestyle intervention advice, and a referral recommendation. This inconsistency is dangerous because README examples often shape runtime behavior or user expectations, increasing the chance the agent will perform out-of-scope medical guidance in real use.

Description-Behavior Mismatch

High

Confidence: 97%
Finding: The skill explicitly claims it should not perform diagnosis or treatment decisions, but the documented scenario provides concrete interpretation of exam findings and specific follow-up recommendations. In a health workflow context, this can mislead users into treating the assistant as a medical advisor, creating unsafe reliance, delayed care, or inappropriate self-management based on unverified guidance.

Intent-Code Divergence

High

Confidence: 95%
Finding: The boundaries section says the tool is not a diagnosis system, but the examples directly instruct diagnostic-style interpretation and care recommendations. This contradiction weakens safeguards because downstream agents may follow the concrete example over the abstract boundary, causing unsafe medical advice in a sensitive domain.
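One way to make the abstract boundary concrete is to check each request against the out-of-scope categories before the skill answers. The sketch below is illustrative only: the category names come from the findings above, while the patterns, function names, and refusal wording are assumptions and are not part of the Medpilot skill. Keyword matching is a crude stand-in for whatever classifier a real revision would use.

```python
import re

# Hypothetical out-of-scope categories, named after the findings in this report.
# The regular expressions are illustrative assumptions, not Medpilot's own rules.
OUT_OF_SCOPE_PATTERNS = {
    "diagnosis": re.compile(r"\b(diagnos\w*|malignan\w*)\b", re.I),
    "test_result_interpretation": re.compile(r"\bwhat does .* mean\b", re.I),
    "medication_change": re.compile(
        r"\b(stop taking|(increase|change|lower) (my|the) dose)\b", re.I
    ),
    "care_recommendation": re.compile(r"\bshould i (see|visit|go to)\b", re.I),
}


def out_of_scope_categories(request: str) -> list[str]:
    """Return the names of every out-of-scope category the request matches."""
    return [
        name
        for name, pattern in OUT_OF_SCOPE_PATTERNS.items()
        if pattern.search(request)
    ]


def handle(request: str) -> str:
    """Log in-scope requests; decline anything that asks for clinical judgment."""
    flagged = out_of_scope_categories(request)
    if flagged:
        return (
            "Declined (" + ", ".join(flagged) + "): "
            "please raise this with a licensed clinician."
        )
    return "Logged: " + request
```

With a guard like this in front of the examples, a request to log a dose passes through, while a request to explain an imaging score or recommend a referral is declined rather than answered.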

Missing User Warnings

Medium

Confidence: 83% confidence
Finding: The skill is designed to collect and manage sensitive health information, including medications and home metrics, but it does not state any privacy, storage, retention, or disclosure constraints. In a health-data workflow, this omission increases the risk of over-collection, accidental exposure, or inappropriate transmission of personally sensitive information.
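The missing constraints could be stated as an explicit, machine-checkable policy. The following is a minimal sketch under stated assumptions: `DataPolicy`, `Entry`, `purge_expired`, and the default values (local-only storage, 90-day retention, no external transmission) are hypothetical and do not describe how Medpilot currently stores anything.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DataPolicy:
    """Hypothetical data-handling policy; the defaults are illustrative assumptions."""
    storage: str = "local-only"
    retention_days: int = 90
    allow_external_transmission: bool = False


@dataclass
class Entry:
    """A single logged item, such as a medication dose or a home vital sign."""
    kind: str
    value: str
    recorded_at: datetime


def purge_expired(entries: list[Entry], policy: DataPolicy, now: datetime) -> list[Entry]:
    """Keep only entries recorded within the policy's retention window."""
    cutoff = now - timedelta(days=policy.retention_days)
    return [entry for entry in entries if entry.recorded_at >= cutoff]
```

Declaring the policy as data makes the retention and transmission limits visible to reviewers and lets the skill enforce them, instead of leaving them unstated.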

VirusTotal

63/63 vendors flagged this skill as clean.
