modelshow

Security checks across malware telemetry and agentic risk

Overview

ModelShow appears legitimate, but it sends each prompt to every configured model, always saves full results to disk, and overstates its blind-evaluation privacy guarantees.

Review the configuration before installing. Avoid using this skill with secrets, regulated data, private code, or confidential documents unless you are comfortable sending them to all configured models and saving full copies locally. Keep the output directory private and use the optional web indexer only when you intentionally want those result files copied into a public-facing location.
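One way to keep the output directory private is to create it with owner-only permissions before the first run. The sketch below assumes a POSIX filesystem; the helper name `make_private_dir` and the directory name are illustrative and not part of the skill.

```python
import os
import stat
from pathlib import Path


def make_private_dir(path: Path) -> Path:
    """Create `path` if needed and restrict it to the owning user (mode 0o700).

    Hypothetical helper: the skill itself is not known to do this.
    """
    path.mkdir(parents=True, exist_ok=True)
    # chmod after mkdir so the result does not depend on the process umask.
    os.chmod(path, stat.S_IRWXU)
    return path
```

Point the skill's output setting at a directory prepared this way, and keep it outside any location that a web server or sync tool exposes.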

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (15)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 95%
Finding: The skill documentation instructs the agent to read configuration files and write result artifacts, yet the skill declares no permissions. This creates a trust and review gap: operators may enable a skill believing it is non-persistent and low-risk, while it actually performs filesystem access and stores user/model content.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 97%
Finding: The stated purpose is blind model comparison, but the documented behavior extends to persistent storage, index generation, copying to web/public locations, and retention-based deletion. That mismatch is dangerous because users may provide sensitive prompts or files for comparison without realizing the skill also republishes, catalogs, or prunes data beyond the immediate task.

Description-Behavior Mismatch

Medium

Confidence: 96%
Finding: The workflow makes saving every original prompt, full model response, judge output, and anonymization map mandatory. This defeats user expectations around 'blind' comparison and creates durable records of potentially sensitive inputs, generated content, and identity mappings that could later be exposed or misused.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: Optional web publishing and index updating are outside the core need of blind response evaluation and materially increase the exposure surface. Even if framed as optional, bundling publication workflows into the skill encourages propagation of prompts, responses, and metadata to broader audiences or less controlled directories.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The code logs the full placeholder-to-model mapping with `logging.info`, which directly defeats the blind judging guarantee by exposing hidden identities to anyone with log access. In this skill's context, preserving anonymity is the core security/privacy property, so leaking the mapping undermines the trust model and can bias evaluation or reveal sensitive model usage.
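A minimal sketch of a safer log line, assuming the mapping is a dict from placeholder label to model name (the actual structure in the skill may differ). The function and logger names are hypothetical; the point is that only the count and the labels reach the log, never the model identities.

```python
import logging

logger = logging.getLogger("modelshow.sketch")  # hypothetical logger name


def log_anonymization(mapping: dict[str, str]) -> str:
    """Record that anonymization happened without revealing label -> model pairs.

    `mapping` is assumed to be placeholder -> model name. Only the dict keys
    (the placeholders) are read; the model names are never formatted.
    """
    labels = sorted(mapping)
    message = f"anonymized {len(labels)} responses as {', '.join(labels)}"
    logger.info(message)
    return message
```

If the mapping must be recorded for audit purposes, writing it to a separate, access-restricted file after judging completes keeps it out of general-purpose logs.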

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The `anonymize` action returns both the anonymization map and reverse map alongside blind responses, giving the caller everything needed to immediately deanonymize outputs. While this may be convenient for orchestration, it collapses the separation between blind evaluation and reveal phases, making accidental or intentional disclosure far easier.
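The separation between the blind and reveal phases could be restored along the following lines. This is an illustrative sketch, not the skill's code: `SealedKey`, `reveal`, `judging_complete`, and the `Response A` label format are all assumptions made for the example.

```python
import random
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SealedKey:
    """Holds the label -> model map; excluded from repr so it cannot leak via logging."""
    _mapping: dict = field(repr=False)

    def reveal(self, judging_complete: bool) -> dict:
        # The caller must assert that judging has finished before the map is released.
        if not judging_complete:
            raise PermissionError("reveal requested before judging finished")
        return dict(self._mapping)


def anonymize(responses: dict, rng: Optional[random.Random] = None):
    """Return (blind responses keyed by label, sealed key) for model -> text input."""
    rng = rng or random.Random()
    models = list(responses)
    rng.shuffle(models)  # label order must not follow input order
    blind, mapping = {}, {}
    for i, model in enumerate(models):
        label = f"Response {chr(ord('A') + i)}"
        blind[label] = responses[model]
        mapping[label] = model
    return blind, SealedKey(mapping)
```

With this shape, the judge step receives only `blind`, and the orchestrator calls `reveal(judging_complete=True)` once scores are final. The check is advisory rather than a hard security boundary, since code in the same process can still read the private attribute.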

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The configuration explicitly sets "includeAnonymizationKey": true while the skill advertises blind or double-blind model comparison. Exposing the anonymization key defeats the controls against de-anonymization and lets users or downstream tooling map blinded labels back to model identities, which can bias evaluation, leak private benchmark behavior, or undermine the trustworthiness of comparison results.
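A reviewer can check for this setting before installing. The sketch below assumes the configuration is JSON and searches for the `includeAnonymizationKey` key at any depth, since the report does not state where in the file it sits. The function names are hypothetical.

```python
import json
from typing import Any, Iterator


def find_key(node: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key` at any depth of a parsed JSON document."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            yield from find_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


def blind_config_problems(config_text: str) -> list[str]:
    """Return a list of warnings for settings that weaken blind evaluation."""
    config = json.loads(config_text)
    if any(value is True for value in find_key(config, "includeAnonymizationKey")):
        return ["includeAnonymizationKey is enabled; saved output can map blind labels back to models"]
    return []
```

An empty list means the key is absent or false; it does not prove that the skill withholds the mapping through other paths such as logs.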

Description-Behavior Mismatch

Medium

Confidence: 96%
Finding: The code explicitly resolves aliases to full model names and persists them in Markdown/JSON outputs, including an anonymization audit map. In a skill whose stated purpose is blind multi-model comparison, this defeats privacy and anonymity guarantees and can expose sensitive local model/provider choices to disk or downstream consumers.

Context-Inappropriate Capability

Low

Confidence: 87%
Finding: The script accepts an arbitrary output_dir from input and writes prompt, responses, and metadata there without restriction. If an attacker can influence input, this enables arbitrary file write within the user's permissions, which can overwrite sensitive files, place data in unexpected locations, or increase exposure of saved evaluation content.
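A common mitigation is to resolve the requested directory against a fixed base and reject anything that lands outside it. The following is a sketch under that assumption; `resolve_output_dir` and the `base` parameter are illustrative names, not part of the skill.

```python
from pathlib import Path


def resolve_output_dir(requested: str, base: Path) -> Path:
    """Return `requested` resolved under `base`, or raise ValueError if it escapes.

    Absolute paths and `..` segments are handled by resolving first and then
    checking that the result is `base` itself or one of its descendants.
    """
    base = base.resolve()
    candidate = (base / requested).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError(f"output_dir escapes {base}: {requested!r}")
    return candidate
```

Because `resolve()` also follows symlinks, a link inside the base that points elsewhere is rejected as well.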

Missing User Warnings

Medium

Confidence: 95%
Finding: The README explicitly states that prompts are sent to multiple models and that both Markdown and JSON results are saved to disk, but it does not prominently warn users that sensitive prompts and model outputs may be disclosed to several backends and persisted locally. This can cause unintended exposure of confidential data, especially when users test proprietary code, credentials, or internal documents through the skill.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill mandates saving prompts and full outputs but does not provide a clear user-facing warning at the point of use that their content will be persisted. This undermines informed consent and can lead to unintentional storage of secrets, proprietary code, personal data, or fetched external context.

Missing User Warnings

Medium

Confidence: 97%
Finding: Logging the anonymization mapping exposes the hidden correspondence between placeholders and model identities without any warning or access boundary. Because this skill is specifically designed for double-blind comparison, the context makes the leak more dangerous: it invalidates the claimed architectural safeguards against de-anonymization and can influence judging outcomes.

Missing User Warnings

Medium

Confidence: 91%
Finding: The script writes prompts, model responses, judge notes, and metadata to disk automatically, with no user-facing notice or consent mechanism. In practice this can persist sensitive prompts, proprietary outputs, or internal model usage details, creating confidentiality and compliance risks if users assume the comparison is ephemeral.
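A consent step before persistence could look like the sketch below. It assumes results are written as JSON and takes the confirmation mechanism as a callback so it can be wired to whatever prompt the host agent provides. `save_results` and `confirm` are hypothetical names.

```python
import json
from pathlib import Path
from typing import Callable


def save_results(results: dict, out_file: Path, confirm: Callable[[str], bool]) -> bool:
    """Write `results` to `out_file` only if the user agrees; return whether it was saved."""
    notice = (
        f"This will write the prompt, all model responses, and judge output to {out_file}. "
        "Continue?"
    )
    if not confirm(notice):
        return False  # nothing is created on disk when consent is withheld
    out_file.parent.mkdir(parents=True, exist_ok=True)
    out_file.write_text(json.dumps(results, indent=2), encoding="utf-8")
    return True
```

In an interactive session, `confirm` might wrap `input()`; in an agent setting it would route the notice to the user before any file is created.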

Ssd 3

Medium

Confidence: 94%
Finding: The documented context-handling plus later persistence means externally fetched content, user prompts, and model outputs may all be written to local files. This expands the sensitivity boundary from the user's direct prompt to any referenced URLs, files, or preferences, increasing the chance of retaining confidential or regulated data.

Ssd 3

Medium

Confidence: 96%
Finding: The mandatory save flow normalizes retention of complete session artifacts after every run, including ranked results, full response text, and metadata. Requiring persistence as part of task completion increases the likelihood that sensitive or unnecessary data accumulates over time and becomes available to other local processes or future publication steps.

VirusTotal

63/63 vendors flagged this skill as clean.
