GLM Multimodal Analyzer

Security checks across malware telemetry and agentic risk

Overview

The skill does its stated media-analysis job, but it needs Review because it builds a shell command from user-supplied text and can upload local files to a third-party API without a clear consent warning.

Install only if you are comfortable sending selected files, URLs, and prompts to Zhipu/BigModel with your ZHIPU_API_KEY. Avoid confidential or regulated documents unless approved. The publisher should replace the shell-string tool handler with structured argument execution or robust escaping, and add a clear upload/retention disclosure plus explicit consent for local files.
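
The recommendation to replace the shell-string handler with structured argument execution can be illustrated with a minimal Python sketch. The skill's actual handler is not shown in this report, so everything here is an assumption: `ECHO_SCRIPT` is a hypothetical stand-in for the analyzer script, and the `--prompt` and `--source` flags are illustrative names only. The point is that each user-supplied value travels as its own argv element and is never parsed by a shell.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for the skill's analyzer script (an assumption):
# it simply echoes the arguments it received as JSON.
ECHO_SCRIPT = "import sys, json; print(json.dumps(sys.argv[1:]))"


def build_argv(prompt: str, source: str) -> list[str]:
    """Build an argument vector; each user value is exactly one argv element."""
    return [sys.executable, "-c", ECHO_SCRIPT, "--prompt", prompt, "--source", source]


def run_analyzer(prompt: str, source: str) -> list[str]:
    """Run the child process without a shell and return the arguments it saw."""
    result = subprocess.run(
        build_argv(prompt, source),
        capture_output=True,
        text=True,
        check=True,  # raise if the child exits with a non-zero status
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Shell metacharacters arrive as plain text because no shell is involved.
    print(run_analyzer("describe this; rm -rf ~ $(whoami)", "photo.png"))
```

Because `subprocess.run` receives a list and `shell` is left at its default of `False`, characters such as `;` and `$(...)` in the prompt are passed through literally rather than executed. A real handler would still need to validate values that begin with `-` so they are not mistaken for options.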

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (5)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 94%
Finding: The skill documentation indicates use of environment variables and outbound network access to an external API, but it does not declare permissions for those capabilities. This creates a transparency and governance gap: users and hosting platforms may not realize the skill can exfiltrate provided content and secrets-derived data over the network, especially when local file paths and external URLs are accepted as input.

Missing User Warnings

Medium

Confidence: 91%
Finding: The README instructs users to submit images, videos, and documents to a third-party multimodal API but does not clearly warn that file contents, URLs, and possibly sensitive metadata will be transmitted off-host to an external service. This can lead users to unknowingly upload confidential or regulated data, especially because the skill is framed as a general-purpose analyzer and supports local files as input.

Vague Triggers

Medium

Confidence: 82%
Finding: The invocation examples use broad natural-language phrases that could cause accidental or ambiguous activation in larger agent workflows. If auto-routing or trigger matching is permissive, ordinary conversation mentioning images, videos, PDFs, URLs, or paths could invoke the skill unexpectedly and send sensitive user content to the external multimodal service.

Missing User Warnings

Medium

Confidence: 97%
Finding: The skill accepts URLs and local file paths for multimodal analysis and documents use of an external API endpoint, but it does not warn users that supplied content may be transmitted off-system. This is dangerous because users may unknowingly expose private documents, internal file contents, presigned URLs, or sensitive media to a third party, creating confidentiality and compliance risks.
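
One way to surface the presigned-URL risk before anything is transmitted is a small pre-flight check, sketched below in Python. This is not part of the skill: `url_warnings` and `SIGNATURE_PARAMS` are hypothetical names, and the parameter list is a short, non-exhaustive assumption covering common signed-URL schemes.

```python
from urllib.parse import parse_qsl, urlsplit

# Query parameter names that commonly indicate a signed URL
# (assumed, non-exhaustive list; compared case-insensitively).
SIGNATURE_PARAMS = {"x-amz-signature", "x-goog-signature", "signature", "sig"}


def url_warnings(url: str) -> list[str]:
    """Return human-readable warnings to show before a URL is sent off-system."""
    parts = urlsplit(url)
    warnings = []
    if parts.username or parts.password:
        warnings.append("URL embeds credentials")
    names = {name.lower() for name, _ in parse_qsl(parts.query, keep_blank_values=True)}
    if names & SIGNATURE_PARAMS:
        warnings.append("URL looks presigned; sharing it grants access to the object")
    return warnings


if __name__ == "__main__":
    print(url_warnings("https://bucket.s3.amazonaws.com/doc.pdf?X-Amz-Signature=abc"))
```

A check like this only flags obvious cases. It cannot tell whether the content behind an ordinary-looking URL is private, so it complements a general disclosure rather than replacing one.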

Missing User Warnings

Medium

Confidence: 93%
Finding: The tool can transmit arbitrary local files, images, videos, and documents to an external third-party API, including by converting local files into data URLs and embedding their contents in the request payload. There is no clear, explicit user-facing consent or warning at execution time that local content will leave the machine, which creates a meaningful privacy and data-handling risk in an agent skill context.
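
The execution-time consent that this finding calls for could look roughly like the following Python sketch. It assumes, as the finding describes, that local files are converted to base64 data URLs. The names `prepare_local_file`, `ConsentDenied`, and the `confirm` callback are hypothetical and do not come from the skill's code.

```python
import base64
import mimetypes
from pathlib import Path
from typing import Callable


class ConsentDenied(Exception):
    """Raised when the user declines to send a local file off the machine."""


def to_data_url(path: Path) -> str:
    """Encode a file's bytes as a base64 data URL."""
    mime, _ = mimetypes.guess_type(path.name)
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime or 'application/octet-stream'};base64,{encoded}"


def prepare_local_file(path: str, confirm: Callable[[str], bool]) -> str:
    """Ask for explicit consent before a local file is read and encoded."""
    resolved = Path(path).expanduser().resolve()
    size = resolved.stat().st_size
    message = f"Upload {resolved} ({size} bytes) to the external API?"
    if not confirm(message):
        raise ConsentDenied(f"upload of {resolved} was declined")
    return to_data_url(resolved)


if __name__ == "__main__":
    import sys

    # Interactive confirmation; an agent host could supply its own callback.
    ask = lambda msg: input(f"{msg} [y/N] ").strip().lower() == "y"
    print(prepare_local_file(sys.argv[1], ask)[:60])
```

Passing the confirmation step in as a callback keeps the gate independent of any particular user interface, and the file is not read until consent has been given.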

VirusTotal

65/65 vendors flagged this skill as clean.
