Security audit

Gugu Gaga

Security checks across malware telemetry and agentic risk

Overview

The skill’s main document-analysis workflow is coherent, but it needs review because some packaged output assets and converter code permit broader network access and third-party processing than the user-facing description discloses.

Review before installing in sensitive environments. Use only local PDF/DOC/DOCX/TXT files, avoid enabling MarkItDown cloud or plugin flags, and treat generated HTML previews as potentially network-active because fonts and some chart templates load third-party resources.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (44)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 91%
Finding: The skill advertises a document-analysis workflow but declares no permissions while its behavior implies broad capabilities including shell, file I/O, network access, and environment access. This is dangerous because users and policy engines cannot accurately assess or constrain what the skill can do, increasing the risk of unintended file access, command execution, or external data exfiltration during processing.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 97%
Finding: The stated purpose is narrow (analyzing pharma regulatory documents and producing PPTX/PDF outputs), but the observed behavior reportedly includes fetching remote URLs, processing many unrelated content types, calling cloud extraction services, using multimodal/LLM services, and exporting arbitrary HTML slides. That mismatch is dangerous because it masks a much larger attack surface and creates opportunities for silent data transfer to third parties, processing of untrusted remote content, and use of the tool beyond the user's reasonable expectations.

Context-Inappropriate Capability

Low

Confidence: 91%
Finding: The stylesheet imports multiple fonts from Google Fonts, which causes the skill to make external network requests when rendering output. For a document-analysis skill, this network capability is not necessary to core functionality and can leak usage metadata such as IP address, timing, and document/viewing context to a third party, while also creating a supply-chain and availability dependency.
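
As a rough illustration of how a reviewer might confirm this kind of finding, the sketch below lists the external hosts referenced by a stylesheet or HTML preview. The function name `external_hosts` and the sample CSS are hypothetical and are not taken from the skill. The regular expression is deliberately crude: it also flags harmless strings such as XML namespace URLs, so treat its output as a list of leads to check by hand.

```python
import re
from urllib.parse import urlparse

# Crude pattern for absolute (http/https) and protocol-relative (//host) URLs.
_URL_RE = re.compile(r"""(?:https?:)?//[^\s'"()<>]+""", re.IGNORECASE)


def external_hosts(text: str) -> set[str]:
    """Return the set of hostnames referenced by URLs found in `text`."""
    hosts = set()
    for match in _URL_RE.finditer(text):
        url = match.group(0)
        if url.startswith("//"):
            # Give protocol-relative URLs a scheme so urlparse sees the host.
            url = "https:" + url
        host = urlparse(url).hostname
        if host:
            hosts.add(host.lower())
    return hosts


# Hypothetical stylesheet fragment resembling the pattern in the finding.
sample_css = '@import url("https://fonts.googleapis.com/css2?family=Inter");'
print(sorted(external_hosts(sample_css)))
```

An empty result suggests, but does not prove, that a preview can be opened without contacting third parties; scripts can still build URLs at runtime.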

Description-Behavior Mismatch

High

Confidence: 99%
Finding: The file implements a generic HTML slide deck and presenter runtime, which is materially unrelated to the declared pharmaceutical regulation analysis skill. This kind of hidden scope expansion increases supply-chain risk because it introduces unexpected UI execution, popup behavior, inter-window messaging, and local state handling that users would not reasonably expect from a document-analysis tool.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The code opens a presenter popup and synchronizes state across windows using BroadcastChannel and postMessage, which broadens the attack surface beyond simple offline report generation. In the context of a pharma document-analysis skill, these browser-runtime capabilities are unexpected and can expose slide notes, navigation state, and rendered content to other same-origin pages or components, especially in shared hosting environments.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The CLI can enumerate and enable third-party plugins via Python entry points, extending processing with externally installed code. In an agent/skill context, plugin execution materially expands the trust boundary and can lead to arbitrary code execution or unsafe data handling if untrusted or unnecessary plugins are present.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The code accepts arbitrary http/https/file/data URIs and fetches remote content, which exceeds the stated skill scope of processing uploaded regulation documents. In an agent setting this expands the attack surface to SSRF, unintended network access, and ingestion of untrusted remote content without clear restriction or scope control.
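
One possible mitigation is to validate inputs before they reach the converter. The sketch below is an assumption about how a wrapper could do that, not code from the skill: the hypothetical helper `reject_uri_sources` refuses anything that parses as a URI, which is stricter than blocking only the four schemes named in the finding.

```python
from urllib.parse import urlparse


def reject_uri_sources(source: str) -> str:
    """Return `source` unchanged if it looks like a plain local path.

    Raises ValueError for anything carrying a URI scheme (http, https,
    file, data, or any other).
    """
    scheme = urlparse(source).scheme.lower()
    # A one-letter "scheme" is a Windows drive letter (C:\...), not a URI.
    if len(scheme) > 1:
        raise ValueError(f"URI input not allowed: scheme {scheme!r}")
    return source


# Plain relative paths pass through untouched.
print(reject_uri_sources("reports/guideline.pdf"))
```

This check only inspects the string. It does not resolve symlinks or confirm that the file exists, so it complements rather than replaces sandboxing and network egress controls.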

Context-Inappropriate Capability

High

Confidence: 96%
Finding: The plugin loader discovers and imports entry points from the runtime environment, meaning externally installed packages can inject code into the conversion pipeline. In a skill whose declared purpose is narrow document analysis, this creates a broad arbitrary-code execution extension point that can add hidden capabilities or abuse host access.

Context-Inappropriate Capability

High

Confidence: 95%
Finding: When plugins are enabled, the code executes each plugin's register_converters() logic, which is arbitrary Python from external packages. That permits unreviewed code execution and registration of converters with capabilities unrelated to regulation-document analysis, undermining trust boundaries for the skill.

Description-Behavior Mismatch

High

Confidence: 95%
Finding: This converter reads arbitrary uploaded file bytes and sends them to Azure Content Understanding via begin_analyze_binary(), which is a real data-egress behavior. In the context of a skill described as a pharmaceutical-regulation analysis tool, transmitting potentially sensitive regulatory, internal, or proprietary documents to a third-party cloud service without clear scope alignment or disclosure creates a meaningful confidentiality and compliance risk.

Context-Inappropriate Capability

Medium

Confidence: 83%
Finding: The file type support is much broader than the stated purpose: it accepts audio, video, email, spreadsheets, HTML, XML, and images in addition to text-like regulatory documents. That mismatch increases attack surface and raises the chance that users or downstream agents will process unrelated or sensitive content through this skill, including media and mailbox data not justified by the declared use case.
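
A caller that wants to hold the skill to its stated scope could filter inputs by extension before conversion. This is a minimal sketch under that assumption: the allowlist mirrors the PDF/DOC/DOCX/TXT guidance in the overview, and `is_in_scope` is a hypothetical helper, not part of the skill.

```python
from pathlib import Path

# Extensions taken from the overview's guidance on acceptable inputs.
ALLOWED_SUFFIXES = {".pdf", ".doc", ".docx", ".txt"}


def is_in_scope(path: str) -> bool:
    """True if the file's final extension is on the allowlist."""
    return Path(path).suffix.lower() in ALLOWED_SUFFIXES


for name in ["guideline.PDF", "call.mp3", "inbox.msg"]:
    print(name, is_in_scope(name))
```

An extension check does not inspect file contents, so a renamed file still passes; it narrows accidental misuse rather than defending against a hostile input.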

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: This converter transmits full document bytes to Azure Document Intelligence via a remote API, so document content crosses a privacy boundary that is not apparent from a generic local document-conversion interface. In a regulatory/pharma-analysis context, documents may contain confidential or regulated content, so undisclosed cloud submission materially increases risk even if the code's purpose is functional rather than malicious.

Context-Inappropriate Capability

Low

Confidence: 80%
Finding: The code automatically falls back to ambient Azure credentials from environment/default identity, which can cause unexpected use of a user's cloud identity and accidental access to external services without explicit configuration. While common in Azure SDK usage, it weakens least-surprise and least-privilege expectations for a document-analysis component.

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The converter includes functionality to describe images with a multimodal LLM, which expands the skill's effective capability beyond the stated PDF/DOCX/TXT regulatory-document analysis purpose. In this skill context, that scope drift is security-relevant because it can cause users to unintentionally process data types that the manifest does not declare and send sensitive visual content to downstream model logic.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: This code base64-encodes the full image and submits it to an external chat completion API, creating a direct data-exfiltration path for potentially sensitive document images, screenshots, or embedded personal/regulatory information. In the context of a pharmaceutical regulatory analysis skill, files may contain confidential or regulated material, so undeclared outbound transfer materially increases risk.

Description-Behavior Mismatch

High

Confidence: 90%
Finding: This function implements generic image captioning by sending image content to an external LLM, which does not align with the declared pharmaceutical regulation document-analysis purpose of the skill. That mismatch matters because hidden or unnecessary capabilities increase the risk of undocumented data handling and make users and reviewers less able to assess what content may leave the local environment.

Context-Inappropriate Capability

High

Confidence: 95%
Finding: The code base64-encodes the entire file stream and sends it to an external chat completion API, creating a direct data exfiltration path for file contents. In the context of a pharmaceutical-regulation analysis tool, uploaded files may contain proprietary, regulated, or sensitive material, so undocumented outbound transmission materially increases confidentiality and compliance risk.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The converter conditionally sends embedded slide image data to an external LLM service for caption generation, which is a capability beyond basic PPTX-to-Markdown conversion and can expose document contents to a third party. In a regulatory-analysis skill, slides may contain confidential, proprietary, or regulated visual content, so this undocumented outbound data flow creates a real data-exposure risk.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The image-captioning path introduces external model interaction that is not justified by the stated skill purpose of structured regulatory document analysis and report generation. Unnecessary model-assisted processing expands the attack surface and increases the chance that sensitive slide imagery is transmitted, retained, or logged outside the expected processing boundary.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The skill metadata claims analysis of PDF/DOCX/TXT regulatory documents, but this file adds audio transcription capability and therefore expands the effective processing scope beyond what users would reasonably expect. Scope mismatch is dangerous because it can enable undisclosed collection and handling of sensitive content, especially in regulated environments where users rely on the manifest to understand permitted data flows.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: The call to recognize_google sends audio-derived content to an external Google service, creating an undeclared network data exfiltration path. In a regulatory-analysis skill, uploaded materials may include confidential discussions, personal data, or proprietary information, so transmitting content to a third party without explicit justification or controls presents a serious confidentiality and compliance risk.

Context-Inappropriate Capability

High

Confidence: 76%
Finding: The converter performs network-dependent transcript retrieval for YouTube content via an external library, which introduces outbound connectivity and data flow not aligned with a document-analysis skill expected to handle local PDF/DOCX/TXT inputs. In restricted or privacy-sensitive environments, this can violate deployment assumptions, leak access patterns, and expand the attack surface through unsolicited external requests.

Description-Behavior Mismatch

High

Confidence: 97%
Finding: The template content is materially unrelated to the declared pharmaceutical-regulation analysis purpose and instead contains lifestyle/social-media sleep advice. In an agent skill, this kind of purpose drift is dangerous because it can cause the system to generate misleading outputs, route regulated-source inputs into irrelevant templates, or conceal undeclared behavior that users and reviewers would not expect.

Missing User Warnings

Medium

Confidence: 91%
Finding: The code transmits raw binary_input=file_bytes to Azure Content Understanding but contains no in-file warning, consent check, or disclosure mechanism. For a document-analysis skill that may handle regulated or proprietary content, silent upload can violate user expectations and organizational data-handling requirements even if the remote service is legitimate.

Missing User Warnings

Medium

Confidence: 93%
Finding: The converter sends document content to Azure Document Intelligence without any user-facing warning in this code path, so users may unknowingly upload sensitive files to a third-party cloud service. In a pharma/regulatory skill, that raises confidentiality and compliance concerns because source documents may contain proprietary or controlled information.

VirusTotal

VirusTotal findings are pending for this skill version.


Static analysis

No suspicious patterns detected.