pdf-ocr-layout

Security checks across malware telemetry and agentic risk

Overview

This skill does what it advertises: it sends user-selected documents to Zhipu GLM services for OCR and analysis, then saves extracted outputs locally.

Install only if you are comfortable sending the selected PDFs/images and extracted context to Zhipu cloud APIs. Do not process confidential, regulated, customer, or internal documents unless that sharing is approved. Use a dedicated API key, and choose a secure output directory, because extracted text and cropped images remain on disk.
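One way to act on the output-directory advice is to create the directory with owner-only permissions before running the pipeline. The following is a minimal sketch, not part of the skill: `make_private_output_dir` and `is_private` are hypothetical helper names, and the permission bits only have this meaning on POSIX systems (on Windows, `os.chmod` only toggles the read-only flag).

```python
import os
import stat
from pathlib import Path


def make_private_output_dir(path: str) -> Path:
    """Create `path` (hypothetical helper) and restrict it to the current user."""
    out = Path(path)
    out.mkdir(parents=True, exist_ok=True)
    # mkdir's mode argument is filtered by the umask, so set the bits explicitly.
    os.chmod(out, 0o700)
    return out


def is_private(path: Path) -> bool:
    """Return True when group and others have no permission bits on `path`."""
    mode = stat.S_IMODE(path.stat().st_mode)
    return mode & 0o077 == 0
```

Passing the returned path as the skill's output directory keeps cropped images and JSON reports unreadable to other local accounts, although backups and processes running as the same user can still see them.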

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (9)

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The script initializes a remote Zhipu AI client using an API key from the environment, establishing the capability to export local document contents to a third-party service. In this file, that capability is actually exercised later on table, context, and image data, so the concern is not merely theoretical: sensitive local data may be transmitted off-host without access control, minimization, or consent flow.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: This code sends up to 6000 characters of full document context plus table contents to a remote model API. If the processed documents contain confidential research, internal reports, regulated data, or customer information, the script can exfiltrate that data to an external provider without any validation, redaction, or user warning.
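The redaction step this finding says is missing could look roughly like the sketch below. It is an illustration only, not the skill's code: `minimize_context`, the two regular expressions, and the placeholder tokens are assumptions, and pattern-based redaction catches only obvious identifiers. The 6000-character cap is the figure reported in the finding.

```python
import re

MAX_CONTEXT_CHARS = 6000  # the cap reported in the finding above

# Hypothetical patterns: obvious email addresses and long digit runs only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
LONG_DIGITS_RE = re.compile(r"\b\d{9,}\b")


def minimize_context(text: str, limit: int = MAX_CONTEXT_CHARS) -> str:
    """Mask obvious identifiers, then truncate to `limit` characters."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = LONG_DIGITS_RE.sub("[NUMBER]", text)
    return text[:limit]
```

Redacting before truncating avoids leaving half of an identifier at the cut point. A filter like this reduces exposure but does not make confidential documents safe to upload.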

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The image-analysis path reads local image files, base64-encodes them, and uploads them together with surrounding document context to an external multimodal service. This is a direct data export path for potentially sensitive figures, diagrams, screenshots, or embedded secrets, and the inclusion of nearby text context broadens the scope of disclosure.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill explicitly describes sending extracted table text, full document context, cropped images, and base64-encoded image content to external Zhipu GLM APIs, but it does not warn users that potentially sensitive document contents leave the local environment. This creates a real privacy and data-governance risk because users may process confidential PDFs or images without informed consent about third-party transmission.

Missing User Warnings

Low

Confidence: 91%
Finding: The skill states that cropped image files and JSON reports are generated in the output directory, but it does not clearly warn users that running the pipeline writes potentially sensitive derived artifacts to disk. This can expose extracted content to other local users, backups, or downstream processes if the output path is shared or insufficiently protected.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill documentation describes OCR, table extraction, image cropping, and multimodal analysis using external GLM services, but it does not clearly warn users that document text and cropped images are transmitted to third-party APIs. This creates a real data-handling and privacy risk because users may supply sensitive PDFs or images without informed consent, especially given the tool’s purpose of processing potentially confidential business documents.

Missing User Warnings

Medium

Confidence: 90%
Finding: The script sends the full image file, base64-encoded, to a third-party OCR API (`client.layout_parsing.create`) without any user-facing notice, consent prompt, or safeguard about remote transmission. This creates a real privacy and data-handling risk because users may process sensitive documents under the assumption that work is local, while the skill actually uploads content externally.

Missing User Warnings

Medium

Confidence: 97%
Finding: The code transmits image data and document context to a third-party AI service without any user-facing disclosure in the script flow. Even if remote analysis is intended, the lack of warning or consent increases the chance that users unknowingly expose sensitive content outside their local environment.

Missing User Warnings

Medium

Confidence: 96%
Finding: The table-analysis path sends document context and extracted table content to an external provider without informing the user. In document-processing tools, silent transmission of locally sourced content is a meaningful privacy and compliance risk, especially when users may assume processing is local.
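The missing-warning findings above point to the same gap: nothing tells the user what is about to leave the machine. A consent gate in front of each remote call might be sketched as follows. This is an assumption about how such a check could be written, not code from the skill; `confirm_remote_upload` and its parameters are hypothetical, and `ask` is injectable so the prompt can be tested or replaced in non-interactive runs.

```python
from typing import Callable, Iterable


def confirm_remote_upload(
    provider: str,
    items: Iterable[str],
    ask: Callable[[str], str] = input,
) -> bool:
    """Describe what will be uploaded and return True only on an explicit yes."""
    listing = "\n".join(f"  - {item}" for item in items)
    prompt = (
        f"The following will be sent to {provider}:\n{listing}\n"
        "Continue? [y/N] "
    )
    # Anything other than y/yes, including an empty answer, is a refusal.
    return ask(prompt).strip().lower() in {"y", "yes"}
```

A caller would skip the table, context, or image upload whenever this returns False. Defaulting to refusal means an unattended run does not transmit data by accident.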

VirusTotal

66/66 vendors flagged this skill as clean.
