MinerU Document Explorer

Security checks across malware telemetry and agentic risk

Overview

This PDF tool is useful for document search, but it may upload PDFs to a public remote service by default despite claiming network features are opt-in.

Install only if you are comfortable with PDFs and derived images/text being sent to the configured MinerU or other API endpoints. Before using confidential documents, change config.yaml to local mode or remove server_url, avoid entering API keys unless needed, and do not start the server unless you understand the bind address and authentication settings.
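A quick way to act on this advice is to list every non-loopback URL in the configuration before running the tool. The sketch below is illustrative only: it assumes config.yaml has already been loaded into a Python dict, and apart from `server_url` and `base_url` (both mentioned in this report) the key names shown in the usage note are hypothetical.

```python
from urllib.parse import urlparse

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def find_remote_endpoints(config, path=""):
    """Return (key_path, url) pairs for every http(s) URL that is not loopback."""
    found = []
    if isinstance(config, dict):
        for key, value in config.items():
            child = f"{path}.{key}" if path else str(key)
            found.extend(find_remote_endpoints(value, child))
    elif isinstance(config, list):
        for index, value in enumerate(config):
            found.extend(find_remote_endpoints(value, f"{path}[{index}]"))
    elif isinstance(config, str):
        parsed = urlparse(config)
        # Only flag web URLs whose host is something other than this machine.
        if parsed.scheme in ("http", "https") and parsed.hostname not in LOOPBACK_HOSTS:
            found.append((path, config))
    return found
```

For example, calling it on `{"mode": "client", "server_url": "https://example.invalid/api"}` reports the `server_url` entry, which is the kind of setting to remove or point at a local service before processing confidential PDFs.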

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (41)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 91%
Finding: The skill declares only `network: true`, but its instructions also require reading and writing local files and using environment-backed credentials. This capability mismatch is dangerous because it hides the real trust boundary from users and policy systems, making silent file modification and credential-dependent behavior easier to trigger without clear review.
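The mismatch described here can be expressed as a set difference between what a manifest declares and what the instructions actually need. The following is a minimal sketch, not the scanner's method: the capability names other than `network` are hypothetical labels chosen for illustration.

```python
def undeclared_capabilities(declared, observed):
    """Return the capabilities that are used but not declared as enabled.

    declared: mapping of capability name -> bool, as a manifest might list them.
    observed: set of capability names the instructions or code actually rely on.
    """
    granted = {name for name, enabled in declared.items() if enabled}
    return sorted(set(observed) - granted)


# Hypothetical labels for the behaviours named in the finding.
declared = {"network": True}
observed = {"network", "filesystem_read", "filesystem_write", "env"}
missing = undeclared_capabilities(declared, observed)
```

Here `missing` holds the three capabilities that a reviewer or policy system would not see from the manifest alone.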

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: The skill is presented as a PDF-reading helper, but its documented behavior extends to server startup, remote API delegation, document upload, indexing, OCR, embeddings, reranking, and even processing markdown. That mismatch is dangerous because users may invoke it for local document reading while unintentionally causing network exfiltration, persistent indexing, or broader content processing than expected.

Context-Inappropriate Capability

Medium

Confidence: 86%
Finding: Documenting and enabling a local FastAPI server materially increases attack surface beyond simple document parsing. Even if intended for convenience, exposing a local HTTP service can create unauthorized access paths, unsafe bindings, or data leakage if started automatically or configured insecurely.
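One way to reduce the unsafe-binding risk is to refuse to start a server unless the bind host is a loopback address. This is a sketch under the assumption that the host is available as a string before startup; the function name is hypothetical and nothing here reflects the tool's actual startup code.

```python
import ipaddress


def is_loopback_bind(host):
    """Return True only when host refers to this machine's loopback interface."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a hostname): treat as unsafe.
        return False
```

With this check, `127.0.0.1` and `::1` pass, while `0.0.0.0` (all interfaces) is rejected, so the service is not reachable from other hosts unless someone deliberately overrides the check.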

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The instruction to append lessons learned to `references/tips.md` after every PDF task introduces persistent file modification unrelated to the core user request. This is dangerous because it creates unauthorized state changes, can leak sensitive document-derived information into project files, and normalizes hidden writes during routine reading tasks.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The setup documentation expands the skill from local PDF reading into remote OCR, semantic search, element extraction, and optional LLM-driven outlining via external services. That materially changes the trust boundary because document contents may be transmitted off-host, which is more invasive than the stated PDF-reading/exploration purpose and can expose sensitive PDF data to third parties.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The PageIndex feature introduces external LLM access for hierarchical outline generation, which is not necessary for basic PDF reading and may send document-derived content to a third-party API. In a document-exploration skill, this creates avoidable confidentiality and compliance risk, especially for sensitive PDFs.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The skill is described as being required for reading and understanding PDF contents, but the tips explicitly instruct the agent to generate new deliverables such as HTML, PDF, and PPTX files. That expands the skill from passive document analysis into content/file creation, which can cause unauthorized side effects, policy bypass, or misuse in workflows that should remain read-only or extraction-only.

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The CLI supports both remote client operation via `--server` and a `server` subcommand that starts an HTTP service, which expands the skill beyond its declared purpose of reading PDF contents. In an agent-skill context, this increases attack surface by enabling outbound network access to arbitrary endpoints and long-lived service exposure that could be abused for data exfiltration or unintended remote interactions.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The `server` command invokes `run_server`, allowing this PDF-reading tool to launch an HTTP server. For a skill whose stated function is document reading and understanding, the ability to open a listening service is unnecessary and dangerous because it can expose local document-processing capabilities or create an unexpected network foothold if triggered by an agent or attacker-controlled input.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: This module sends user query text and document-derived inputs to a remote embedding endpoint, which materially expands the skill's data exposure beyond a local PDF-reading expectation. In a PDF analysis skill, users may reasonably assume document contents stay local, so undisclosed transmission to an external service creates a real privacy and trust risk, especially for sensitive PDFs.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The helper functions instantiate a MinerU cloud client and send source documents/images to an external service for OCR/extraction, but the provided skill description only frames the capability as local PDF reading/understanding. Undisclosed third-party transmission of document contents can expose sensitive data and violate user expectations, especially for confidential PDFs.

Context-Inappropriate Capability

Medium

Confidence: 82%
Finding: The adapter configures an external API endpoint and sends PDF-derived content into a model-backed indexing pipeline, which creates a data egress path beyond simple local PDF reading. For a skill described as reading and understanding PDFs, this is security-relevant because sensitive document contents may be transmitted to third-party services depending on configuration, and the broad `base_url` setting can redirect that data to arbitrary endpoints.
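A common mitigation for a freely configurable `base_url` is an explicit host allowlist checked before any document data is sent. The sketch below assumes the operator maintains such a list; the function name and the example hosts are hypothetical.

```python
from urllib.parse import urlparse


def endpoint_allowed(base_url, allowed_hosts):
    """Return True if base_url uses HTTPS and its host is on the allowlist."""
    parsed = urlparse(base_url)
    if parsed.scheme != "https":
        return False
    # urlparse lowercases the hostname, so compare against lowercase entries.
    return parsed.hostname in allowed_hosts


# Hypothetical allowlist maintained by the operator.
ALLOWED = frozenset({"api.example.invalid"})
```

A caller would check `endpoint_allowed(base_url, ALLOWED)` and skip the request when it returns False, so a changed configuration value cannot silently redirect PDF-derived content to another endpoint.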

Description-Behavior Mismatch

Medium

Confidence: 96%
Finding: This utility instantiates an OpenAI-compatible client and sends prompts/messages to an external service, which expands a PDF exploration skill into remote content processing. In this skill context, PDF contents may contain sensitive user data, and the manifest does not clearly disclose that document-derived content can leave the local environment, creating a real data exposure and scope-creep risk.

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The code provides generic external chat-completions capability not strictly required for local PDF parsing or page extraction. In a document explorer, this broad remote AI interface increases attack surface and makes it easier for document text, prompts, or surrounding context to be sent off-box without strong justification or boundaries.

Description-Behavior Mismatch

Medium

Confidence: 98%
Finding: The summary and description helpers directly embed node text and document structure into prompts sent to a remote LLM service. Because this is derived from user PDFs and the skill description frames the tool as a document reader/explorer, the undisclosed export of content is a concrete confidentiality issue.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The code sends PDF-derived content to a remote reranker service by POSTing the full input payload, which includes user query text and document image data. In a skill whose stated purpose is reading and understanding local PDFs, this expands data handling to external transmission without any visible consent, trust boundary control, or restriction on destination, creating a real confidentiality risk.

Context-Inappropriate Capability

Medium

Confidence: 98%
Finding: The adapter converts local image files into data URLs and transmits them to a configurable remote API endpoint, effectively creating an exfiltration path for local document contents. Because the endpoint comes from configuration and there is no validation, user disclosure, or policy gate, this is especially risky in a PDF-reading skill that may process sensitive documents.

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The client auto-creates a local TreeBuilder from configuration, and the comment explicitly contemplates using a GPT-backed PageIndex backend. That can cause PDF content to be sent to an external LLM/service outside the core PDF-reading server path, which expands the data-sharing surface beyond the skill's stated purpose and may surprise users or operators.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The example configuration explicitly defaults to client mode, which delegates PDF handling to a remote server. In a skill whose stated purpose is simply reading and understanding PDFs, this creates a real data-exposure risk because users may process sensitive documents off-device without realizing it from the skill framing alone.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: Preconfiguring a public remote server means the tool is ready to send document data off-host with minimal user action. That is dangerous for a document-exploration skill because PDFs commonly contain confidential business, legal, financial, or personal information, and the default behavior broadens the trust boundary beyond what many users would expect.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The skill manifest frames this as PDF reading/understanding, but the code also fetches arbitrary remote URLs and stores the content locally. That expands the trust boundary to external networks and untrusted files, creating SSRF-style access to attacker-controlled endpoints, unexpected data egress, and local persistence of downloaded content; in an agent context, this is more dangerous because a user may only expect local document analysis, not outbound network access.

Vague Triggers

Medium

Confidence: 84%
Finding: Making the skill 'REQUIRED for any task involving reading or understanding PDF contents' is an overly broad trigger that can route nearly all PDF interactions into a tool with network access, file writes, and optional remote services. Broad mandatory invocation increases the chance that users engage risky side effects without realizing a simpler, lower-privilege path was available.

Missing User Warnings

Medium

Confidence: 96%
Finding: The setup instructions direct the agent to write user-provided configuration into `SCRIPTS/doc-search/config.yaml` without a nearby warning or explicit confirmation for the file modification. This is dangerous because it can persist secrets and alter local state unexpectedly during what appears to be a normal document-reading workflow.
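The missing confirmation step could look something like the sketch below, which writes a configuration file only after an explicit yes from the user and restricts the file to its owner. The function name and the `confirm` callback are hypothetical; how the skill actually writes `config.yaml` is not shown in this report.

```python
import os


def write_config_with_confirmation(path, text, confirm):
    """Write text to path only if confirm(message) returns True.

    confirm is any callable that shows the message to the user and
    returns True or False, for example a wrapper around input().
    """
    message = f"About to write {len(text)} characters to {path}. Proceed?"
    if not confirm(message):
        return False
    # Mode 0o600 limits the new file to the current user on POSIX systems.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    return True
```

Passing the confirmation as a callable keeps the decision with the user while leaving the prompt mechanism (terminal, chat message, UI dialog) up to the caller.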

Missing User Warnings

Low

Confidence: 94%
Finding: Mandating automatic modification of `references/tips.md` after every task causes hidden writes without user awareness. While lower impact than credential storage, it still creates unauthorized persistence, may contaminate the repository, and could accidentally retain sensitive operational details from prior documents.

Missing User Warnings

Medium

Confidence: 95%
Finding: The note states that extraction 'requires network (server-side LLM)', which implies document content may leave the local environment, but it does not explicitly warn users that PDF content, images, or extracted evidence may be transmitted to remote infrastructure for processing. In a document-exploration skill handling potentially sensitive PDFs, this can cause unintentional data exposure or policy violations because operators may assume processing is local unless clearly told otherwise.

VirusTotal

No VirusTotal findings
