pdf-miner

Security checks across malware telemetry and agentic risk

Overview

This PDF extraction skill needs review because it can automatically send PDF page images to an external OCR API even though its summary says scanned-PDF OCR is out of scope.

Install only if you are comfortable with remote OCR processing. For confidential PDFs, avoid configuring OCR credentials or run with --no-auto-ocr. If you do use OCR, prefer environment variables over storing keys in config.json, and verify the configured OCR endpoint and model before processing sensitive documents.
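The endpoint check recommended above could be scripted as a pre-flight step. The following is a minimal sketch, not the skill's own code: the variable names `OCR_API_KEY` and `OCR_BASE_URL` and the allowlist are hypothetical, since this report does not state which names the skill actually reads.

```python
from urllib.parse import urlparse

# Hypothetical variable names; the report does not say which ones the skill reads.
KEY_VAR = "OCR_API_KEY"
URL_VAR = "OCR_BASE_URL"


def ocr_preflight(env, allowed_hosts):
    """Return (remote_ocr_possible, endpoint_ok) for an environment mapping.

    remote_ocr_possible: a non-empty key is present, so remote OCR could run.
    endpoint_ok: the configured base URL points at a host on the allowlist.
    """
    has_key = bool(env.get(KEY_VAR))
    host = urlparse(env.get(URL_VAR, "")).hostname  # None when unset or malformed
    endpoint_ok = host is not None and host in allowed_hosts
    return has_key, endpoint_ok
```

Passing `os.environ` as `env` checks the live shell. For confidential PDFs the safest result is `(False, False)`, meaning no credentials are configured at all.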

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (15)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 89%
Finding: The skill documentation instructs use of environment variables plus reading and writing local files, but the manifest does not declare those capabilities. Hidden capability gaps matter because users and policy layers cannot accurately assess what the skill needs to access, especially when it handles PDFs and optional credentials. In this context the issue is transparency and control failure rather than direct code execution, but it increases risk when combined with OCR and external API usage.
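A policy layer could surface this kind of gap by comparing what a manifest declares with what the skill is observed to use. The sketch below is illustrative only: the capability labels (`env:read`, `fs:read`, `fs:write`) are hypothetical and do not come from the skill's manifest or any MCP schema.

```python
def undeclared_capabilities(declared, observed):
    """Return capabilities the skill uses but its manifest omits, sorted."""
    return sorted(set(observed) - set(declared))


# Hypothetical labels for the three capabilities named in this finding.
observed = ["env:read", "fs:read", "fs:write"]
declared = []  # the finding says the manifest declares none of them

gap = undeclared_capabilities(declared, observed)
```

A non-empty `gap` is the signal a reviewer would act on, either by rejecting the skill or by requiring the manifest to be updated.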

Tp4

High

Category: MCP Tool Poisoning
Confidence: 97%
Finding: The manifest says the skill is not for OCR on scanned PDFs, yet the body documents built-in OCR, automatic OCR triggering, and transmission of rendered page images to an external vision API. This mismatch can cause users or orchestrators to invoke the skill under false assumptions, leading to unanticipated network egress and exposure of sensitive document contents. Because PDFs often contain confidential business or financial data, the discrepancy materially increases harm.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The skill explicitly states it is not for OCR on scanned/image PDFs, but later adds full OCR support and even automatic OCR behavior. That inconsistency is dangerous because operators may rely on the manifest for data handling boundaries, while the skill can actually send image renderings of document pages to third-party services. In a document-processing skill, this hidden expansion of scope is a significant trust and privacy risk.

Context-Inappropriate Capability

Medium

Confidence: 87%
Finding: The skill introduces storage and handling of external vision API credentials in environment variables, config files, and command-line arguments, despite presenting itself primarily as a local PDF extraction tool. This broadens the attack surface by encouraging secret placement in files and shell history and by adding network dependency without strong justification in the manifest. The risk is amplified because users may not expect remote processing from a PDF reader skill.

Intent-Code Divergence

High

Confidence: 96%
Finding: The documentation is self-contradictory: it first excludes OCR, then later describes OCR as automatic/default behavior for low-text pages. Automatic remote OCR is especially sensitive because it can activate without the user deliberately choosing it, causing confidential page images to leave the local environment unexpectedly. The contradiction undermines informed consent and safe deployment decisions.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The skill metadata and docstring state OCR/scanned PDFs are out of scope, yet the implementation includes OCR and enables auto-OCR by default. That causes low-text PDF pages to be rendered and potentially transmitted to an external API, expanding the skill's capabilities beyond what users would reasonably expect and creating an unexpected data exfiltration path for document contents.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The OCR function converts PDF pages to images and sends them to a remote vision model endpoint using the OpenAI-compatible client. This means potentially sensitive PDF contents leave the local environment and are disclosed to a third party, which is especially risky for financial, research, or internal documents the skill is designed to process.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The script loads OCR API keys, base URLs, and models from environment variables and a local config file to support remote OCR connectivity, even though the skill is presented primarily as PDF extraction and states that OCR is out of scope. While reading configuration is common, in this context it silently prepares a networked data-transfer capability that users may not expect.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The code advertises OCR as optional, but runtime logic enables automatic OCR by default through the hidden auto_ocr flag. This discrepancy undermines informed consent and can trigger remote processing of document images without a clear, intentional user action.
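One way to close this gap is to make automatic OCR opt-in rather than opt-out. The sketch below assumes an argparse-based CLI, which this report does not confirm. The program name `pdf-extract` and the `--auto-ocr` flag are hypothetical; only `--no-auto-ocr` is mentioned in this report. It needs Python 3.9 or later for `argparse.BooleanOptionalAction`.

```python
import argparse


def build_parser():
    """Build a CLI parser in which automatic OCR is off unless requested."""
    parser = argparse.ArgumentParser(prog="pdf-extract")  # hypothetical name
    parser.add_argument("pdf", help="path to the PDF to extract")
    # BooleanOptionalAction generates both --auto-ocr and --no-auto-ocr.
    # default=False means remote OCR never starts without an explicit flag.
    parser.add_argument(
        "--auto-ocr",
        action=argparse.BooleanOptionalAction,
        default=False,
        help="send low-text pages to the configured remote OCR endpoint",
    )
    return parser
```

With this default, `pdf-extract report.pdf` stays local, while `pdf-extract report.pdf --auto-ocr` is a deliberate user action. The existing `--no-auto-ocr` flag keeps working for scripts that already pass it.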

Description-Behavior Mismatch

High

Confidence: 96%
Finding: The code explicitly implements OCR for scanned/image-based PDFs even though the skill metadata says the skill is NOT for OCR on scanned image-based PDFs. This scope mismatch is security-relevant because it enables processing and transmission of document image content in a way users and reviewers would not expect, increasing the chance of covert data exfiltration or unauthorized handling of sensitive files.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: The function sends rendered PDF page images to a remote OpenAI-compatible endpoint, which can disclose the full visual contents of potentially sensitive PDFs to a third party. This is especially dangerous because the advertised purpose is local PDF extraction, so users may not expect external transmission of document data.

Intent-Code Divergence

Medium

Confidence: 90%
Finding: The module docstring advertises OCR of scanned/image-based PDFs despite the manifest saying such use is out of scope. This inconsistency can mislead integrators and reviewers, weakening trust boundaries and making it easier for sensitive document-processing behavior to be introduced without proper review.

Missing User Warnings

Medium

Confidence: 95%
Finding: The OCR flow sends rendered PDF page images to an external vision API, but the documentation does not prominently warn that document contents may be transmitted off-host to a third party. This is a meaningful privacy and compliance issue because PDFs commonly contain proprietary, personal, or regulated information, and automatic OCR increases the chance of accidental disclosure. The absence of explicit warning makes misuse more likely.

Missing User Warnings

Medium

Confidence: 95%
Finding: In the OCR integration path, low-text pages are automatically selected and sent for OCR if credentials are available, but the user-facing output only notes that OCR is happening rather than clearly warning that page images are being transmitted off-box. For document-processing skills, this is a significant privacy and compliance issue because users may assume extraction is local.

Missing User Warnings

Medium

Confidence: 95%
Finding: Page images are uploaded to a remote vision API without any visible user-facing warning, consent check, or disclosure in this script. That creates a privacy and compliance risk because sensitive PDF contents may leave the local environment unexpectedly.
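A warning and consent gate of the kind these findings call for might look like the sketch below. It is an assumption about how a fix could be shaped, not the skill's code: the function name `confirm_remote_ocr` and its parameters are hypothetical.

```python
import sys


def confirm_remote_ocr(page_numbers, endpoint, ask=input, out=sys.stderr):
    """Warn that page images will leave the host and ask for explicit consent.

    Returns True only when the user answers "y" or "yes"; any other answer,
    including an empty one, is treated as a refusal.
    """
    pages = ", ".join(str(n) for n in page_numbers)
    print(
        f"WARNING: images of {len(page_numbers)} page(s) ({pages}) "
        f"will be uploaded to {endpoint}.",
        file=out,
    )
    answer = ask("Send these page images off-host? [y/N] ")
    return answer.strip().lower() in {"y", "yes"}
```

The caller would upload pages only when the function returns True. Defaulting to refusal means a non-interactive run that feeds an empty answer stays local. The `ask` parameter exists so the prompt can be replaced in tests or by an orchestrator's own approval mechanism.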

VirusTotal

66/66 vendors flagged this skill as clean.
