PDF Processor

Security checks across malware telemetry and agentic risk

Overview

This PDF skill mostly does what it says, but it should be reviewed because it can start a local Ollama service in the background and mutate local PDF files without an explicit confirmation step.

Install only if you are comfortable with a local Python workflow that reads PDF contents, writes translations and summaries, moves the original PDF into a completed folder, deletes temporary extraction files, stores progress in plaintext while running, and may start Ollama in the background. Use a dedicated folder, back up important originals first, and start or manage Ollama yourself before processing sensitive documents.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration

Findings (10)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: print("\n🚀 正在启动Ollama服务...") try: subprocess.Popen(['ollama', 'serve'], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL) time.sleep(5)
Confidence: 90%
Finding: subprocess.Popen(['ollama', 'serve'], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
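
One way to narrow this finding is to make the server start opt-in rather than automatic. The following is a minimal sketch, not the skill's actual code: `port_is_open`, `ensure_ollama`, and the `allow_start` flag are hypothetical names, and port 11434 is assumed to be the Ollama default. It probes the port first and launches `ollama serve` only when the caller explicitly permits it.

```python
import socket
import subprocess


def port_is_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def ensure_ollama(allow_start: bool = False,
                  host: str = "127.0.0.1",
                  port: int = 11434):
    """Hypothetical helper: start `ollama serve` only with explicit permission.

    Returns None if a server is already listening, otherwise the Popen
    handle of the newly started process.
    """
    if port_is_open(host, port):
        return None  # already running; nothing to launch
    if not allow_start:
        raise RuntimeError(
            "Ollama is not running; start it yourself or pass allow_start=True"
        )
    return subprocess.Popen(
        ["ollama", "serve"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```

With a default of `allow_start=False`, the background process only appears when the operator has asked for it, which matches the recommendation to start or manage Ollama yourself.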

Lp3

Medium

Category: MCP Least Privilege
Confidence: 93%
Finding: The skill clearly describes capabilities to read and write local files, invoke shell commands, and communicate with a local network service, yet no permissions are declared. This creates a trust and containment problem: users or the hosting platform cannot accurately understand or constrain what the skill can do, and the undocumented capabilities include file modification and command execution.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 90%
Finding: The documented purpose is PDF processing, but the skill also describes additional behaviors such as indexing processed papers, moving original files, managing checkpoint files, and potentially starting a local Ollama service. Even if these behaviors are not overtly malicious, the mismatch reduces informed consent and can hide side effects that matter for privacy, data integrity, and local system changes.

Description-Behavior Mismatch

Medium

Confidence: 96%
Finding: The skill description says it extracts, detects language, translates, and summarizes PDFs, but the code also moves the original PDF and deletes the intermediate extracted-text file. This mismatch is security-relevant because users may not expect file mutation or deletion from a processing skill, increasing risk of accidental data loss or workflow disruption.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: Automatically starting a local model server and creating a child process exceeds the advertised scope of a simple PDF-processing skill. Even though it targets a local dependency, this introduces execution behavior that operators may not authorize and can bypass expectations about what the skill is allowed to do.

Vague Triggers

Medium

Confidence: 82%
Finding: The trigger phrases are broad enough that the skill may activate on vague requests like '处理PDF' ("process PDF") or when files appear in a directory, without a clear boundary on scope or confirmation. Overbroad activation can cause unintended processing of sensitive documents or unexpected file operations, especially because this skill also modifies, moves, and deletes files.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill documents automatic movement of PDFs, deletion of temporary extracted text, and cleanup of progress files, but does not present these as risky file-modifying operations requiring explicit warning or consent. In a local document-processing context, silent file changes can lead to accidental data loss, broken workflows, and mishandling of sensitive research documents.

Missing User Warnings

Medium

Confidence: 97%
Finding: The script moves the source PDF and deletes an intermediate file without explicit confirmation, backup prompt, or dry-run mode. In a file-processing skill, silent modification of user data is dangerous because mistakes, path confusion, or interrupted workflows can cause data loss or make files hard to locate.
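
The dry-run mode this finding asks for could look roughly like the sketch below. It is an illustration under stated assumptions, not the skill's code: `finish_pdf`, its parameters, and the `completed` folder layout are hypothetical. By default it only reports the planned changes, and it refuses to overwrite an existing file in the destination.

```python
import shutil
from pathlib import Path


def finish_pdf(pdf: Path, temp_text: Path, done_dir: Path,
               dry_run: bool = True) -> list[str]:
    """Hypothetical helper: list planned file changes; apply them only
    when dry_run is False."""
    target = done_dir / pdf.name
    plan = [f"move {pdf} -> {target}", f"delete {temp_text}"]
    if dry_run:
        return plan  # nothing on disk is touched
    if target.exists():
        raise FileExistsError(f"refusing to overwrite {target}")
    done_dir.mkdir(parents=True, exist_ok=True)
    shutil.move(str(pdf), str(target))
    temp_text.unlink(missing_ok=True)
    return plan
```

A caller can print the returned plan, ask for confirmation, and then call the function again with `dry_run=False`.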

Missing User Warnings

Medium

Confidence: 88%
Finding: The skill sends extracted document content to a local HTTP service without clearly warning the user that full text is being transmitted to another process. Even though the endpoint is localhost, this is still inter-process data exposure and may leak sensitive academic, proprietary, or personal document contents to logs, plugins, or other local users depending on system configuration.
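
The finding assumes the endpoint really is local. A guard can enforce that assumption before any document text is sent. The sketch below uses only the Python standard library; `is_loopback_url` is a hypothetical helper name, and treating the literal hostname `localhost` as loopback is an assumption that a modified hosts file could violate.

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_url(url: str) -> bool:
    """Return True if the URL's host is `localhost` or a loopback IP."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True  # assumption: the name resolves to loopback
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # some other hostname; not verified as local
```

This check does not address the inter-process exposure itself (logs, plugins, other local users); it only prevents text from being sent to a non-local host by misconfiguration.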

Ssd 3

Medium

Confidence: 95%
Finding: The resume/progress feature stores translated text and processing state in plaintext JSON under the working directory, and the script also writes extracted document text to disk elsewhere. For sensitive PDFs, this creates persistent local copies that may outlive the main workflow and be accessible to other users, backups, or indexing tools.
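
A partial mitigation is to create the progress file with owner-only permissions and delete it once processing finishes. The sketch below is a hedged illustration: `write_private_json` and `clear_checkpoint` are hypothetical names, the permission bits take effect on POSIX systems only, and the content is still plaintext on disk.

```python
import json
import os
from pathlib import Path


def write_private_json(path: Path, data: dict) -> None:
    """Write JSON readable and writable by the owning user only (POSIX)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(data, fh, ensure_ascii=False)
    os.chmod(path, 0o600)  # also tighten a file that already existed


def clear_checkpoint(path: Path) -> None:
    """Remove the progress file so no copy outlives the workflow."""
    path.unlink(missing_ok=True)
```

Restrictive permissions limit exposure to other local users but not to backups or indexing tools running as the same user, so removal after completion remains the more important step.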

VirusTotal

63/63 vendors flagged this skill as clean.
