PDF Learning Workflow

Security checks across malware telemetry and agentic risk

Overview

This appears to be a legitimate PDF-to-study-notes workflow, but it uses external OCR and CDN services that users should understand before processing sensitive documents.

Install only if you are comfortable sending scanned PDF page images to ZhipuAI/GLM-OCR and opening generated HTML that loads KaTeX from jsDelivr. Avoid confidential, regulated, or copyright-sensitive documents unless that external processing is acceptable, and protect the API key file with normal local file permissions.
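The advice to protect the API key file can be sketched in Python. This is a minimal sketch assuming a POSIX file system; the helper names `restrict_key_file` and `is_private` are hypothetical and do not come from the skill itself.

```python
import os
import stat


def restrict_key_file(path: str) -> int:
    """Limit a key file to owner read/write (0o600) and return the resulting mode bits.

    Assumes a POSIX file system; on Windows os.chmod only toggles the read-only flag.
    """
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(os.stat(path).st_mode)


def is_private(path: str) -> bool:
    """Return True when neither group nor others have any access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

Running `is_private` before each OCR call would let a wrapper script refuse to proceed while the key file is readable by other local users.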

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Findings (4)

Description-Behavior Mismatch

Medium
Confidence: 91%
Finding
The generated HTML unconditionally loads KaTeX JavaScript and CSS from a third-party CDN. Opening the exported file causes network access and execution of remotely hosted script in the browser, which creates supply-chain and privacy risks that exceed a local-only PDF/OCR/note workflow; additionally, any unescaped Markdown content would execute in the same DOM context as those scripts.
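One common way to reduce the CDN risk described in this finding is to pin the script with a Subresource Integrity (SRI) hash and to escape note content before it is written into the page. The following is a minimal sketch, not the skill's actual export code: the function names are hypothetical, and it assumes you hold a trusted local copy of the exact KaTeX file whose bytes you want the browser to accept.

```python
import base64
import hashlib
import html


def sri_hash(content: bytes) -> str:
    """Return an SRI value (sha384, base64-encoded) for the given file bytes."""
    digest = hashlib.sha384(content).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def script_tag(url: str, content: bytes) -> str:
    """Build a <script> tag that the browser rejects if the fetched bytes differ."""
    return (
        f'<script src="{html.escape(url, quote=True)}" '
        f'integrity="{sri_hash(content)}" crossorigin="anonymous"></script>'
    )


def safe_text(fragment: str) -> str:
    """Escape HTML metacharacters so note text cannot inject markup or script."""
    return html.escape(fragment, quote=True)
```

Bundling KaTeX alongside the exported HTML would avoid the network request entirely; SRI only ensures that whatever is fetched matches the pinned bytes.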

Description-Behavior Mismatch

Medium
Confidence: 93%
Finding
The script sends local page images to a third-party OCR API, which contradicts the apparent expectation of a local PDF-to-notes workflow. This is dangerous because scanned PDFs often contain sensitive personal, academic, or proprietary material, and users are not clearly warned that their document contents leave the machine.

Missing User Warnings

Medium
Confidence: 91%
Finding
The README instructs users to use a third-party GLM-OCR API for processing PDF content but does not warn that document pages and potentially sensitive book or personal data will be transmitted off-host to an external service. This creates a privacy and data-handling risk because users may unknowingly upload confidential or copyrighted material to a remote provider.

Missing User Warnings

Medium
Confidence: 95%
Finding
The code transmits OCR images to an external API without any user-facing notice, consent gate, or data-handling explanation. This creates a privacy and compliance risk because users may reasonably believe document processing is local while the service receives raw page images.
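A consent gate of the kind this finding calls for could look like the sketch below. It is illustrative only: `confirm_upload` is a hypothetical helper, the prompt wording is an assumption, and the skill's real upload function is not shown in this report.

```python
from typing import Callable


def confirm_upload(
    page_count: int,
    provider: str,
    ask: Callable[[str], str] = input,
) -> bool:
    """Ask the user to approve sending page images off-host; default answer is no."""
    prompt = (
        f"{page_count} page image(s) will be sent to {provider} for OCR. "
        "Continue? [y/N] "
    )
    # Only an explicit "y" or "yes" counts as consent; anything else declines.
    return ask(prompt).strip().lower() in {"y", "yes"}
```

A caller would invoke `confirm_upload(len(pages), "ZhipuAI/GLM-OCR")` before the first request and abort if it returns `False`. The `ask` parameter exists so the gate can be exercised in tests without a terminal.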

VirusTotal

All 56 of 56 vendors reported this skill as clean.
