Security audit

完美排版ocr (Perfect Layout OCR)

Security checks across malware telemetry and agentic risk

Overview

This is a coherent PDF OCR helper, but it sends selected PDFs to an external PaddleOCR API and keeps intermediate OCR files locally.

Install only if you are comfortable sending the selected PDF to PaddleOCR and storing intermediate OCR text/PDF files in a local work directory. Use a virtual environment, provide a scoped OCR token, avoid highly sensitive documents unless the provider terms fit your needs, and delete the work directory after processing confidential files.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Findings (4)

Context-Inappropriate Capability

Severity: Medium
Confidence: 89%
Finding
The pipeline dereferences inputImage URLs returned in OCR result data and downloads them server-side to determine dimensions. Because those URLs come from remote service output, this expands the trust boundary and can be abused for unintended outbound requests, including SSRF-style access to internal resources if the upstream service or result data is compromised.
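One way to narrow this trust boundary is to validate each returned URL before fetching it. The following is a minimal sketch, not the skill's actual code: the function names `resolve_host` and `is_safe_fetch_url` are hypothetical, and it assumes that only HTTPS URLs resolving to globally routable addresses should be fetched.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def resolve_host(host: str) -> list[str]:
    """Resolve a hostname to its IP addresses using the system resolver."""
    infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)
    return [info[4][0] for info in infos]


def is_safe_fetch_url(url: str, resolver=resolve_host) -> bool:
    """Return True only for HTTPS URLs whose host resolves to public addresses.

    Hypothetical helper: rejects loopback, private, link-local, and other
    non-global ranges that an SSRF-style request would typically target.
    """
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        return False
    try:
        addresses = resolver(parts.hostname)
        if not addresses:
            return False
        # Every resolved address must be globally routable.
        return all(ipaddress.ip_address(a).is_global for a in addresses)
    except (OSError, ValueError):
        # Resolution failed or an address could not be parsed: refuse.
        return False
```

A check like this is only a first filter. The host could resolve differently between validation and the actual request, and HTTP redirects can lead elsewhere, so a real implementation would likely also pin the connection to the validated address and disable or re-validate redirects.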

Missing User Warnings

Severity: Medium
Confidence: 95%
Finding
The skill documentation omits an important warning that OCR is performed via external API jobs and that resumable state and OCR results are stored in a work directory. For potentially sensitive scanned PDFs, this omission can lead to unintentional disclosure of document contents to external services and leave recoverable artifacts on disk after processing.
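To limit artifacts left on disk, intermediate files can live in a directory that is removed once processing finishes. This is a minimal sketch under assumptions: `ephemeral_work_dir` is a hypothetical helper, not part of the skill, and the `keep` flag stands in for whatever option would preserve resumable state.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def ephemeral_work_dir(keep: bool = False):
    """Yield a fresh work directory and delete it on exit unless keep=True."""
    path = Path(tempfile.mkdtemp(prefix="ocr-work-"))
    try:
        yield path
    finally:
        if not keep:
            # Removes the directory tree; this is not a secure wipe.
            shutil.rmtree(path, ignore_errors=True)


# Usage: intermediate OCR output exists only inside the with-block.
with ephemeral_work_dir() as work:
    (work / "page-001.txt").write_text("intermediate OCR text")
```

Note that `shutil.rmtree` only unlinks the files. On many filesystems the contents may still be recoverable, so for confidential documents an encrypted volume or a RAM-backed temporary directory may be a better fit.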

Missing User Warnings

Severity: Medium
Confidence: 94%
Finding
The script uploads user-provided PDF contents to a remote OCR API without any in-band warning, consent check, or obvious disclosure at execution time. In many environments this is a significant privacy and data-governance risk because scanned PDFs often contain sensitive personal, financial, or proprietary information.
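An in-band consent check of the kind this finding describes could look like the sketch below. The function name `confirm_upload`, its parameters, and the prompt wording are all hypothetical; the skill's real entry point and endpoint are not shown in this report.

```python
from pathlib import Path


def confirm_upload(pdf_path, endpoint, ask=input, assume_yes=False) -> bool:
    """Ask the user before a PDF leaves the machine; default answer is no.

    `ask` is injectable so the prompt can be tested or replaced;
    `assume_yes` models a hypothetical opt-in flag for unattended runs.
    """
    if assume_yes:
        return True
    name = Path(pdf_path).name
    answer = ask(
        f"Upload '{name}' to {endpoint} for OCR? "
        "Its contents will be sent to an external service. [y/N] "
    )
    return answer.strip().lower() in {"y", "yes"}
```

Defaulting to "no" on an empty answer means that an accidental Enter keypress does not trigger an upload.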

Missing User Warnings

Severity: Low
Confidence: 86%
Finding
The pipeline performs additional network retrievals for remote images referenced by OCR results, which is not obvious from the skill description. Even if intended for layout preservation, undisclosed secondary fetches increase the attack surface and may violate user expectations about what external communications occur.

VirusTotal

All 67 vendors reported this skill as clean.

Static analysis

No suspicious patterns detected.