Extract PDF Text
Pass. Audited by ClawScan on May 1, 2026.
Overview
This is a straightforward guide to local PDF text extraction. The only caution is the usual one: it asks users to install PDF and OCR packages.
This skill appears safe and purpose-aligned for local PDF extraction. Before installing, use a trusted Python or system package source, prefer a virtual environment, and only run OCR package-manager commands if you actually need scanned-document support.
Findings (2)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Installing the package may modify your Python environment and relies on the integrity of the package source you use.
The skill directs users to install an unpinned third-party Python package. This is central to the PDF extraction purpose, but it changes the local Python environment and depends on package-source trust.
pip install PyMuPDF
Install in a virtual environment from a trusted package index, and pin or review versions if you need a controlled environment.
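As an illustration of the recommendation above, the following Python sketch checks whether the current interpreter is running inside a virtual environment and builds a requirements-style pin from whatever version of the package is already installed. It is not part of the skill or of the ClawScan review; the helper names in_virtualenv, installed_version, and pin_line are hypothetical, and the check assumes a venv-style environment (Python 3.3 or later).

```python
import sys
from importlib import metadata


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def pin_line(package: str):
    """Build a pin such as 'name==<version>', or None if the package is not installed."""
    version = installed_version(package)
    return f"{package}=={version}" if version else None


if __name__ == "__main__":
    # Hypothetical usage: report the environment and a pin for PyMuPDF.
    print("virtual environment:", in_virtualenv())
    print("pin:", pin_line("PyMuPDF"))
```

A pin line produced this way could be recorded in a requirements file so that later installs resolve to the same reviewed version, assuming that fits your workflow.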
If you run the OCR setup command, it can change system packages on your machine.
Optional OCR setup includes system package installation, which may require administrator privileges. It is directly related to OCR support and is presented as user-run setup, not automatic execution.
sudo apt install tesseract-ocr
Run the OCR installation only if needed, use trusted OS repositories or package managers, and confirm the command before granting administrator privileges.
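One way to decide whether the OCR installation is needed at all is to check whether a tesseract executable is already on the PATH before running any privileged command. The sketch below is an assumption-laden illustration, not something the skill ships; find_tesseract and tesseract_version are hypothetical helper names, and the code only reads the system state without installing anything.

```python
import shutil
import subprocess


def find_tesseract():
    """Return the full path of the `tesseract` executable, or None if not on PATH."""
    return shutil.which("tesseract")


def tesseract_version(path: str) -> str:
    """Return the first line printed by `tesseract --version`, or '' if nothing is printed."""
    result = subprocess.run(
        [path, "--version"],
        capture_output=True,
        text=True,
        check=False,
        timeout=10,
    )
    # Some builds print the version banner to stderr rather than stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract not found; OCR setup would be needed for scanned documents")
    else:
        print("tesseract found at", path)
        print(tesseract_version(path))
```

If the executable is already present, the system package installation step can be skipped.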
