Extract PDF Text

Pass

Audited by ClawScan on May 1, 2026.

Overview

This is a straightforward guide to local PDF text extraction. The usual caution applies: it asks users to install PDF and OCR packages.

This skill appears safe and purpose-aligned for local PDF extraction. Before installing, use a trusted Python or system package source, prefer a virtual environment, and only run OCR package-manager commands if you actually need scanned-document support.

Findings (2)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Note

ASI04: Agentic Supply Chain Vulnerabilities

What this means

Installing the package may modify your Python environment and relies on the integrity of the package source you use.

Why it was flagged

The skill directs users to install an unpinned third-party Python package. This is central to the PDF extraction purpose, but it changes the local Python environment and depends on package-source trust.

Skill content

pip install PyMuPDF

Recommendation

Install in a virtual environment from a trusted package index, and pin or review versions if you need a controlled environment.
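One way to act on this recommendation is to check the environment before installing anything. The sketch below is illustrative and not part of the reviewed skill. It assumes a standard `venv`-style environment, and the helper names `in_virtualenv` and `installed_version` are hypothetical. It reports whether the interpreter is running inside a virtual environment and which PyMuPDF version, if any, is already installed.

```python
import sys
from importlib import metadata


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def installed_version(package: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("virtual environment:", in_virtualenv())
    print("PyMuPDF version:", installed_version("PyMuPDF"))
```

If the first line reports `False`, the install would land in the base Python environment. The version reported on the second line is the value you would record in a requirements file if you need a pinned, reproducible setup.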

Note

ASI04: Agentic Supply Chain Vulnerabilities

What this means

If you run the OCR setup command, it can change system packages on your machine.

Why it was flagged

Optional OCR setup includes system package installation, which may require administrator privileges. It is directly related to OCR support and is presented as user-run setup, not automatic execution.

Skill content

sudo apt install tesseract-ocr

Recommendation

Run the OCR installation only if needed, use trusted OS repositories or package managers, and confirm the command before granting administrator privileges.
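A quick check can show whether the OCR installation is needed at all. The following is a minimal sketch, not part of the reviewed skill. It assumes the OCR engine is exposed as an executable named `tesseract` on `PATH`, and the helper name `find_tesseract` is hypothetical.

```python
import shutil


def find_tesseract(command: str = "tesseract"):
    """Return the full path of the OCR executable, or None if it is not on PATH."""
    # shutil.which searches the directories listed in PATH, like a shell would.
    return shutil.which(command)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract not found; OCR setup would be required for scanned PDFs")
    else:
        print("tesseract already available at", path)
```

If the executable is already present, the system package command can be skipped, and no administrator privileges are needed.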