habib-pdf-to-json
Pass
Audited by ClawScan on May 1, 2026.
Overview
This is a coherent PDF extraction instruction skill, but users should install its PDF/OCR dependencies carefully and run it only on documents they intend to process.
This skill appears safe for its stated purpose. Before installing, verify the package identity because the metadata is inconsistent, install dependencies in an isolated Python environment, and process only PDFs whose contents you are comfortable extracting into local output files.
Findings (3)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Installing these packages may affect the user's Python environment and relies on the integrity of the packages retrieved by pip.
The skill asks users to install unpinned third-party PDF/OCR Python packages. This is expected for a PDF extraction skill, but it changes the local Python environment and depends on package-source trust.
pip install pdfplumber pandas openpyxl ... pip install pytesseract pdf2image ... pip install pypdf
Install in a virtual environment, use trusted package indexes, and pin or review dependency versions if using this for sensitive or production workflows.
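The recommendation above can be checked before any install step. The following is a minimal sketch, not part of the skill itself: `in_virtualenv` and `unpinned` are hypothetical helper names, and the `X.Y.Z` version in the usage note is a placeholder rather than a recommended release.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv/virtualenv."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def unpinned(requirements: str) -> list[str]:
    """Return requirement lines that lack an exact '==' version pin."""
    loose = []
    for raw in requirements.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and "==" not in line:
            loose.append(line)
    return loose


if __name__ == "__main__":
    if not in_virtualenv():
        print("Warning: not running inside a virtual environment.")
```

For example, `unpinned("pdfplumber\npandas==X.Y.Z")` returns `["pdfplumber"]`, flagging the package that would be installed at whatever version the index currently serves. The check is purely textual; it does not verify package hashes or the trustworthiness of the index.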
Sensitive construction information in PDFs could be written into Excel, CSV, JSON, or text outputs that remain on disk.
The examples read local PDF files and write extracted data to local output files. This is central to the skill's purpose, but users should be aware that document contents may be copied into new files.
with pdfplumber.open("construction_spec.pdf") as pdf: ... df.to_excel("extracted_data.xlsx", index=False)
Run the extraction only on intended files, choose safe output locations, and handle generated files according to the sensitivity of the source documents.
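One way to act on this is to route every extracted file into a dedicated directory instead of the current working directory. The sketch below is an illustration under assumptions: `private_output_path` is a hypothetical helper, not something the skill provides, and the directory name in the usage note is arbitrary.

```python
from pathlib import Path


def private_output_path(pdf_path: str, out_dir: str, suffix: str = ".xlsx") -> Path:
    """Build an output path for extracted data inside a dedicated directory.

    The directory is created with owner-only permissions where the platform
    honours them (the mode is filtered by the umask and ignored on Windows).
    An existing output file is never silently overwritten.
    """
    out = Path(out_dir)
    out.mkdir(mode=0o700, parents=True, exist_ok=True)
    target = out / (Path(pdf_path).stem + suffix)
    if target.exists():
        raise FileExistsError(f"refusing to overwrite {target}")
    return target
```

With this helper, the example call could become `df.to_excel(private_output_path("construction_spec.pdf", "extracted_private"), index=False)`, which keeps generated files in one place that is easy to review and delete afterwards.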
Users may have less clarity about the exact package identity or publishing provenance.
The bundled metadata uses a different owner ID and slug than the registry metadata supplied for the review. This is a provenance/identity inconsistency, although no hidden code or harmful behavior is shown.
"ownerId": "kn75fhjxn1jz5xbgd9ggj0nrtd80q1dz", "slug": "pdf-to-structured"
Verify that the skill identity, owner, and version match the expected registry listing before installing or relying on it.
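A check of this kind can be scripted by comparing the identity fields of the two metadata records. The following is a rough sketch: `identity_mismatches` is a hypothetical helper, the bundled values are the ones quoted in this finding, and the registry values in the usage note are placeholders because the review does not reproduce them.

```python
def identity_mismatches(bundled: dict, registry: dict,
                        keys: tuple = ("ownerId", "slug")) -> dict:
    """Return {field: (bundled_value, registry_value)} for fields that differ."""
    return {
        key: (bundled.get(key), registry.get(key))
        for key in keys
        if bundled.get(key) != registry.get(key)
    }


if __name__ == "__main__":
    bundled = {
        "ownerId": "kn75fhjxn1jz5xbgd9ggj0nrtd80q1dz",
        "slug": "pdf-to-structured",
    }
    # Placeholder values: substitute what the registry listing actually shows.
    registry = {"ownerId": "<owner id from registry>", "slug": "<slug from registry>"}
    for field, (got, expected) in identity_mismatches(bundled, registry).items():
        print(f"{field}: bundled={got!r} registry={expected!r}")
```

An empty result means the compared fields agree. It does not by itself establish provenance, so the owner and version should still be confirmed against the registry listing.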
