pdf-ocr-extraction
Analysis
This skill is a straightforward local OCR helper, with visible dependency installation and temporary file handling that users should understand before use.
Findings (3)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Checks for instructions or behavior that redirect the agent, misuse tools, execute unexpected code, cascade across systems, exploit user trust, or continue outside the intended task.
uv | package: pypdfium2 pytesseract Pillow
The skill depends on external Python packages without pinned versions. These packages are purpose-aligned for PDF rendering and OCR, but users should rely on trusted package sources.
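One way to act on the unpinned-dependency note is to record which versions actually got installed, so they can be pinned afterward. The sketch below is illustrative only and is not part of the skill: the helper name `installed_versions` is hypothetical, and it assumes the distribution names match those in the install spec (`pypdfium2`, `pytesseract`, `Pillow`).

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for illustration; not part of the reviewed skill.
    """
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    # Distribution names taken from the skill's install spec.
    for name, ver in installed_versions(["pypdfium2", "pytesseract", "Pillow"]).items():
        print(f"{name}=={ver}" if ver else f"{name}: not installed")
```

The printed `name==version` lines can be copied into a requirements file so later installs resolve to the same releases.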
Create a Python script (e.g., `extract.py`) ... Then execute the script: `python3 extract.py /path/to/document.pdf`
The skill's workflow involves creating and running a local Python script. This is clearly disclosed and central to the OCR purpose.
tmp_img = f"/tmp/page_{i}.png" ... os.remove(tmp_img)
The example writes rendered PDF pages to predictable temporary image paths and then deletes them. This is purpose-aligned but creates short-lived copies of document pages.
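The predictable-path concern can be reduced by writing page images into a private, randomly named directory instead of directly under `/tmp`. The following is a minimal sketch under stated assumptions, not the skill's actual code: `ocr_pages_privately` is a hypothetical helper, `pages` is assumed to be already-rendered PNG bytes, and `ocr` stands in for whatever path-based OCR call the script uses.

```python
import os
import tempfile
from typing import Callable, Iterable, List


def ocr_pages_privately(pages: Iterable[bytes], ocr: Callable[[str], str]) -> List[str]:
    """Run `ocr` on each page image via a private temporary directory.

    Hypothetical helper: `pages` holds PNG bytes that were rendered elsewhere,
    and `ocr` is any function that takes an image path and returns text.
    """
    results: List[str] = []
    # TemporaryDirectory creates a randomly named directory that only the
    # creating user can access, and removes it when the block exits.
    with tempfile.TemporaryDirectory(prefix="pdf_ocr_") as tmp_dir:
        for i, png_bytes in enumerate(pages):
            tmp_img = os.path.join(tmp_dir, f"page_{i}.png")
            with open(tmp_img, "wb") as fh:
                fh.write(png_bytes)
            try:
                results.append(ocr(tmp_img))
            finally:
                # Delete each page copy as soon as it has been processed,
                # even if the OCR call raised an exception.
                os.remove(tmp_img)
    return results
```

Page copies still exist briefly on disk, but other local users cannot guess or read their location, and cleanup also happens when OCR fails partway through.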
