PDF Reader (Iyeque)

Extract text, search inside PDFs, and produce summaries.

MIT-0 · Free to use, modify, and redistribute. No attribution required.

⭐ 4 · 898 · 4 current installs · 4 all-time installs

by @iyeque

Security Scan

VirusTotal

Benign

OpenClaw

Benign

high confidence

✓

Purpose & Capability

Name/description match the code and instructions: the skill uses PyMuPDF to extract text and return metadata from a provided PDF path. Required binary (python3) and the pip dependency (PyMuPDF) are appropriate and expected.

ℹ

Instruction Scope

SKILL.md restricts actions to running the included reader.py on a user-specified file path and to installing PyMuPDF via pip. The implementation only reads the supplied file and prints results; it does not access other system files or environment variables. Note: the README says it 'supports encrypted PDFs' but the code will simply error if a password is required (it does not try to supply passwords).

✓

Install Mechanism

No custom download or unpack step; installation is via pip install PyMuPDF as declared in SKILL.md metadata. PyPI is a standard source; nothing in the manifest points to untrusted URLs or archive extraction.

✓

Credentials

The skill requests no environment variables or credentials. It only requires a local file path to a PDF, which is proportionate to its stated functionality.

✓

Persistence & Privilege

The always flag is false, and the skill neither requests persistent system modifications nor modifies other skills. Autonomous model invocation is allowed by platform default, but here it is not combined with broad access.

Assessment

This skill appears to do only what it says: read a PDF you point it at and return text or metadata. Before installing: (1) be aware it will read any file path you provide, so don’t point it at sensitive files or system directories; (2) avoid running it on untrusted/malicious PDFs without sandboxing—PDF libraries have had vulnerabilities in the past; (3) the skill installs PyMuPDF from PyPI—confirm you’re comfortable installing that dependency; and (4) if you expect password-protected PDFs to be handled, note the code currently fails rather than attempting password unlocking.

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.1.0

latestvk976rxbhsvpt9zcvw6dvaxx3an81adpm

License

MIT-0

Terms: https://spdx.org/licenses/MIT-0.html

Runtime requirements

📄 Clawdis

Bins: python3

SKILL.md

PDF Reader Skill

The pdf-reader skill extracts text and retrieves metadata from PDF files using PyMuPDF (legacy import name: fitz).

Tool API

The skill provides two commands:

extract

Extracts plain text from the specified PDF file.

Parameters:
- file_path (string, required): Path to the PDF file to extract text from.
- --max_pages (integer, optional): Maximum number of pages to extract.

Usage:

python3 skills/pdf-reader/reader.py extract /path/to/document.pdf
python3 skills/pdf-reader/reader.py extract /path/to/document.pdf --max_pages 5

Output: Plain text content from the PDF.
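
The page-limited extraction described above can be sketched roughly as follows. This is a minimal illustration, not the code in reader.py: the function names pages_to_read and extract_text are hypothetical, and the sketch assumes PyMuPDF is installed and that pages are joined with newlines.

```python
def pages_to_read(page_count, max_pages=None):
    """Return how many pages to read, honouring an optional cap."""
    if max_pages is None:
        return page_count
    if max_pages < 0:
        raise ValueError("max_pages must not be negative")
    return min(page_count, max_pages)


def extract_text(file_path, max_pages=None):
    """Concatenate the plain text of the first pages of a PDF (hypothetical helper)."""
    # Imported lazily so pages_to_read works even without PyMuPDF installed.
    try:
        import pymupdf  # module name in recent PyMuPDF releases
    except ImportError:
        import fitz as pymupdf  # legacy module name

    doc = pymupdf.open(file_path)
    try:
        limit = pages_to_read(doc.page_count, max_pages)
        # get_text() with no arguments returns the page's plain text.
        return "\n".join(doc[i].get_text() for i in range(limit))
    finally:
        doc.close()
```

Capping the page count before the loop means pages beyond the limit are never loaded, which is presumably what makes --max_pages useful on large documents.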

metadata

Retrieves metadata about the document.

Parameters:
- file_path (string, required): Path to the PDF file.

Usage:

python3 skills/pdf-reader/reader.py metadata /path/to/document.pdf

Output: JSON object with PDF metadata including:

- title: Document title
- author: Document author
- subject: Document subject
- creator: Application that created the PDF
- producer: PDF producer
- creationDate: Creation date
- modDate: Modification date
- format: PDF format version
- encryption: Encryption info (if any)
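
One way the metadata command might assemble this JSON object is sketched below. It assumes PyMuPDF's Document.metadata dictionary as the data source; the names METADATA_KEYS, select_metadata, and metadata_json are hypothetical and are not taken from reader.py.

```python
import json

# The fields listed above, in the same order.
METADATA_KEYS = (
    "title", "author", "subject", "creator", "producer",
    "creationDate", "modDate", "format", "encryption",
)


def select_metadata(raw):
    """Keep only the documented fields; missing ones become None."""
    raw = raw or {}
    return {key: raw.get(key) for key in METADATA_KEYS}


def metadata_json(file_path):
    """Return the selected metadata of a PDF as a JSON string (hypothetical helper)."""
    # Imported lazily so select_metadata works without PyMuPDF installed.
    try:
        import pymupdf  # module name in recent PyMuPDF releases
    except ImportError:
        import fitz as pymupdf  # legacy module name

    doc = pymupdf.open(file_path)
    try:
        return json.dumps(select_metadata(doc.metadata), indent=2)
    finally:
        doc.close()
```

Filtering through a fixed key list keeps the output shape stable even if the underlying library adds or omits fields.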

Implementation Notes

- Uses PyMuPDF (imported as pymupdf) for fast, reliable PDF processing
- Opens encrypted PDFs only when no password is needed; returns an error if a password is required
- Limits the work done on large PDFs through the --max_pages option
- Returns structured JSON for the metadata command
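
For orientation, the command-line surface described under Tool API could be wired up with argparse along these lines. This is a sketch under assumptions: build_parser is a hypothetical name, and the real reader.py may parse its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser with the two documented subcommands."""
    parser = argparse.ArgumentParser(
        prog="reader.py",
        description="Extract text or metadata from a PDF file.",
    )
    commands = parser.add_subparsers(dest="command", required=True)

    extract = commands.add_parser("extract", help="print plain text")
    extract.add_argument("file_path", help="path to the PDF file")
    extract.add_argument(
        "--max_pages", type=int, default=None,
        help="maximum number of pages to extract",
    )

    metadata = commands.add_parser("metadata", help="print metadata as JSON")
    metadata.add_argument("file_path", help="path to the PDF file")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.command, args.file_path)
```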

Example

# Extract text from first 3 pages
python3 skills/pdf-reader/reader.py extract report.pdf --max_pages 3

# Get document metadata
python3 skills/pdf-reader/reader.py metadata report.pdf
# Output:
# {
#   "title": "Annual Report 2024",
#   "author": "John Doe",
#   "creationDate": "D:20240115120000",
#   ...
# }
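
The creationDate value in the example output uses the PDF date string format (D:YYYYMMDDHHmmSS, optionally followed by a timezone suffix). The skill returns the raw string; a caller who needs a datetime could convert it with something like the sketch below. parse_pdf_date is a hypothetical helper, not part of the skill, and it covers only the common variants of the format.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYY[MM[DD[HH[mm[SS]]]]] followed by an optional Z or +HH'mm' offset.
_PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?$"
)


def parse_pdf_date(value):
    """Convert a PDF date string to a datetime, or return None if it does not parse."""
    match = _PDF_DATE.match(value.strip())
    if match is None:
        return None
    fields = match.groups()
    defaults = (1, 1, 1, 0, 0, 0)  # used for any omitted component
    year, month, day, hour, minute, second = (
        int(text) if text else default
        for text, default in zip(fields[:6], defaults)
    )
    utc_flag, sign, tz_hour, tz_minute = fields[6:]
    try:
        tzinfo = None
        if utc_flag:
            tzinfo = timezone.utc
        elif sign:
            offset = timedelta(hours=int(tz_hour), minutes=int(tz_minute or 0))
            tzinfo = timezone(offset if sign == "+" else -offset)
        return datetime(year, month, day, hour, minute, second, tzinfo=tzinfo)
    except ValueError:
        # Out-of-range components such as month 13 or hour 25.
        return None


print(parse_pdf_date("D:20240115120000").isoformat())  # 2024-01-15T12:00:00
```

Strings without a timezone suffix yield a naive datetime, since the format does not say which zone was meant.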

Error Handling

- Returns an error message if the file is not found or is not a valid PDF
- Returns an error if the PDF is encrypted and requires a password
- Handles corrupted or malformed PDFs gracefully
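
The three cases above could be checked roughly as in the following sketch. It is illustrative only: preflight_error and open_or_error are hypothetical names, the error messages are invented for the example, and looking for the %PDF- marker in the first 1024 bytes is a simplifying assumption rather than what reader.py is known to do.

```python
import os


def preflight_error(file_path):
    """Return an error message for obviously bad input, or None if it looks like a PDF."""
    if not os.path.isfile(file_path):
        return f"Error: file not found: {file_path}"
    with open(file_path, "rb") as handle:
        head = handle.read(1024)
    if b"%PDF-" not in head:
        return f"Error: not a valid PDF: {file_path}"
    return None


def open_or_error(file_path):
    """Return (document, None) on success or (None, message) on failure."""
    message = preflight_error(file_path)
    if message:
        return None, message
    # Imported lazily so preflight_error works without PyMuPDF installed.
    try:
        import pymupdf  # module name in recent PyMuPDF releases
    except ImportError:
        import fitz as pymupdf  # legacy module name
    try:
        doc = pymupdf.open(file_path)
    except Exception as exc:  # corrupted or malformed input
        return None, f"Error: could not open PDF: {exc}"
    if doc.needs_pass:
        # The document cannot be read without authenticating first.
        doc.close()
        return None, "Error: PDF is encrypted and requires a password"
    return doc, None
```

Returning a message instead of raising keeps the command-line output predictable for the calling agent, at the cost of the caller having to check the second element of the tuple.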

Files

2 total
