PDF Reader (Iyeque)

Extract text, search inside PDFs, and produce summaries.

MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description match the code and instructions: the skill uses PyMuPDF to extract text and return metadata from a provided PDF path. Required binary (python3) and the pip dependency (PyMuPDF) are appropriate and expected.
Instruction Scope
SKILL.md restricts actions to running the included reader.py on a user-specified file path and to installing PyMuPDF via pip. The implementation only reads the supplied file and prints results; it does not access other system files or environment variables. Note: the README says it 'supports encrypted PDFs' but the code will simply error if a password is required (it does not try to supply passwords).
Install Mechanism
No custom download or unpack step; installation is via pip install PyMuPDF as declared in SKILL.md metadata. PyPI is a standard source; nothing in the manifest points to untrusted URLs or archive extraction.
Credentials
The skill requests no environment variables or credentials. It only requires a local file path to a PDF, which is proportionate to its stated functionality.
Persistence & Privilege
The always flag is false, and the skill does not request persistent system modifications or modify other skills. Autonomous model invocation is allowed by platform default, but it is not combined with broad access here.
Assessment
This skill appears to do only what it says: read a PDF you point it at and return text or metadata. Before installing: (1) be aware it will read any file path you provide, so don’t point it at sensitive files or system directories; (2) avoid running it on untrusted or malicious PDFs without sandboxing, because PDF libraries have had vulnerabilities in the past; (3) the skill installs PyMuPDF from PyPI, so confirm you’re comfortable installing that dependency; and (4) if you expect password-protected PDFs to be handled, note that the code currently fails rather than attempting password unlocking.


Current version: v1.1.0

Runtime requirements

📄 Clawdis
Bins: python3

SKILL.md

PDF Reader Skill

The pdf-reader skill provides functionality to extract text and retrieve metadata from PDF files using PyMuPDF (fitz).

Tool API

The skill provides two commands:

extract

Extracts plain text from the specified PDF file.

  • Parameters:
    • file_path (string, required): Path to the PDF file to extract text from.
    • --max_pages (integer, optional): Maximum number of pages to extract, counted from the first page.

Usage:

python3 skills/pdf-reader/reader.py extract /path/to/document.pdf
python3 skills/pdf-reader/reader.py extract /path/to/document.pdf --max_pages 5

Output: Plain text content from the PDF.

metadata

Retrieves metadata about the specified PDF file.

  • Parameters:
    • file_path (string, required): Path to the PDF file.

Usage:

python3 skills/pdf-reader/reader.py metadata /path/to/document.pdf

Output: JSON object with PDF metadata including:

  • title: Document title
  • author: Document author
  • subject: Document subject
  • creator: Application that created the PDF
  • producer: PDF producer
  • creationDate: Creation date
  • modDate: Modification date
  • format: PDF format version
  • encryption: Encryption info (if any)
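
The two commands and their parameters map naturally onto argparse subcommands. The following is a minimal sketch of such a command-line interface, not the skill's actual reader.py; the function name build_parser and the help strings are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented extract/metadata commands (hypothetical)."""
    parser = argparse.ArgumentParser(prog="reader.py")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # extract: required file_path plus the optional --max_pages limit
    extract = subparsers.add_parser("extract", help="Extract plain text")
    extract.add_argument("file_path", help="Path to the PDF file")
    extract.add_argument("--max_pages", type=int, default=None,
                         help="Maximum number of pages to extract")

    # metadata: only the required file_path
    metadata = subparsers.add_parser("metadata", help="Print metadata as JSON")
    metadata.add_argument("file_path", help="Path to the PDF file")

    return parser


args = build_parser().parse_args(["extract", "report.pdf", "--max_pages", "3"])
print(args.command, args.file_path, args.max_pages)
```

Declaring --max_pages with type=int means a non-numeric value is rejected by the parser before any PDF is opened.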

Implementation Notes

  • Uses PyMuPDF (imported as pymupdf; fitz is the library's legacy import name) for fast, reliable PDF processing
  • Does not unlock password-protected PDFs: returns an error if a password is required
  • Handles large PDFs efficiently with max_pages option
  • Returns structured JSON for metadata command
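
One way the max_pages limit could keep work proportional to the pages requested is to stop iterating as soon as the limit is reached. The sketch below assumes a document object that yields pages with a get_text() method, as PyMuPDF documents do. The function name extract_text and the stand-in FakePage class are hypothetical and exist only so the example runs without a PDF at hand; this is not the code in reader.py.

```python
from itertools import islice
from typing import Iterable, Optional


def extract_text(pages: Iterable, max_pages: Optional[int] = None) -> str:
    """Join the text of at most max_pages pages (all pages if None)."""
    limited = islice(pages, max_pages)  # islice(pages, None) yields everything
    return "\n".join(page.get_text() for page in limited)


class FakePage:
    """Stand-in for a PyMuPDF page; only get_text() is modeled."""

    def __init__(self, text: str) -> None:
        self._text = text

    def get_text(self) -> str:
        return self._text


pages = [FakePage(f"page {n}") for n in range(1, 6)]
print(extract_text(pages, max_pages=2))
```

With PyMuPDF installed, passing the result of pymupdf.open(path) as pages should behave the same way, since a PyMuPDF document can be iterated page by page.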

Example

# Extract text from first 3 pages
python3 skills/pdf-reader/reader.py extract report.pdf --max_pages 3

# Get document metadata
python3 skills/pdf-reader/reader.py metadata report.pdf
# Output:
# {
#   "title": "Annual Report 2024",
#   "author": "John Doe",
#   "creationDate": "D:20240115120000",
#   ...
# }
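
The creationDate value in the example output uses the PDF date format, which starts with D: followed by year, month, day, hour, minute, and second. A caller that wants a regular timestamp could convert it roughly as follows. This is a sketch: the helper name parse_pdf_date is hypothetical, it assumes all six fields are present, and it ignores any time-zone suffix.

```python
from datetime import datetime


def parse_pdf_date(value: str) -> datetime:
    """Parse the leading D:YYYYMMDDHHmmSS part of a PDF date string.

    Any trailing time-zone suffix (for example +01'00') is ignored here.
    """
    digits = value[2:] if value.startswith("D:") else value
    return datetime.strptime(digits[:14], "%Y%m%d%H%M%S")


print(parse_pdf_date("D:20240115120000").isoformat())
```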

Error Handling

  • Returns error message if file not found or not a valid PDF
  • Returns error if PDF is encrypted and requires password
  • Gracefully handles corrupted or malformed PDFs
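
A caller can catch the first of these cases before invoking the skill at all. The sketch below checks that the path exists and that the file begins with the %PDF- marker that PDF files carry. The function name check_pdf_path and the message wording are assumptions, not the skill's actual error text, and a file that passes this check may still be corrupted or password-protected.

```python
from pathlib import Path
from typing import Optional


def check_pdf_path(file_path: str) -> Optional[str]:
    """Return an error message, or None if the file looks like a PDF."""
    path = Path(file_path)
    if not path.is_file():
        return f"File not found: {file_path}"
    with path.open("rb") as handle:
        header = handle.read(1024)
    # PDF files carry a "%PDF-" marker at (or near) the start of the file.
    if b"%PDF-" not in header:
        return f"Not a valid PDF: {file_path}"
    return None


print(check_pdf_path("does-not-exist.pdf"))
```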
