{"skill":{"slug":"iyeque-pdf-reader","displayName":"PDF Reader (Iyeque)","summary":"Extract text, search inside PDFs, and produce summaries.","description":"---\nname: pdf-reader\ndescription: Extract text, search inside PDFs, and produce summaries.\nhomepage: \"https://pymupdf.readthedocs.io\"\nmetadata:\n  {\n    \"openclaw\":\n      {\n        \"emoji\": \"📄\",\n        \"requires\": { \"bins\": [\"python3\"], \"pip\": [\"PyMuPDF\"] },\n        \"install\":\n          [\n            {\n              \"id\": \"pymupdf\",\n              \"kind\": \"pip\",\n              \"package\": \"PyMuPDF\",\n              \"label\": \"Install PyMuPDF\",\n            },\n          ],\n        \"version\": \"1.1.0\",\n      },\n  }\n---\n\n# PDF Reader Skill\n\nThe `pdf-reader` skill extracts text and retrieves metadata from PDF files using PyMuPDF (legacy import name `fitz`).\n\n## Tool API\n\nThe skill provides two commands:\n\n### extract\nExtracts plain text from the specified PDF file.\n\n- **Parameters:**\n  - `file_path` (string, required): Path to the PDF file to extract text from.\n  - `--max_pages` (integer, optional): Maximum number of pages to extract, counted from the first page.\n\n**Usage:**\n```bash\npython3 skills/pdf-reader/reader.py extract /path/to/document.pdf\npython3 skills/pdf-reader/reader.py extract /path/to/document.pdf --max_pages 5\n```\n\n**Output:** Plain text content from the PDF.\n\n### metadata\nRetrieves metadata about the specified PDF file.\n\n- **Parameters:**\n  - `file_path` (string, required): Path to the PDF file.\n\n**Usage:**\n```bash\npython3 skills/pdf-reader/reader.py metadata /path/to/document.pdf\n```\n\n**Output:** A JSON object with PDF metadata, including:\n- `title`: Document title\n- `author`: Document author\n- `subject`: Document subject\n- `creator`: Application that created the PDF\n- `producer`: PDF producer\n- `creationDate`: Creation date\n- `modDate`: Modification date\n- `format`: PDF format version\n- `encryption`: Encryption info (if any)\n\n## 
Implementation Notes\n\n- Uses **PyMuPDF** (imported as `pymupdf`) for fast, reliable PDF processing\n- Handles encrypted PDFs: returns an error if a password is required\n- Keeps extraction from large PDFs manageable via the `--max_pages` option\n- Returns structured JSON for the `metadata` command\n\n## Example\n\n```bash\n# Extract text from the first 3 pages\npython3 skills/pdf-reader/reader.py extract report.pdf --max_pages 3\n\n# Get document metadata\npython3 skills/pdf-reader/reader.py metadata report.pdf\n# Output:\n# {\n#   \"title\": \"Annual Report 2024\",\n#   \"author\": \"John Doe\",\n#   \"creationDate\": \"D:20240115120000\",\n#   ...\n# }\n```\n\n## Error Handling\n\n- Returns an error message if the file is not found or is not a valid PDF\n- Returns an error if the PDF is encrypted and requires a password\n- Handles corrupted or malformed PDFs gracefully\n","tags":{"latest":"1.1.0"},"stats":{"comments":0,"downloads":2379,"installsAllTime":11,"installsCurrent":11,"stars":4,"versions":2},"createdAt":1771064906270,"updatedAt":1778489533630},"latestVersion":{"version":"1.1.0","createdAt":1771326951970,"changelog":"Fixed SKILL.md to match actual PyMuPDF implementation. Corrected API documentation.","license":null},"metadata":{"setup":[],"os":null,"systems":null},"owner":{"handle":"iyeque","userId":"s17c41bhb28dycyxnec6jt83w184se82","displayName":"iyeque","image":"https://avatars.githubusercontent.com/u/183353210?v=4"},"moderation":null}