Extract text from PDF files, with automatic OCR fallback for scanned or image-based PDFs. Use when: (1) a user sends a PDF file and the framework did not auto-inject its text content, (2) the injected text is empty or garbled, (3) a PDF file exists on disk and needs text extraction, or (4) the user mentions "read PDF", "extract PDF", "PDF content", "scan PDF", or "OCR". Handles both text-layer PDFs (fast, via pdftotext) and scanned or image PDFs (via tesseract OCR). Supports Chinese and English by default; languages are configurable.
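
The fallback flow described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: it assumes the `pdftotext`, `pdftoppm` (both from Poppler), and `tesseract` command-line tools are on the PATH. The function names, the "empty or garbled" heuristic and its thresholds, the 300 DPI setting, and the `chi_sim+eng` language string are all assumptions chosen for the example.

```python
import subprocess
import tempfile
from pathlib import Path


def looks_empty_or_garbled(text: str, min_chars: int = 20, min_ratio: float = 0.5) -> bool:
    """Hypothetical heuristic: treat the text layer as unusable when it has
    very few non-whitespace characters, or when too few of them are letters,
    digits, or CJK characters."""
    stripped = "".join(text.split())
    if len(stripped) < min_chars:
        return True
    readable = sum(ch.isalnum() for ch in stripped)
    return readable / len(stripped) < min_ratio


def extract_text_layer(pdf: Path) -> str:
    """Fast path: read the embedded text layer; '-' sends output to stdout."""
    result = subprocess.run(
        ["pdftotext", "-layout", str(pdf), "-"],
        capture_output=True, encoding="utf-8", errors="replace", check=True,
    )
    return result.stdout


def ocr_pdf(pdf: Path, langs: str = "chi_sim+eng", dpi: int = 300) -> str:
    """Slow path: rasterize each page to PNG, then OCR the pages in order."""
    pages = []
    with tempfile.TemporaryDirectory() as tmp:
        prefix = Path(tmp) / "page"
        subprocess.run(
            ["pdftoppm", "-r", str(dpi), "-png", str(pdf), str(prefix)],
            check=True,
        )
        # pdftoppm zero-pads page numbers, so a plain sort keeps page order.
        for image in sorted(Path(tmp).glob("page-*.png")):
            out = subprocess.run(
                ["tesseract", str(image), "stdout", "-l", langs],
                capture_output=True, encoding="utf-8", errors="replace", check=True,
            )
            pages.append(out.stdout)
    return "\n".join(pages)


def extract_pdf_text(pdf: Path, langs: str = "chi_sim+eng") -> str:
    """Try the text layer first; fall back to OCR only when it looks unusable."""
    text = extract_text_layer(pdf)
    if looks_empty_or_garbled(text):
        text = ocr_pdf(pdf, langs)
    return text
```

Calling `extract_pdf_text(Path("report.pdf"))` (the file name is only a placeholder) would try the text layer first and pay the OCR cost only for scanned documents. Passing a different `langs` value, such as `"eng"`, corresponds to the configurable-languages option, provided the matching tesseract language data is installed.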

Install

openclaw skills install @panpeter2024/pdf-reader