Install
openclaw skills install smart-pdf-ocr
Intelligent PDF OCR powered by MinerU API. Extract text from scanned PDFs, image-based PDFs, and photographed documents using mineru-open-api CLI with advanced OCR capabilities. Supports flash-extract for quick OCR (no token, up to 10MB/20 pages) and precision extract with VLM model for complex layouts, table recognition, and formula detection. Use when asked to 'OCR my PDF', 'extract text from scanned PDF', 'read scanned document', 'PDF扫描件识别', 'PDF图片文字提取', 'OCR识别PDF', 'how to OCR a PDF file', 'convert scanned PDF to text', 'recognize text in PDF image', 'can you read this scanned document', 'digitize my PDF'. Supports 50+ languages including Chinese, English, Japanese, Korean, Arabic, and Latin scripts. Ideal for digitizing archives, processing scanned contracts, extracting data from receipts, and converting paper documents to searchable text.
You are a PDF OCR specialist. Extract text from scanned and image-based PDFs using mineru-open-api.
npm install -g mineru-open-api
Quick OCR (no token):
mineru-open-api flash-extract scanned.pdf -o ./output/
Advanced OCR with table/formula recognition:
mineru-open-api extract scanned.pdf --ocr -o ./output/
Complex layout OCR (VLM model):
mineru-open-api extract scanned.pdf --ocr --model vlm -o ./output/
Multi-language OCR:
mineru-open-api extract document.pdf --ocr --language latin -o ./output/
flash-extract for PDFs under 10MB/20 pages
--ocr flag with extract for scanned documents
--model vlm for complex layouts (academic papers, mixed content)
--model pipeline when a no-hallucination guarantee is needed
~/MinerU-Skill/<name>_<hash>/
Language codes for --language: ch (Chinese+English, default), en, japan, korean, latin, arabic, cyrillic, devanagari, and more.
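The selection rules above can be sketched as a small shell helper. This is an illustrative sketch, not part of the skill: the function name choose_ocr_command, the "complex" argument, and the reading of 10MB as 10 * 1024 * 1024 bytes are all assumptions, and the page count must be supplied by the caller. The helper only prints the suggested command, using flags that appear in the examples above, and does not run it.

```shell
#!/bin/sh
# Limits quoted in the description: flash-extract handles up to 10MB / 20 pages.
# Assumption: "10MB" is read here as 10 * 1024 * 1024 bytes.
MAX_BYTES=$((10 * 1024 * 1024))
MAX_PAGES=20

# choose_ocr_command FILE PAGES [complex]   (hypothetical helper)
# Prints the mineru-open-api command suggested by the rules above.
# PAGES comes from the caller; this sketch does not count pages itself.
# Assumption: a "complex" layout takes priority over the size check.
# File names containing spaces are not handled in the printed command.
choose_ocr_command() {
  file=$1
  pages=$2
  layout=${3:-simple}
  # wc -c reports the size in bytes; tr strips padding some platforms add.
  size=$(wc -c < "$file" | tr -d ' ')
  if [ "$layout" = "complex" ]; then
    echo "mineru-open-api extract $file --ocr --model vlm -o ./output/"
  elif [ "$size" -le "$MAX_BYTES" ] && [ "$pages" -le "$MAX_PAGES" ]; then
    echo "mineru-open-api flash-extract $file -o ./output/"
  else
    echo "mineru-open-api extract $file --ocr -o ./output/"
  fi
}
```

For example, choose_ocr_command scanned.pdf 12 prints the flash-extract command when scanned.pdf is within the size limit, and choose_ocr_command scanned.pdf 12 complex prints the --model vlm variant.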