Install
openclaw skills install html-ocr

OCR for HTML pages containing image-embedded or scanned content. Uses MinerU to extract text from images within HTML files and web pages.

Features:
OCR extraction for image content in HTML files.
VLM mode for complex mixed-content pages.
Handles HTML with embedded scanned images.
Converts image text to searchable Markdown.

Use when you need to: OCR images in HTML pages, extract text from image-heavy web pages, read scanned content embedded in HTML.

Use when asked: 'how do I OCR an HTML page', 'extract text from images in HTML', 'this web page has images instead of text', 'can my agent OCR HTML content', 'is there a skill for HTML OCR'.

Built on MinerU by OpenDataLab (Shanghai AI Lab) with advanced OCR capabilities. Perfect for web archiving, accessibility improvements, and content extraction from image-heavy web pages.

Use MinerU's OCR to extract text from HTML files that contain scanned images or image-embedded content.
npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest
# OCR extraction from local HTML file (requires token)
mineru-open-api extract page.html --ocr -o ./out/
# With VLM model for better accuracy
mineru-open-api extract page.html --ocr --model vlm -o ./out/
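
To run the same extraction over several pages, a small loop can wrap the commands above. This is a minimal sketch, not part of the skill: the function name ocr_all, the DRY_RUN variable, and the one-directory-per-page output layout are assumptions for illustration. Only extract, --ocr, --model vlm, and -o come from the commands above.

```shell
#!/bin/sh
# Hypothetical helper: OCR every .html file in a directory.
# Usage: ocr_all <input-dir> <output-dir> [vlm]
# Set DRY_RUN=1 to print the commands instead of running them.
ocr_all() {
    in_dir=$1
    out_dir=$2
    model=${3:-}

    for page in "$in_dir"/*.html; do
        # Skip the literal pattern when no .html files match.
        [ -e "$page" ] || continue

        name=$(basename "$page" .html)

        # Build the argument list in the positional parameters so
        # paths containing spaces are passed through intact.
        set -- extract "$page" --ocr
        if [ "$model" = "vlm" ]; then
            set -- "$@" --model vlm
        fi
        # One output directory per page (an assumed layout).
        set -- "$@" -o "$out_dir/$name/"

        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "mineru-open-api $*"
        else
            mineru-open-api "$@"
        fi
    done
}
```

Running it first with DRY_RUN=1, for example ocr_all ./pages ./out vlm, prints each command so the arguments can be checked before any tokens or quota are spent.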
Token required:
mineru-open-api auth # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable
Create token at: https://mineru.net/apiManage/token
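
Because extraction needs a token, a script can check for one before calling the CLI. The following is a minimal sketch, assuming the token is supplied through the MINERU_TOKEN environment variable shown above. The function name require_token is hypothetical, and a token configured interactively through mineru-open-api auth may not be visible to this check.

```shell
#!/bin/sh
# Hypothetical preflight check: succeed only when MINERU_TOKEN is set
# and non-empty. It does not validate the token against the service.
require_token() {
    if [ -z "${MINERU_TOKEN:-}" ]; then
        echo "MINERU_TOKEN is not set; create a token at https://mineru.net/apiManage/token" >&2
        return 1
    fi
    return 0
}
```

A script could then guard the call, for example: require_token && mineru-open-api extract page.html --ocr -o ./out/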
Notes:
extract with token is required; not available in flash-extract
--ocr flag to enable OCR on image-embedded content in HTML
--model vlm for complex or mixed-content pages
… flash-extract; use extract with token
… html-extract instead
-o <dir> to save to a file or directory