HTML OCR
v0.4.0
OCR for HTML pages containing image-embedded or scanned content. Uses MinerU to extract text from images within HTML files and web pages. Features: OCR extra...
MIT-0
Security Scan
OpenClaw
Benign
high confidence
Purpose & Capability
The skill is an instruction-only wrapper around the mineru-open-api CLI and declares the exact binary and MINERU_TOKEN credential it needs. The declared dependencies (mineru-open-api via npm or Go) and the MINERU_TOKEN credential align with the stated HTML OCR purpose.
Instruction Scope
SKILL.md instructs the agent to run mineru-open-api on local HTML files and to authenticate with MINERU_TOKEN or interactive auth. This stays within the OCR purpose. Note: running the CLI will upload page content/images to MinerU's service (expected for a remote OCR API), so private/sensitive content may be transmitted externally.
Install Mechanism
Install options are npm (mineru-open-api) or go install from a GitHub path. These are expected for a CLI tool and are not downloads from arbitrary URLs. Installing a global npm package or running go install executes third-party code at install time — standard but worth auditing if you don't trust the publisher.
Credentials
The only required environment variable is MINERU_TOKEN (declared as primaryEnv). That is proportionate for an API-backed OCR CLI. No unrelated credentials or extra config paths are requested.
Persistence & Privilege
The skill is user-invocable, not always-on, and does not request special system persistence or cross-skill configuration. It uses normal CLI invocation and environment token-based auth.
Assessment
This skill appears to do what it says: it runs the MinerU CLI and requires a MINERU_TOKEN. Before installing, consider:
(1) MINERU_TOKEN is sensitive: only provide it if you trust MinerU/mineru-open-api and the organization.
(2) OCRing local HTML/images will likely upload those files to MinerU servers, so avoid sending sensitive documents.
(3) Global npm packages and go installs execute third-party code at install time; review the mineru-open-api package source or GitHub repo (https://github.com/opendatalab/MinerU) if you need higher assurance.
(4) If you decide to proceed, run the CLI in an isolated environment/container and grant the token minimal scope; revoke the token if you suspect misuse.
Like a lobster shell, security has layers: review code before you run it.
License
MIT-0
Free to use, modify, and redistribute. No attribution required.
Runtime requirements
📄 Clawdis
Bins: mineru-open-api
Env: MINERU_TOKEN
Primary env: MINERU_TOKEN
Install
Install via npm
Bins: mineru-open-api
npm i -g mineru-open-api
Install via go install
Bins: mineru-open-api
SKILL.md
HTML OCR
Use OCR to extract text from HTML files that contain scanned images or image-embedded content using MinerU.
Install
npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest
Quick Start
# OCR extraction from local HTML file (requires token)
mineru-open-api extract page.html --ocr -o ./out/
# With VLM model for better accuracy
mineru-open-api extract page.html --ocr --model vlm -o ./out/
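When several pages need OCR, the single-file command above can be wrapped in a loop. The following is a minimal sketch, not part of the skill itself: the function name `ocr_html_dir` and the one-folder-per-page layout are assumptions made for illustration, and it uses only the `extract`, `--ocr`, and `-o` options shown above.

```shell
# Hypothetical helper (not provided by mineru-open-api):
# OCR every .html file in a directory, writing each page's output
# into its own subfolder of the chosen output directory.
ocr_html_dir() {
  src_dir="$1"
  out_dir="$2"
  for page in "$src_dir"/*.html; do
    [ -e "$page" ] || continue          # glob matched nothing: skip
    name=$(basename "$page" .html)
    mkdir -p "$out_dir/$name"
    # Stop at the first failure so a bad token or network error is noticed.
    mineru-open-api extract "$page" --ocr -o "$out_dir/$name/" || return 1
  done
}
```

A call might look like `ocr_html_dir ./pages ./out`. Each page is sent to the MinerU service in turn, so the same caution about sensitive content applies to every file in the directory.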
Authentication
Token required:
mineru-open-api auth # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable
Create token at: https://mineru.net/apiManage/token
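For scripted use, it can help to fail early when the token is missing rather than partway through an upload. The sketch below assumes the environment-variable route; `require_mineru_token` is a hypothetical name, and the check does not detect a token stored through the interactive `mineru-open-api auth` setup.

```shell
# Hypothetical guard (not part of mineru-open-api):
# returns non-zero with a hint on stderr when MINERU_TOKEN is unset or empty.
require_mineru_token() {
  if [ -z "${MINERU_TOKEN:-}" ]; then
    echo "MINERU_TOKEN is not set; export it or run 'mineru-open-api auth'" >&2
    return 1
  fi
}
```

In a script this could be used as `require_mineru_token && mineru-open-api extract page.html --ocr -o ./out/`.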
Capabilities
- Supported input: local .html file
- OCR requires `extract` with token; not available in `flash-extract`
- Use `--ocr` flag to enable OCR on image-embedded content in HTML
- Use `--model vlm` for complex or mixed-content pages
Notes
- HTML is NOT supported by `flash-extract`; use `extract` with token
- If the HTML has normal text content, OCR is not needed; use `html-extract` instead
- Output goes to stdout by default; use `-o <dir>` to save to a file or directory
- All progress/status messages go to stderr; document content goes to stdout
- MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
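Because document content and progress messages travel on separate streams, as the notes above state, ordinary shell redirection can capture them independently. This is a sketch under that assumption; the wrapper name `ocr_html_to_file` and its argument order are hypothetical.

```shell
# Hypothetical wrapper (not part of mineru-open-api):
#   $1 = input HTML file
#   $2 = file that receives the extracted document content (stdout)
#   $3 = file that receives progress/status messages (stderr)
ocr_html_to_file() {
  input="$1"
  output="$2"
  log="$3"
  mineru-open-api extract "$input" --ocr > "$output" 2> "$log"
}
```

For example, `ocr_html_to_file page.html page.out extract.log` keeps the extracted text free of status lines, which matters when the output is piped into another tool.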
