Image OCR
v0.4.0
OCR for photos and images using MinerU. Extract text from photographs, screenshots, camera captures, and image files with high accuracy. Features: image OCR...
MIT-0
Security Scan
OpenClaw
Benign
high confidence
Purpose & Capability
Name/description (image OCR via MinerU) match the declared binary (mineru-open-api) and the commands documented in SKILL.md. Required binary and primary credential are appropriate for an OCR CLI wrapper.
Instruction Scope
SKILL.md only instructs the agent to run mineru-open-api commands, set or use MINERU_TOKEN for authenticated calls, and points to mineru.net/GitHub for tokens and source. It does not direct the agent to read unrelated files, exfiltrate data to unexpected endpoints, or access other environment variables.
Install Mechanism
Install options are standard: an npm package (mineru-open-api) or 'go install' from an OpenDataLab GitHub repo. Both are expected for a CLI. As with any third-party package, installing from npm or Go pulls code onto the host, so verify the source first (package page, repository, checksums/tags).
Credentials
Only MINERU_TOKEN is required and is the primary credential; SKILL.md documents that some commands (flash-extract) work without a token while extract requires it. Requesting a single service token is proportional to the skill's features.
Persistence & Privilege
The skill sets always:false and uses normal autonomous invocation. Its instructions do not request persistent system-wide privileges or modify other skills' configs.
Assessment
This skill appears coherent, but follow standard precautions before installing:
- Verify the npm package and the GitHub repo (publisher identity, recent commits, stars/issues) to reduce the risk of typosquatting or malicious packages.
- Only provide MINERU_TOKEN if you trust the service, and give the token least privilege.
- If you prefer not to supply credentials, use 'flash-extract' (no token) for small, quick OCR jobs.
- Review what the installed mineru-open-api binary does (source code) before running it on sensitive images, and revoke the token if you observe unexpected behavior.
Like a lobster shell, security has layers: review code before you run it.
latest
License
MIT-0
Free to use, modify, and redistribute. No attribution required.
Runtime requirements
🖼️ Clawdis
Bins: mineru-open-api
Env: MINERU_TOKEN
Primary env: MINERU_TOKEN
Install
Install via npm
Bins: mineru-open-api
npm i -g mineru-open-api
Install via go install
Bins: mineru-open-api
SKILL.md
Image OCR
Extract text and content from images using MinerU. Supports photos, screenshots, scanned documents, and any image containing text.
Install
npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest
Quick Start
# Quick OCR from image (no token required)
mineru-open-api flash-extract photo.png
# Save to directory
mineru-open-api flash-extract screenshot.jpg -o ./out/
# From URL
mineru-open-api flash-extract https://example.com/image.png
# Specify language (default: ch)
mineru-open-api flash-extract photo.png --language en
# Precision OCR with token (better accuracy, no size limit)
mineru-open-api extract photo.png --ocr -o ./out/
# With VLM model for complex layouts or mixed content
mineru-open-api extract photo.png --ocr --model vlm -o ./out/
Authentication
No token needed for flash-extract. Token required for extract:
mineru-open-api auth # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable
Create token at: https://mineru.net/apiManage/token
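The token rule above can be wrapped in a small helper. The following is a minimal sketch, not part of the skill itself: the function name ocr_image and the MINERU_BIN override variable are hypothetical, introduced only so the binary can be swapped out for a dry run. It uses only the two invocations shown in Quick Start.

```shell
# Hypothetical helper: use 'extract --ocr' when MINERU_TOKEN is set,
# otherwise fall back to the token-free 'flash-extract'.
# MINERU_BIN is an illustrative override, not a variable the CLI reads.
ocr_image() {
  image=$1
  outdir=$2
  bin=${MINERU_BIN:-mineru-open-api}
  if [ -n "${MINERU_TOKEN:-}" ]; then
    # Token present: precision OCR, same flags as in Quick Start.
    "$bin" extract "$image" --ocr -o "$outdir"
  else
    # No token: quick OCR (subject to the flash-extract limits).
    "$bin" flash-extract "$image" -o "$outdir"
  fi
}
```

Calling ocr_image photo.png ./out/ then picks the mode from the environment. Setting MINERU_BIN=echo prints the command that would run instead of executing it.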
Capabilities
- Supported input: .png, .jpg, .jpeg, .jp2, .webp, .gif, .bmp (local file or URL)
- flash-extract: quick OCR, no token, max 10 MB / 20 pages, Markdown output
- extract: token required, higher accuracy with --ocr, supports --model vlm for complex images
- Language hint with --language (default: ch, use en for English documents)
- Formula recognition available via extract --formula
- Table recognition available via extract --table
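The input constraints listed above can be checked locally before a file is sent. This is a rough sketch under two assumptions: "10 MB" is read as 10 * 1024 * 1024 bytes (the service may count differently), and only the lower-case extensions exactly as listed are accepted. The function names and FLASH_MAX_BYTES are hypothetical.

```shell
# Assumption: the 10 MB flash-extract limit means 10 * 1024 * 1024 bytes.
FLASH_MAX_BYTES=$((10 * 1024 * 1024))

# Succeeds if the file name ends in one of the listed image extensions.
has_supported_extension() {
  case "$1" in
    *.png|*.jpg|*.jpeg|*.jp2|*.webp|*.gif|*.bmp) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds if a local file has a supported extension and fits the size limit.
fits_flash_extract() {
  has_supported_extension "$1" || return 1
  [ -f "$1" ] || return 1
  # Arithmetic expansion strips any padding that wc prints on some systems.
  size=$(( $(wc -c < "$1") ))
  [ "$size" -le "$FLASH_MAX_BYTES" ]
}
```

A caller might run fits_flash_extract photo.png and fall back to extract (with a token) when the check fails. URLs are not covered by this sketch, since their size is not known locally.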
Notes
- For scanned documents or low-quality images, use extract --ocr --model vlm for best results
- flash-extract already applies OCR automatically on images; no extra flag needed
- Output goes to stdout by default; use -o <dir> to save to a file or directory
- All progress/status messages go to stderr; document content goes to stdout
- MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
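Because document content and progress messages use separate streams, they can be captured independently with ordinary shell redirection. A minimal sketch follows; the wrapper name run_split is hypothetical, and it works with any command, not only mineru-open-api.

```shell
# Hypothetical wrapper: run a command, writing stdout (the document)
# to the first path and stderr (progress/status) to the second.
run_split() {
  out_file=$1
  log_file=$2
  shift 2
  "$@" > "$out_file" 2> "$log_file"
}
```

For example, run_split result.md progress.log mineru-open-api flash-extract photo.png would leave the extracted Markdown in result.md and the status messages in progress.log.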
