MinerU OCR Local & API
Pass
Audited by VirusTotal on May 11, 2026.
Overview
Type: OpenClaw Skill
Name: mineru-ocr-local-api
Version: 1.1.4
The skill bundle provides a legitimate interface for the MinerU OCR service, supporting both local CLI execution and the hosted API. The code in `scripts/lib.py` and `scripts/mineru_caller.py` uses standard libraries like `subprocess` and `httpx` to handle document parsing and file transfers, which is consistent with the stated purpose. No evidence of malicious intent, data exfiltration, or harmful prompt injection was found; the use of environment variables for API tokens and local paths is standard for this type of integration.
Findings (0)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
When API mode is used, the skill can submit OCR jobs to MinerU using your configured token.
Hosted API mode uses a MinerU token as a bearer credential, including a MINERU_ACCESS_TOKEN fallback. This is expected for the integration but gives the skill token-backed MinerU API authority.
`token = _get_env("MINERU_API_TOKEN", "MINERU_ACCESS_TOKEN") ... "Authorization": f"Bearer {config.token}"`
Use a token intended only for MinerU, keep it out of shared logs or profiles, and unset it if you want to force local-only behavior.
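The token lookup quoted above can be illustrated with a minimal sketch. It assumes that the skill's `_get_env` returns the first non-empty value among the names it is given; that behavior is not confirmed by the report, and the `get_env` and `auth_headers` helpers below are hypothetical stand-ins, not the skill's code.

```python
import os
from typing import Mapping, Optional


def get_env(*names: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first non-empty value among the given variable names.

    Assumption: this mirrors a "primary name, then fallback" lookup; the
    real _get_env in the skill may behave differently.
    """
    source = os.environ if env is None else env
    for name in names:
        value = source.get(name, "").strip()
        if value:
            return value
    return None


def auth_headers(token: Optional[str]) -> dict:
    """Build the bearer header, or an empty dict when no token is configured."""
    return {"Authorization": f"Bearer {token}"} if token else {}


# Only the fallback name is set here, so the fallback value is used.
example_env = {"MINERU_ACCESS_TOKEN": "fallback-token"}
token = get_env("MINERU_API_TOKEN", "MINERU_ACCESS_TOKEN", env=example_env)
headers = auth_headers(token)
```

Under this assumption, unsetting both variables leaves `token` as `None`, which is one way a caller could detect that hosted API mode is unavailable.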
Confidential PDFs or images may be sent to the hosted MinerU service if API mode is selected.
The hosted workflow can transmit local document bytes to the MinerU API. This is disclosed and purpose-aligned, but it means document contents leave the local machine.
Hosted local-file flow starts with `POST /api/v4/file-urls/batch`, uploads bytes to `data.file_urls[]`, and polls `GET /api/v4/extract-results/batch/{batch_id}`
Use `--mode local` for documents that should not leave your device, and only set `MINERU_API_BASE_URL` to a trusted endpoint.
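One way to act on the advice about `MINERU_API_BASE_URL` is to check the configured value against an allowlist before any document bytes are uploaded. The sketch below is not part of the skill; `is_trusted_base_url` is a hypothetical helper, and `mineru.example` is a placeholder host that you would replace with the endpoint you actually trust.

```python
from urllib.parse import urlsplit

# Placeholder allowlist; replace with the host(s) you actually trust.
TRUSTED_HOSTS = frozenset({"mineru.example"})


def is_trusted_base_url(url: str, trusted_hosts: frozenset = TRUSTED_HOSTS) -> bool:
    """Accept only HTTPS URLs whose host exactly matches an allowlisted name."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) are rejected.
        return False
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host in trusted_hosts
```

Exact host matching is deliberate: a suffix or substring check would accept look-alike hosts such as `mineru.example.evil.test`.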
A configured local MinerU executable will run on your machine when local mode is used.
Local mode runs an external MinerU runtime/CLI. This is central to the stated local OCR purpose, but users should ensure the executable they configure is trusted.
Local open-source flow invokes the official `mineru` CLI from `https://github.com/opendatalab/MinerU`
Install MinerU from a trusted source and avoid pointing `MINERU_LOCAL_CMD` or `--local-cmd` at untrusted executables.
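A hedged sketch of how the executable named by `MINERU_LOCAL_CMD` or `--local-cmd` could be checked before it is run: resolve it to an absolute path and confirm that the resolved file lives under a directory you control. `resolve_trusted_command` and the notion of "trusted directories" are assumptions made for illustration; the skill is not known to perform this check.

```python
import shutil
from pathlib import Path
from typing import Iterable, Optional


def resolve_trusted_command(command: str, trusted_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the resolved executable path if it sits under a trusted directory.

    Returns None when the command is not found on PATH (or is not executable)
    or when it resolves to a location outside every trusted directory.
    """
    found = shutil.which(command)
    if found is None:
        return None
    # resolve() follows symlinks, so a link inside a trusted directory that
    # points elsewhere is judged by its real target.
    resolved = Path(found).resolve()
    for directory in trusted_dirs:
        if resolved.is_relative_to(Path(directory).resolve()):  # Python 3.9+
            return resolved
    return None
```

A caller might pass the directory of a dedicated virtual environment as the only trusted location and refuse to start local mode when the function returns `None`.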
OCR outputs may remain in temp or output folders after the task, including any sensitive text or instructions contained in the document.
The saved envelope can include complete extracted Markdown text, so sensitive document contents may persist on disk and later be reused as context.
`mineru_caller.py` returns a stable JSON envelope around MinerU execution and, by default, saves that envelope to a unique file under the system temp directory.
Treat extracted document text as data rather than commands, choose output locations carefully, and delete artifacts when they are no longer needed.
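Deleting artifacts can be automated with a small cleanup routine. The sketch below removes files in a chosen directory that match a glob pattern and are older than a given age. The function name, the pattern, and the age threshold are all hypothetical; the report does not state how the saved envelope files are named, so point it only at a directory and pattern you have verified yourself.

```python
import time
from pathlib import Path
from typing import List, Optional


def purge_old_artifacts(
    directory: Path,
    pattern: str,
    max_age_seconds: float,
    now: Optional[float] = None,
) -> List[Path]:
    """Delete regular files matching `pattern` whose mtime is older than the cutoff.

    Returns the list of paths that were removed. Subdirectories are not
    searched, and non-matching files are left untouched.
    """
    cutoff = (time.time() if now is None else now) - max_age_seconds
    removed: List[Path] = []
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file() and path.stat().st_mtime <= cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Using a dedicated output directory rather than the shared system temp directory keeps the pattern narrow and reduces the risk of deleting unrelated files.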
