MinerU OCR Local & API

PassAudited by VirusTotal on May 11, 2026.

Overview

Type: OpenClaw Skill Name: mineru-ocr-local-api Version: 1.1.4 The skill bundle provides a legitimate interface for the MinerU OCR service, supporting both local CLI execution and the hosted API. The code in `scripts/lib.py` and `scripts/mineru_caller.py` uses standard libraries like `subprocess` and `httpx` to handle document parsing and file transfers, which are consistent with the stated purpose. No evidence of malicious intent, data exfiltration, or harmful prompt injection was found; the use of environment variables for API tokens and local paths is standard for this type of integration.

Findings (0)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

When API mode is used, the skill can submit OCR jobs to MinerU using your configured token.

Why it was flagged

Hosted API mode uses a MinerU token as a bearer credential, including a MINERU_ACCESS_TOKEN fallback. This is expected for the integration but gives the skill token-backed MinerU API authority.

Skill content
token = _get_env("MINERU_API_TOKEN", "MINERU_ACCESS_TOKEN") ... "Authorization": f"Bearer {config.token}"
Recommendation

Use a token intended only for MinerU, keep it out of shared logs or profiles, and unset it if you want to force local-only behavior.

What this means

Confidential PDFs or images may be sent to the hosted MinerU service if API mode is selected.

Why it was flagged

The hosted workflow can transmit local document bytes to the MinerU API. This is disclosed and purpose-aligned, but it means document contents leave the local machine.

Skill content
Hosted local-file flow starts with `POST /api/v4/file-urls/batch`, uploads bytes to `data.file_urls[]`, and polls `GET /api/v4/extract-results/batch/{batch_id}`
Recommendation

Use `--mode local` for documents that should not leave your device, and only set `MINERU_API_BASE_URL` to a trusted endpoint.

What this means

A configured local MinerU executable will run on your machine when local mode is used.

Why it was flagged

Local mode runs an external MinerU runtime/CLI. This is central to the stated local OCR purpose, but users should ensure the executable they configure is trusted.

Skill content
Local open-source flow invokes the official `mineru` CLI from `https://github.com/opendatalab/MinerU`
Recommendation

Install MinerU from a trusted source and avoid pointing `MINERU_LOCAL_CMD` or `--local-cmd` at untrusted executables.

What this means

OCR outputs may remain in temp or output folders after the task, including any sensitive text or instructions contained in the document.

Why it was flagged

The saved envelope can include complete extracted Markdown text, so sensitive document contents may persist on disk and later be reused as context.

Skill content
`mineru_caller.py` returns a stable JSON envelope around MinerU execution and, by default, saves that envelope to a unique file under the system temp directory.
Recommendation

Treat extracted document text as data rather than commands, choose output locations carefully, and delete artifacts when they are no longer needed.