MinerU OCR Local & API
Parse complex PDFs and document images with MinerU, using either the hosted MinerU API or the local open-source MinerU runtime. Use when Codex needs MinerU-b...
Like a lobster shell, security has layers — review code before you run it.
License
SKILL.md
MinerU OCR Local API
Hard Rules
- Use
python scripts/mineru_caller.pyfor every MinerU request. - Do not parse the document yourself as a fallback.
- If MinerU returns an error, show the error and stop.
- Treat the saved JSON envelope and generated artifact files as the source of truth.
- Prefer the top-level
textfield when the user asks for the full extracted document.
Choose The Mode
- Use
--mode apiwhen the user wants the hosted MinerU service, already hasMINERU_API_TOKEN, needs URL input, or wants the official cloud API workflow. - Use
--mode localwhen the user wants the open-source MinerU runtime fromhttps://github.com/opendatalab/MinerU, wants data to stay local, or explicitly asks for local parsing. - Use
--mode autoonly when the user does not care which mode is used.autoprefers API whenMINERU_API_TOKENis configured and falls back to local only for local files.
Standard Workflow
-
For a hosted API parse from URL:
python scripts/mineru_caller.py --mode api --file-url "https://example.com/paper.pdf" --pretty -
For a hosted API parse from a local file:
python scripts/mineru_caller.py --mode api --file-path "C:/docs/paper.pdf" --pretty -
For a local open-source MinerU parse:
python scripts/mineru_caller.py --mode local --file-path "C:/docs/paper.pdf" --pretty -
When the input is a local file and the user will need an IDE-accessible path, prefer saving beside the source file:
python scripts/mineru_caller.py --mode local --file-path "C:/docs/paper.pdf" --download-dir "C:/docs/paper.mineru" --pretty -
Read these output fields before answering:
mode: actual execution mode usedtext: complete document Markdown fromfull.mdor<file_stem>.mdresult.submit: raw task-creation response for API URL parsingresult.batch: raw upload-batch response for API local-file parsingresult.poll: final API task-status payloadresult.local: local MinerU CLI invocation summaryartifacts.full_md_path: absolute path to the main Markdown fileartifacts.local_parse_dir: local MinerU parse directory when using--mode localartifacts.downloaded_zip: downloaded MinerU archive when using--mode api
Useful Flags
--mode api|local|auto: choose hosted API, local runtime, or automatic selection--no-wait: return after submission without polling; API mode only--no-download: skip downloadingfull_zip_url; API mode only--download-dir DIR: store API downloads or local MinerU outputs in a specific directory--language en: pass a language hint--ocr: force OCR mode--disable-formula: turn off formula extraction--local-cmd PATH: explicit path tomineru.exeormineru--local-python PATH: explicit Python path forpython -m mineru.cli.client--local-backend pipeline: choose the local MinerU backend--local-method auto|txt|ocr: choose the local MinerU parse method--local-model-source modelscope: useful in environments where Hugging Face access is restricted--local-device cpu: force a local inference device when needed
Present Results
- If the user asks for all text, show the top-level
textfield. - If the user asks where files were saved, report the paths in
artifacts. - If the output is large, give the saved file paths first and then the requested excerpt or summary.
- If API mode completed but archive download failed, report
artifacts.full_zip_url.
Configuration
For API mode:
MINERU_API_TOKEN
MINERU_API_BASE_URL=https://mineru.net
MINERU_API_TIMEOUT=60
MINERU_API_POLL_TIMEOUT=900
MINERU_API_POLL_INTERVAL=5
For local mode, configure at least one runtime entry point:
MINERU_LOCAL_CMD=C:\path\to\mineru.exe
MINERU_LOCAL_PYTHON=C:\path\to\python.exe
MINERU_LOCAL_BACKEND=pipeline
MINERU_LOCAL_METHOD=auto
MINERU_LOCAL_LANG=ch
MINERU_LOCAL_MODEL_SOURCE=modelscope
MINERU_LOCAL_DEVICE_MODE=cpu
MINERU_LOCAL_TIMEOUT=3600
Local mode only supports --file-path. It does not accept --file-url.
Error Handling
- Missing API token: show the configuration error and stop.
- Missing local runtime: show the configuration error and stop.
- Failed task or failed local CLI run: show the error and stop.
- Poll timeout: tell the user the task id and that polling timed out.
- API archive download TLS error: rely on the built-in
curlfallback before reporting failure. - Missing expected output files: return any artifact paths that do exist and report the missing output.
References
references/output_schema.md: JSON envelope and artifact layout for both modes.
Load the reference file when:
- You need to explain which saved files matter.
- You need to inspect mode-specific artifacts such as
downloaded_zip,local_parse_dir,middle_json, orcontent_list.
Verification
Validate the skill folder:
python C:/Users/shangfukai/.codex/skills/.system/skill-creator/scripts/quick_validate.py D:/shangfukai/dev/skills/mineru-ocr-local-api-1.1.0
Check configuration only:
python scripts/smoke_test.py --mode api --skip-run
python scripts/smoke_test.py --mode local --skip-run
Run a local end-to-end test:
python scripts/smoke_test.py --mode local --test-file "D:/path/to/file.pdf"
Run an API end-to-end test:
python scripts/smoke_test.py --mode api --test-url "https://example.com/file.pdf"
Files
9 totalComments
Loading comments…
