mineru precision extract PDF、Document、Images

v0.2.1

MinerU precision extract — high-accuracy document extraction with full feature set. Convert PDFs, scanned documents, images, Word (DOC/DOCX), PowerPoint (PPT...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for mineru-extract/mineru-precision-extract.

Install the skill "mineru precision extract PDF、Document、Images" (mineru-extract/mineru-precision-extract) from ClawHub.
Skill page: https://clawhub.ai/mineru-extract/mineru-precision-extract
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: MINERU_TOKEN
Required binaries: mineru-open-api
Config paths to check: ~/.mineru/config.yaml
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install mineru-precision-extract

ClawHub CLI


npx clawhub@latest install mineru-precision-extract
Security Scan
Capability signals
Requires OAuth token
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description describe precision document extraction and the skill only requires the mineru-open-api binary, a MINERU_TOKEN, and ~/.mineru/config.yaml — all expected for a CLI that calls the MinerU service. The declared npm/go install targets produce the required binary and map to the stated purpose.
Instruction Scope
SKILL.md instructs the agent to run mineru-open-api commands (extract, crawl, auth) and to upload or process files/URLs via the MinerU service. This is appropriate for the stated purpose, but note that using the skill will send document data to mineru.net (the service's API) for processing — users should confirm privacy/retention policies before sending sensitive documents.
Install Mechanism
Install options are standard package installs: npm i -g mineru-open-api or go install from a GitHub repo. These are normal, traceable mechanisms; there are no downloads from untrusted personal servers or opaque archives in the spec.
Credentials
Only MINERU_TOKEN and a config path (~/.mineru/config.yaml) are required, which is proportional and expected for authenticating to the MinerU API. No unrelated secrets or excessive environment variables are requested.
Persistence & Privilege
The skill does not request permanent/always-on inclusion (always:false) and does not attempt to modify other skills or system-wide settings. It is an invocation wrapper for the mineru-open-api CLI and does not claim elevated persistent privileges.
Assessment
This skill is an instruction-only wrapper around the mineru-open-api CLI and appears coherent with its stated purpose. Before installing or using it:
  1) Confirm you trust mineru.net / the GitHub repo and review MinerU's privacy/retention policy; documents you process will be sent to their service for extraction.
  2) Obtain MINERU_TOKEN from the vendor and be aware it may be stored in ~/.mineru/config.yaml or as an env var.
  3) Install via npm or go only from the official package/repo and verify package authenticity (check publisher, repo, and package contents) if you will process sensitive data.
  4) If you need on-prem / offline processing for sensitive documents, verify whether MinerU provides that option; otherwise avoid sending confidential data.
  5) The skill can be invoked by the agent, so restrict usage/permissions if you do not want automated uploads of files.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

📄 Clawdis
Bins: mineru-open-api
Env: MINERU_TOKEN
Config: ~/.mineru/config.yaml

Install

Install via npm
Bins: mineru-open-api
npm i -g mineru-open-api
Install via go install
Bins: mineru-open-api
License: MIT-0

Precision Document Extraction with mineru-open-api

Full-featured document extraction with table/formula recognition, OCR, multi-format output, batch processing, and web crawling.

Why use extract?

  • Table recognition — accurately extracts tables from PDFs and images
  • Formula recognition — preserves mathematical formulas as LaTeX
  • Multi-format output — Markdown, HTML, LaTeX, DOCX, JSON
  • Model selection — choose vlm for highest accuracy or pipeline for zero-hallucination
  • Batch processing — process hundreds of files in one command
  • Web crawling — convert web pages to structured Markdown
  • All file formats — PDF, images, DOC, DOCX, PPT, PPTX, HTML
  • Higher limits — much larger file size and page count than quick mode
  • 80+ languages — full language coverage across all script families

Installation

npm install -g mineru-open-api

Or via Go (macOS/Linux):

go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

Verify installation

mineru-open-api version

Authentication

Create a token at https://mineru.net/apiManage/token, then configure:

mineru-open-api auth                         # Interactive token setup
export MINERU_TOKEN="your-token"             # Or set via environment variable

Token resolution order: --token flag > MINERU_TOKEN env > ~/.mineru/config.yaml.
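
The resolution order above can be sketched as a small shell function. This is an illustration only, not the CLI's implementation: token_source is a hypothetical helper, and it merely checks that the config file exists rather than parsing it (the file's internal layout is not described here).

```shell
# Hypothetical helper (not part of mineru-open-api): reports which source a
# token would come from, following the documented order:
#   --token flag > MINERU_TOKEN env > ~/.mineru/config.yaml
# Assumption: the config file is only tested for existence, not parsed.
token_source() {
  flag_token="$1"                                 # value given via --token, may be empty
  config_file="${2:-$HOME/.mineru/config.yaml}"   # overridable so the check can be tested
  if [ -n "$flag_token" ]; then
    echo "flag"
  elif [ -n "${MINERU_TOKEN:-}" ]; then
    echo "env"
  elif [ -f "$config_file" ]; then
    echo "config"
  else
    echo "none"
  fi
}
```

For the real answer on a given machine, mineru-open-api auth --show reports the current token source and a masked value.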

Quick start

mineru-open-api extract report.pdf                         # Markdown to stdout
mineru-open-api extract report.pdf -o ./out/               # Save to directory
mineru-open-api extract report.pdf -f md,html,docx -o ./   # Multi-format
mineru-open-api extract report.pdf --model vlm -o ./out/   # High-accuracy mode
mineru-open-api extract *.pdf -o ./results/                # Batch extract
mineru-open-api crawl https://example.com/article          # Web page → Markdown

Supported input formats

| Format | Supported |
| --- | --- |
| PDF (.pdf) | Yes |
| Images (.png, .jpg, .jpeg, .jp2, .webp, .gif, .bmp) | Yes |
| Word (.doc, .docx) | Yes |
| PowerPoint (.ppt, .pptx) | Yes |
| HTML (.html) | Yes |
| URLs (remote files) | Yes |

Commands

extract — Precision extraction

mineru-open-api extract <file-or-url> [...] [flags]

Examples

mineru-open-api extract report.pdf                         # Markdown to stdout
mineru-open-api extract report.pdf -f html                 # HTML to stdout
mineru-open-api extract report.pdf -o ./out/               # Save to directory
mineru-open-api extract report.pdf -o ./out/ -f md,docx    # Multiple formats
mineru-open-api extract report.pdf -f latex -o ./out/      # LaTeX output
mineru-open-api extract report.pdf --model vlm -o ./out/   # High-accuracy mode
mineru-open-api extract report.pdf --ocr -o ./out/         # OCR for scanned docs
mineru-open-api extract report.pdf --language en -o ./out/ # Specify language
mineru-open-api extract report.pdf --pages "1-10" -o ./out/  # Page range
mineru-open-api extract *.pdf -o ./results/                # Batch extract
mineru-open-api extract --list files.txt -o ./results/     # Batch from file list
mineru-open-api extract https://example.com/doc.pdf        # Extract from URL
cat doc.pdf | mineru-open-api extract --stdin -o ./out/    # From stdin

extract flags

| Flag | Short | Default | Description |
| --- | --- | --- | --- |
| --output | -o | (stdout) | Output path (file or directory) |
| --format | -f | md | Output formats: md, json, html, latex, docx (comma-separated) |
| --model | | (auto) | Model: vlm, pipeline, html (see below) |
| --ocr | | false | Enable OCR for scanned documents |
| --formula | | true | Enable/disable formula recognition |
| --table | | true | Enable/disable table recognition |
| --language | | ch | Document language |
| --pages | | (all) | Page range, e.g. 1-10,15 |
| --timeout | | 900/1800 | Timeout in seconds (single/batch) |
| --list | | | Read input list from file (one path per line) |
| --concurrency | | 0 | Batch concurrency (0 = server default) |
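
Since --list expects one path per line, a list file can be generated with standard tools before running a batch. The sketch below assumes the inputs are PDFs under a single directory; make_pdf_list is a hypothetical helper name, and the mineru-open-api call is shown only as a comment.

```shell
# Hypothetical helper: write every .pdf under a directory into a list file,
# one path per line, which is the shape --list is documented to read.
make_pdf_list() {
  src_dir="$1"     # directory to scan
  list_file="$2"   # list file to (over)write
  find "$src_dir" -type f -name '*.pdf' | sort > "$list_file"
}

# Intended use (not executed here):
#   make_pdf_list ./papers files.txt
#   mineru-open-api extract --list files.txt -o ./results/
```

Sorting is optional; it just keeps the list stable between runs so per-file status on stderr is easier to compare.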

Model comparison: vlm vs pipeline

| | vlm | pipeline |
| --- | --- | --- |
| Parsing accuracy | Higher; better at complex layouts, mixed content | Standard |
| Hallucination risk | May produce hallucinated text in rare cases | No hallucination (biggest advantage) |
| Best for | Academic papers, complex tables, intricate layouts | General documents where fidelity matters most |

When the user values accuracy and the document has complex formatting, suggest --model vlm. When the user prioritizes reliability and a no-hallucination guarantee, suggest --model pipeline (or omit --model to use auto).
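
As a rough sketch, that guidance can be expressed as a lookup from the user's stated priority to a flag. The priority keywords accuracy and reliability and the name model_flag are assumptions made for this example; only the --model values come from the table above.

```shell
# Hypothetical helper: map a stated priority to the --model flag suggested
# in the guidance above. The keywords are illustrative, not CLI options.
model_flag() {
  case "$1" in
    accuracy)    echo "--model vlm" ;;        # complex layouts, tables, formulas
    reliability) echo "--model pipeline" ;;   # no-hallucination priority
    *)           echo "" ;;                   # omit --model and let the CLI use auto
  esac
}

# Intended use (not executed here; the unquoted expansion is deliberate so
# the two words become two arguments):
#   mineru-open-api extract paper.pdf $(model_flag accuracy) -o ./out/
```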

crawl — Web page extraction

Fetch web pages and convert to structured Markdown.

mineru-open-api crawl https://example.com/article              # Markdown to stdout
mineru-open-api crawl https://example.com/article -f html      # HTML to stdout
mineru-open-api crawl https://example.com/article -o ./out/    # Save to file
mineru-open-api crawl url1 url2 -o ./pages/                    # Batch crawl
mineru-open-api crawl --list urls.txt -o ./pages/              # Batch from file list

crawl flags

| Flag | Short | Default | Description |
| --- | --- | --- | --- |
| --output | -o | (stdout) | Output path |
| --format | -f | md | Output formats: md, json, html (comma-separated) |
| --timeout | | 900/1800 | Timeout in seconds (single/batch) |
| --list | | | Read URL list from file (one per line) |
| --stdin-list | | false | Read URL list from stdin |
| --concurrency | | 0 | Batch concurrency |

auth — Authentication management

mineru-open-api auth              # Interactive token setup
mineru-open-api auth --verify     # Verify current token is valid
mineru-open-api auth --show       # Show current token source and masked value

Supported --language values

Values are organized by script/language family — each value covers all languages in its group.

Standalone language packs

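| Value | Included languages | Notes |
| --- | --- | --- |
| ch | Chinese, English, Chinese Traditional | Chinese and English (default) |
| ch_server | Chinese, English, Chinese Traditional, Japanese | Traditional Chinese, handwriting |
| en | English | English only |
| japan | Chinese, English, Chinese Traditional, Japanese | Primarily Japanese |
| korean | Korean, English | Korean |
| chinese_cht | Chinese, English, Chinese Traditional, Japanese | Primarily Traditional Chinese |
| ta | Tamil, English | Tamil |
| te | Telugu, English | Telugu |
| ka | Kannada | Kannada |
| el | Greek, English | Greek |
| th | Thai, English | Thai |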

Language family packs

| Value | Script/Family | Included languages |
| --- | --- | --- |
| latin | Latin script | French, German, Afrikaans, Italian, Spanish, Bosnian, Portuguese, Czech, Welsh, Danish, Estonian, Irish, Croatian, Uzbek, Hungarian, Serbian (Latin), Indonesian, Occitan, Icelandic, Lithuanian, Maori, Malay, Dutch, Norwegian, Polish, Slovak, Slovenian, Albanian, Swedish, Swahili, Tagalog, Turkish, Latin, Azerbaijani, Kurdish, Latvian, Maltese, Pali, Romanian, Vietnamese, Finnish, Basque, Galician, Luxembourgish, Romansh, Catalan, Quechua |
| arabic | Arabic script | Arabic, Persian, Uyghur, Urdu, Pashto, Kurdish, Sindhi, Balochi, English |
| cyrillic | Cyrillic script | Russian, Belarusian, Ukrainian, Serbian (Cyrillic), Bulgarian, Mongolian, Abkhazian, Adyghe, Kabardian, Avar, Dargin, Ingush, Chechen, Lak, Lezgin, Tabasaran, Kazakh, Kyrgyz, Tajik, Macedonian, Tatar, Chuvash, Bashkir, Malian, Moldovan, Udmurt, Komi, Ossetian, Buryat, Kalmyk, Tuvan, Sakha, Karakalpak, English |
| east_slavic | East Slavic | Russian, Belarusian, Ukrainian, English |
| devanagari | Devanagari script | Hindi, Marathi, Nepali, Bihari, Maithili, Angika, Bhojpuri, Magahi, Santali, Newari, Konkani, Sanskrit, Haryanvi, English |

Global flags

| Flag | Short | Description |
| --- | --- | --- |
| --token | | API token (overrides env and config) |
| --base-url | | API base URL (for private deployments) |
| --verbose | -v | Verbose mode, print HTTP details |

Output behavior

  • No -o flag: result goes to stdout; status/progress messages go to stderr
  • With -o flag: result saved to file/directory; progress messages on stderr
  • Batch mode (extract/crawl): requires -o to specify output directory
  • Binary formats (docx): cannot output to stdout, must use -o
  • Markdown output includes extracted images saved alongside the .md file
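
Two of the rules above (batch input and binary formats) decide when -o is mandatory, and they can be checked before a command is run. The following is a minimal pre-flight sketch; needs_output_dir is a hypothetical helper and it treats docx as the only binary format, since that is the only one named here.

```shell
# Hypothetical pre-flight check: succeeds (exit 0) when the rules above say
# -o is required, i.e. more than one input or a docx output format.
needs_output_dir() {
  formats="$1"       # comma-separated format list, e.g. "md,docx"
  shift
  input_count=$#     # remaining arguments are the inputs
  if [ "$input_count" -gt 1 ]; then
    return 0         # batch mode requires an output directory
  fi
  case ",$formats," in
    *,docx,*) return 0 ;;   # binary format cannot go to stdout
  esac
  return 1
}
```

For example, needs_output_dir "md" report.pdf fails (stdout is fine), while needs_output_dir "md,docx" report.pdf succeeds, signalling that -o should be added.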

Agent guidelines

When using this skill on behalf of the user:

  • Quote file paths that contain spaces or special characters with double quotes. Example: mineru-open-api extract "report 01.pdf".
  • Don't run commands blindly on errors — explain the exit code and troubleshooting steps.
  • Installation questions (e.g. "mineru 怎么安装", "how do I install mineru") should be answered with the install instructions above.
  • For stdout mode (no -o), only one text format can be output at a time. If the user wants multiple formats, suggest adding -o.
  • If the user hasn't authenticated yet, guide them to create a token at https://mineru.net/apiManage/token and run mineru-open-api auth.

Default output directory

When the user does NOT specify -o, generate a default output directory:

~/MinerU-Skill/<name>_<hash>/
  • <name>: derived from the source, then sanitized (replace spaces and shell-unsafe characters with _, collapse consecutive _).
    • For URLs: last path segment (e.g. https://arxiv.org/pdf/2509.22186 → 2509.22186)
    • For local files: filename without extension (e.g. report.pdf → report)
  • <hash>: first 6 characters of MD5 hash of the full original source.
printf '%s' "source" | md5sum | cut -c1-6   # Linux
printf '%s' "source" | md5 | cut -c1-6      # macOS

When the user specifies -o: use the user's path as-is.
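
Putting the naming rules together, one possible way to build the default directory is sketched below. default_out_dir is a hypothetical helper, and the set of characters treated as safe (letters, digits, dot, underscore, hyphen) is an assumption, since the rules only say "spaces and shell-unsafe characters".

```shell
# Hypothetical helper: build ~/MinerU-Skill/<name>_<hash>/ from a source.
# Assumption: anything outside A-Za-z0-9._- counts as shell-unsafe.
default_out_dir() {
  src="$1"
  case "$src" in
    http://*|https://*)
      name="${src%/}"       # drop one trailing slash, if present
      name="${name##*/}"    # last path segment
      ;;
    *)
      name="${src##*/}"     # strip leading directories
      name="${name%.*}"     # strip the extension
      ;;
  esac
  # Replace unsafe characters with _, then collapse consecutive _.
  name=$(printf '%s' "$name" | tr -c 'A-Za-z0-9._-' '_' | tr -s '_')
  # First 6 characters of the MD5 of the full original source.
  if command -v md5sum >/dev/null 2>&1; then
    hash=$(printf '%s' "$src" | md5sum | cut -c1-6)   # Linux
  else
    hash=$(printf '%s' "$src" | md5 | cut -c1-6)      # macOS
  fi
  printf '%s/MinerU-Skill/%s_%s/\n' "$HOME" "$name" "$hash"
}
```

Calling default_out_dir "my report.pdf" yields a path ending in my_report_ followed by six hash characters; the hash is taken over the unsanitized source so that two sources with the same sanitized name still get distinct directories.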

Skill upgrade = CLI upgrade

When the user asks to upgrade this skill, re-install the CLI first:

npm install -g mineru-open-api@latest

Exit codes

| Code | Meaning | Recovery |
| --- | --- | --- |
| 0 | Success | |
| 1 | General API or unknown error | Check network; retry; use --verbose |
| 2 | Invalid parameters / usage error | Check command syntax and flag values |
| 3 | Authentication error | Create or refresh token at https://mineru.net/apiManage/token, then run mineru-open-api auth |
| 4 | File too large or page limit exceeded | Split the file or use --pages |
| 5 | Extraction failed | Document may be corrupted; try a different --model |
| 6 | Timeout | Increase with --timeout; large files may need 1600+ seconds |
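
To explain a failure rather than just report it, the table can be turned into a lookup that is called with the CLI's exit status. This is a sketch: explain_exit is a hypothetical helper, and its messages simply restate the table.

```shell
# Hypothetical helper: translate a documented exit code into a short
# explanation plus the recovery step from the table above.
explain_exit() {
  case "$1" in
    0) echo "Success" ;;
    1) echo "General API or unknown error: check network, retry, use --verbose" ;;
    2) echo "Invalid parameters: check command syntax and flag values" ;;
    3) echo "Authentication error: create or refresh the token, then run mineru-open-api auth" ;;
    4) echo "File too large or page limit exceeded: split the file or use --pages" ;;
    5) echo "Extraction failed: document may be corrupted, try a different --model" ;;
    6) echo "Timeout: increase --timeout" ;;
    *) echo "Undocumented exit code: $1" ;;
  esac
}

# Intended use (not executed here):
#   mineru-open-api extract report.pdf -o ./out/
#   explain_exit "$?"
```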

Troubleshooting

  • "no API token found": Run mineru-open-api auth or set MINERU_TOKEN env variable. Create token at https://mineru.net/apiManage/token.
  • Timeout on large files: Increase with --timeout 1600
  • Batch fails partially: Check stderr for per-file status; succeeded files are still saved
  • Binary format to stdout: Use -o flag; docx cannot stream to stdout
  • Private deployment: Use --base-url https://your-server.com/api
  • Extraction quality is poor: Try --model vlm for complex layouts, or --ocr for scanned documents
  • Tables not extracted correctly: Try --model vlm for better table recognition
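
The timeout advice can be automated with a wrapper that retries once with a longer limit when the command exits with code 6. This is a sketch under assumptions: extract_with_retry is a hypothetical name, the retry appends --timeout 1600 after the original arguments, and it presumes the original command did not already pass --timeout.

```shell
# Hypothetical wrapper: run a command; if it exits with 6 (timeout),
# run it once more with --timeout 1600 appended. Other codes pass through.
extract_with_retry() {
  status=0
  "$@" || status=$?
  if [ "$status" -eq 6 ]; then
    status=0
    "$@" --timeout 1600 || status=$?
  fi
  return "$status"
}

# Intended use (not executed here):
#   extract_with_retry mineru-open-api extract big.pdf -o ./out/
```

Only the timeout code triggers a retry; authentication or parameter errors are returned unchanged so they can be explained to the user instead of being repeated.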
