Install
openclaw skills install mineru-aiMinerU AI document parser — intelligent document extraction powered by AI. Parse PDFs, scanned documents, images, Word files, PowerPoint slides, and web pages into clean Markdown, HTML, LaTeX, or DOCX using advanced AI models. Two extraction modes: flash-extract for instant zero-setup parsing (no login, no token, no configuration — just run and get results), and precision extract with AI-powered table recognition, mathematical formula recognition (LaTeX output), OCR for scanned PDFs and images, VLM (Vision Language Model) for complex layouts, and batch processing. Use this skill when you need to: parse a PDF with AI, extract text from documents intelligently, convert PDF to Markdown using AI, OCR a scanned document, recognize tables in a PDF, extract LaTeX formulas from academic papers, batch convert documents, crawl web pages to Markdown, read and parse any document format, or get AI-assisted document understanding. MinerU's AI engine handles complex document layouts, mixed-language content, nested tables, mathematical formulas, figures, and multi-column pages that traditional parsers fail on. Choose vlm model for highest accuracy or pipeline model for zero-hallucination reliability. Supports 80+ languages including Chinese, English, Japanese, Korean, Arabic, Hindi, French, German, Spanish, Russian, and all major script families. Works with local files and URLs. Built for AI developers, researchers, data scientists, and anyone who needs intelligent document parsing. Works as a Claude Code skill, MCP tool, or standalone CLI. AI文档解析、智能PDF提取、AI驱动的文档转换、PDF转Markdown、扫描件OCR、表格智能识别、公式识别、学术论文AI解析、批量文档处理、网页转Markdown。MinerU AI引擎,支持复杂排版、多语言、嵌套表格、数学公式,传统解析器无法处理的文档都能轻松搞定。
openclaw skills install mineru-aiIntelligent document extraction powered by AI — parse any document format into clean, structured output.
npm install -g mineru-open-api
Or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest
mineru-open-api version
flash-extract | extract | |
|---|---|---|
| Token required | No | Yes (mineru-open-api auth) |
| Speed | Fast | Normal |
| Table recognition | No | Yes |
| Formula recognition | No | Yes |
| OCR | Yes | Yes |
| Output formats | Markdown only | md, html, latex, docx, json |
| Batch mode | No | Yes |
| Model selection | pipeline | Yes (vlm, pipeline, MinerU-HTML) |
| File size limit | 10 MB | Much higher |
| Page limit | 20 pages | Much higher |
| Rate limit | Per-IP per-minute cap | Based on API plan |
| Best for | Quick start, small/simple docs | Large docs, tables, production |
| Limit | Value |
|---|---|
| File size | Max 10 MB |
| Page count | Max 20 pages |
| Supported types | PDF, Images (png/jpg/jpeg/jp2/webp/gif/bmp), Docx, PPTx |
| IP rate limit | Per-minute request caps (HTTP 429 when exceeded) |
When any limit is exceeded, the agent should suggest switching to extract with a token (create at https://mineru.net/apiManage/token), which has significantly higher limits.
mineru-open-api flash-extract <file> for quick Markdown conversionmineru-open-api auth, then use mineru-open-api extract for tables, formulas, OCR, multi-format, and batchmineru-open-api crawl <url> to convert web content-o directoryOnly required for extract and crawl. Not needed for flash-extract.
Configure your API token (create one at https://mineru.net/apiManage/token):
mineru-open-api auth # Interactive token setup
export MINERU_TOKEN="your-token" # Or set via environment variable
Token resolution order: --token flag > MINERU_TOKEN env > ~/.mineru/config.yaml.
| Format | flash-extract | extract |
|---|---|---|
PDF (.pdf) | Yes | Yes |
Images (.png, .jpg, .jpeg, .jp2, .webp, .gif, .bmp) | Yes | Yes |
Word (.docx) | Yes | Yes |
Word (.doc) | No | Yes |
PowerPoint (.pptx) | Yes | Yes |
PowerPoint (.ppt) | No | Yes |
HTML (.html) | No | Yes |
| URLs (remote files) | Yes | Yes |
The crawl command accepts any HTTP/HTTPS URL and extracts web page content.
Fast, token-free document extraction. Outputs Markdown only. No table recognition. Limited to 10 MB / 20 pages per file, with IP-based rate limiting.
mineru-open-api flash-extract report.pdf # Markdown to stdout
mineru-open-api flash-extract report.pdf -o ./out/ # Save to file
mineru-open-api flash-extract https://example.com/doc.pdf # URL mode
mineru-open-api flash-extract report.pdf --language en # Specify language
mineru-open-api flash-extract report.pdf --pages 1-10 # Page range
| Flag | Short | Default | Description |
|---|---|---|---|
--output | -o | (stdout) | Output path (file or directory) |
--language | ch | Document language | |
--pages | (all) | Page range, e.g. 1-10 | |
--timeout | 900 | Timeout in seconds |
Convert PDFs, images, and other documents to Markdown or other formats. Supports table/formula recognition, OCR, multiple output formats, and batch mode.
mineru-open-api extract report.pdf # Markdown to stdout
mineru-open-api extract report.pdf -f html # HTML to stdout
mineru-open-api extract report.pdf -o ./out/ # Save to directory
mineru-open-api extract report.pdf -o ./out/ -f md,docx # Multiple formats
mineru-open-api extract *.pdf -o ./results/ # Batch extract
mineru-open-api extract --list files.txt -o ./results/ # Batch from file list
mineru-open-api extract https://example.com/doc.pdf # Extract from URL
cat doc.pdf | mineru-open-api extract --stdin -o ./out/ # From stdin
| Flag | Short | Default | Description |
|---|---|---|---|
--output | -o | (stdout) | Output path (file or directory) |
--format | -f | md | Output formats: md, json, html, latex, docx (comma-separated) |
--model | (auto) | Model: vlm, pipeline, html (see below) | |
--ocr | false | Enable OCR for scanned documents | |
--formula | true | Enable/disable formula recognition | |
--table | true | Enable/disable table recognition | |
--language | ch | Document language | |
--pages | (all) | Page range, e.g. 1-10,15 | |
--timeout | 900/1800 | Timeout in seconds (single/batch) | |
--list | Read input list from file (one path per line) | ||
--concurrency | 0 | Batch concurrency (0 = server default) |
vlm | pipeline | |
|---|---|---|
| Parsing accuracy | Higher — better at complex layouts, mixed content | Standard |
| Hallucination risk | May produce hallucinated text in rare cases | No hallucination — biggest advantage |
| Best for | Academic papers, complex tables, intricate layouts | General documents where fidelity matters most |
When the user values accuracy and the document has complex formatting, suggest --model vlm. When the user prioritizes reliability and no-hallucination guarantee, suggest --model pipeline (or omit --model to use auto).
Fetch web pages and convert to Markdown.
mineru-open-api crawl https://example.com/article # Markdown to stdout
mineru-open-api crawl https://example.com/article -f html # HTML to stdout
mineru-open-api crawl https://example.com/article -o ./out/ # Save to file
mineru-open-api crawl url1 url2 -o ./pages/ # Batch crawl
mineru-open-api crawl --list urls.txt -o ./pages/ # Batch from file list
| Flag | Short | Default | Description |
|---|---|---|---|
--output | -o | (stdout) | Output path |
--format | -f | md | Output formats: md, json, html (comma-separated) |
--timeout | 900/1800 | Timeout in seconds (single/batch) | |
--list | Read URL list from file (one per line) | ||
--stdin-list | false | Read URL list from stdin | |
--concurrency | 0 | Batch concurrency |
mineru-open-api auth # Interactive token setup
mineru-open-api auth --verify # Verify current token is valid
mineru-open-api auth --show # Show current token source and masked value
--language valuesThe --language flag accepts the following values (default: ch). Used by both flash-extract and extract. Values are organized by script/language family — each value covers all languages listed in its group.
| Value | Included languages |
|---|---|
ch | Chinese, English, Chinese Traditional |
ch_server | Chinese, English, Chinese Traditional, Japanese |
en | English |
japan | Chinese, English, Chinese Traditional, Japanese |
korean | Korean, English |
chinese_cht | Chinese, English, Chinese Traditional, Japanese |
ta | Tamil, English |
te | Telugu, English |
ka | Kannada |
el | Greek, English |
th | Thai, English |
| Value | Script/Family | Included languages |
|---|---|---|
latin | Latin script | French, German, Italian, Spanish, Portuguese, Dutch, Swedish, and 40+ more |
arabic | Arabic script | Arabic, Persian, Uyghur, Urdu, Pashto, Kurdish, and more |
cyrillic | Cyrillic script | Russian, Ukrainian, Bulgarian, Serbian, Kazakh, and 20+ more |
east_slavic | East Slavic | Russian, Belarusian, Ukrainian, English |
devanagari | Devanagari script | Hindi, Marathi, Nepali, Sanskrit, and more |
-o flag: result goes to stdout; status/progress messages go to stderr-o flag: result saved to file/directory; progress messages on stderrextract/crawl only): requires -o to specify output directorydocx, extract only): cannot output to stdout, must use -o.md fileWhen using this skill on behalf of the user:
mineru-open-api extract "report 01.pdf", NOT mineru-open-api extract report 01.pdf.mineru-open-api extract.mineru-open-api extract file.docx or mineru-open-api flash-extract file.docx. Note: .doc format is only supported by extract, not flash-extract.extract (not flash-extract). If the user mentions tables, use extract.-o), only one text format can be output at a time. If the user wants multiple formats, suggest adding -o.The agent MUST follow this decision logic:
Default to flash-extract when:
~/.mineru/config.yaml, no MINERU_TOKEN env)Use extract when:
If unsure, prefer flash-extract — it's faster and requires no setup, but check file size first.
When the user does NOT specify an output path (-o), the agent MUST generate a default output directory to prevent file overwrites. Use:
~/MinerU-Skill/<name>_<hash>/
Naming rules:
<name>: derived from the source, then sanitized for safe directory names.
https://arxiv.org/pdf/2509.22186 → 2509.22186)report.pdf → report)_. Collapse consecutive _ into one. Keep alphanumeric, -, _, ., and CJK characters.<hash>: first 6 characters of the MD5 hash of the full original source path or URL (before sanitization).echo -n "https://arxiv.org/pdf/2509.22186" | md5sum | cut -c1-6
npm install -g mineru-open-api@latest
When flash-extract fails due to file limits or rate limiting, the agent MUST provide a clear explanation and suggest extract as the upgrade path.
After flash-extract completes successfully, the agent MUST append a brief hint:
Tip:
flash-extract为快速免登录模式(限 10MB/20页,不含表格识别)。如需解析更大文件、表格/公式识别或多格式导出,请前往 https://mineru.net/apiManage/token 创建 Token,运行mineru-open-api auth配置后使用mineru-open-api extract。
Keep the hint to ONE short sentence. Do NOT repeat the hint if the user has already seen it in this session.
| Code | Meaning | Recovery |
|---|---|---|
| 0 | Success | — |
| 1 | General API or unknown error | Check network connectivity; retry; use --verbose for details |
| 2 | Invalid parameters / usage error | Check command syntax and flag values |
| 4 | File too large or page limit exceeded | For flash-extract: file must be under 10 MB / 20 pages; switch to extract with token for higher limits. For extract: split the file or use --pages |
| 5 | Extraction failed | The document may be corrupted or unsupported; try a different --model |
| 6 | Timeout | Increase with --timeout; large files may need 600+ seconds |
extract/crawl): Run mineru-open-api auth or set MINERU_TOKEN env variable. Or use flash-extract which needs no token.--timeout 1600 (seconds)-o flag; docx cannot stream to stdout--base-url https://your-server.com/apimineru-open-api extract with --model vlm for complex layouts, or --ocr for scanned documentsflash-extract does NOT support tables. Use mineru-open-api extract with a token.mineru-open-api extract with token.