pdf-parser-mineru

v1.0.2

A PDF parsing tool built on a local MinerU installation. It converts PDFs to Markdown, JSON, and other machine-readable formats.


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for baokui/pdf-parser-mineru.

Install the skill "pdf-parser-mineru" (baokui/pdf-parser-mineru) from ClawHub.
Skill page: https://clawhub.ai/baokui/pdf-parser-mineru
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install pdf-parser-mineru

ClawHub CLI


npx clawhub@latest install pdf-parser-mineru
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description match the included files: SKILL.md documents running MinerU and the repository provides an install script and a Python wrapper that invokes the mineru CLI. Required capabilities (MinerU installation, Python) are proportional to the stated parsing functionality.
Instruction Scope
Runtime instructions and the Python script stay within the skill's scope: they accept an absolute file path and output directory, run a local mineru CLI process, and read/return generated files. The script sets a couple of local env vars to control device selection for the subprocess but does not read or transmit unrelated system secrets or contact hidden endpoints itself. Note: mineru (the third-party tool) may perform network activity or model downloads — that behavior is external to the skill and should be reviewed if you need offline guarantees.
Install Mechanism
There is no platform install spec in the registry metadata, but the included install.sh runs pip and then 'uv pip install -U "mineru[all]"'. Installing MinerU from PyPI is expected here; it is a moderate-risk operation (it pulls packages from PyPI and may download models or data at runtime). The provided scripts use no obscure URLs, shorteners, or direct archive downloads.
Credentials
The skill requests no environment variables or credentials. The code sets PYTORCH_ENABLE_MPS_FALLBACK and MPS_DEVICE locally for the mineru subprocess (device control only). There are no requests for unrelated secrets or config paths.
Persistence & Privilege
Skill flags are standard (always: false, agent invocation allowed). The package does not request permanent system changes or modify other skills' configs. install.sh and the Python script only install MinerU and run it; they do not attempt to persist credentials or enable automatic always-on behavior.
Assessment
This skill is internally coherent: it installs MinerU and runs the mineru CLI to convert PDFs to Markdown/JSON. Before installing, consider the following: (1) mineru is a third-party PyPI package, so review its project page and dependencies, and prefer installing into an isolated virtual environment or container; (2) MinerU may download models or contact network endpoints at install time or at runtime, so if you need offline or sandboxed processing, verify or block network access; (3) the skill requires absolute file paths and can read any PDF you point it at, so avoid supplying sensitive documents to untrusted third-party binaries; (4) the included install.sh looks safe, but it runs pip installs and assumes Python 3.10–3.13, so run it manually rather than automatically if you want to inspect it first. For stronger assurance, review the mineru package source and any model download behavior before use.


v1.0.2 (latest) · 1.5k downloads · 0 stars · 3 versions · Updated 2mo ago · License: MIT-0

Tool List

1. pdf_to_markdown

Convert PDF documents to Markdown format, preserving document structure, formulas, tables, and images.

Description: Uses MinerU to parse a PDF document and write the result as Markdown. Supports OCR, formula recognition, and table extraction.

Parameters:

  • file_path (string, required): Absolute path to the PDF file
  • output_dir (string, required): Absolute path to the output directory
  • backend (string, optional): Parsing backend, options: hybrid-auto-engine (default), pipeline, vlm-auto-engine
  • language (string, optional): OCR language code, such as en (English), ch (Chinese), ja (Japanese), etc., defaults to auto-detection
  • enable_formula (boolean, optional): Whether to enable formula recognition, defaults to true
  • enable_table (boolean, optional): Whether to enable table extraction, defaults to true
  • start_page (integer, optional): Start page number (starting from 0), defaults to 0
  • end_page (integer, optional): End page number (starting from 0), defaults to -1 meaning parse all pages

Return Value:

{
  "success": true,
  "output_path": "/path/to/output",
  "markdown_content": "Converted Markdown content...",
  "images": ["List of image paths"],
  "tables": ["List of table information"],
  "formula_count": 10
}

Examples:

python .claude/skills/pdf-process/script/pdf_parser.py \
  '{"name": "pdf_to_markdown", "arguments": {"file_path": "/path/to/document.pdf", "output_dir": "/path/to/output"}}'

# Use specific backend
python .claude/skills/pdf-process/script/pdf_parser.py \
  '{"name": "pdf_to_markdown", "arguments": {"file_path": "/path/to/document.pdf", "output_dir": "/path/to/output", "backend": "pipeline"}}'

# Parse specific pages
python .claude/skills/pdf-process/script/pdf_parser.py \
  '{"name": "pdf_to_markdown", "arguments": {"file_path": "/path/to/document.pdf", "output_dir": "/path/to/output", "start_page": 0, "end_page": 5}}'
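The same calls can be assembled from Python instead of hand-writing the JSON string. The following is a minimal sketch, not part of the skill: `build_command` is a hypothetical helper, the script path is copied from the examples above, and the commented-out lines assume the script prints its JSON return value to stdout, which the source does not confirm.

```python
import json
import os
import sys

# Script path copied from the examples above; adjust it to your checkout.
SCRIPT = ".claude/skills/pdf-process/script/pdf_parser.py"


def build_command(tool, file_path, output_dir, **options):
    """Return the argv list for one tool call (hypothetical helper)."""
    # The skill requires absolute paths, so reject relative ones early.
    for label, path in (("file_path", file_path), ("output_dir", output_dir)):
        if not os.path.isabs(path):
            raise ValueError(f"{label} must be an absolute path: {path!r}")
    arguments = {"file_path": file_path, "output_dir": output_dir, **options}
    payload = json.dumps({"name": tool, "arguments": arguments})
    return [sys.executable, SCRIPT, payload]


cmd = build_command(
    "pdf_to_markdown",
    "/path/to/document.pdf",
    "/path/to/output",
    backend="pipeline",
)
print(cmd[2])

# To actually run it (assumption: the JSON result is printed to stdout):
# import subprocess
# completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
# result = json.loads(completed.stdout)
```

Passing the payload as a single argv element avoids shell quoting problems with the embedded JSON.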

2. pdf_to_json

Convert PDF documents to JSON format, including detailed layout and structural information.

Description: Uses MinerU to parse a PDF document and write the result as JSON, with structured information such as text blocks, images, tables, and formulas.

Parameters:

  • file_path (string, required): Absolute path to the PDF file
  • output_dir (string, required): Absolute path to the output directory
  • backend (string, optional): Parsing backend, options: hybrid-auto-engine (default), pipeline, vlm-auto-engine
  • language (string, optional): OCR language code, such as en (English), ch (Chinese), ja (Japanese), etc., defaults to auto-detection
  • enable_formula (boolean, optional): Whether to enable formula recognition, defaults to true
  • enable_table (boolean, optional): Whether to enable table extraction, defaults to true
  • start_page (integer, optional): Start page number (starting from 0), defaults to 0
  • end_page (integer, optional): End page number (starting from 0), defaults to -1 meaning parse all pages

Return Value:

{
  "success": true,
  "output_path": "/path/to/output.json",
  "pages": [
    {
      "page_no": 0,
      "page_size": [595, 842],
      "blocks": [
        {
          "type": "text",
          "text": "Text content",
          "bbox": [x0, y0, x1, y1]
        }
      ],
      "images": [],
      "tables": [],
      "formulas": []
    }
  ],
  "metadata": {
    "total_pages": 10,
    "author": "Author",
    "title": "Title"
  }
}

Examples:

python .claude/skills/pdf-process/script/pdf_parser.py \
  '{"name": "pdf_to_json", "arguments": {"file_path": "/path/to/document.pdf", "output_dir": "/path/to/output"}}'

# Use specific backend and language
python .claude/skills/pdf-process/script/pdf_parser.py \
  '{"name": "pdf_to_json", "arguments": {"file_path": "/path/to/document.pdf", "output_dir": "/path/to/output", "backend": "hybrid-auto-engine", "language": "ch"}}'
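Once the return value is loaded as a dictionary, the per-page text can be collected by walking `pages` and `blocks`. This is a sketch that assumes the structure shown under Return Value above; `text_by_page` is a hypothetical helper and the sample data, including the bbox numbers, is made up for illustration.

```python
def text_by_page(result):
    """Map each page_no to the list of strings found in its 'text' blocks."""
    pages = {}
    for page in result.get("pages", []):
        texts = [
            block.get("text", "")
            for block in page.get("blocks", [])
            if block.get("type") == "text"
        ]
        pages[page["page_no"]] = texts
    return pages


# Sample shaped like the documented return value (placeholder values).
sample = {
    "success": True,
    "output_path": "/path/to/output.json",
    "pages": [
        {
            "page_no": 0,
            "page_size": [595, 842],
            "blocks": [
                {"type": "text", "text": "Text content", "bbox": [72, 90, 523, 110]}
            ],
            "images": [],
            "tables": [],
            "formulas": [],
        }
    ],
    "metadata": {"total_pages": 10, "author": "Author", "title": "Title"},
}

print(text_by_page(sample))  # {0: ['Text content']}
```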

Installation Instructions

1. Install MinerU

# Update pip and install uv
pip install --upgrade pip
pip install uv

# Install MinerU (including all features)
uv pip install -U "mineru[all]"

2. Verify Installation

# Check if MinerU is installed successfully
mineru --version

# Test basic functionality
mineru --help

3. System Requirements

  • Python Version: 3.10-3.13
  • Operating System: Linux / Windows / macOS 14.0+
  • Memory:
    • Using pipeline backend: minimum 16GB, recommended 32GB+
    • Using hybrid/vlm backend: minimum 16GB, recommended 32GB+
  • Disk Space: minimum 20GB (SSD recommended)
  • GPU (optional):
    • pipeline backend: supports CPU-only
    • hybrid/vlm backend: requires NVIDIA GPU (Volta architecture and above) or Apple Silicon
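The Python version requirement can be checked before running the installer. A minimal sketch, assuming the 3.10-3.13 range above and the Windows limit of 3.10-3.12 mentioned under Troubleshooting; `python_supported` is a hypothetical helper, not something the skill ships.

```python
import platform
import sys


def python_supported(version=None, system=None):
    """Return True if (major, minor) is in the supported range for the OS."""
    version = tuple(version or sys.version_info[:2])[:2]
    system = system or platform.system()
    # Windows is capped at 3.12 per the troubleshooting notes (ray lacks 3.13).
    upper = (3, 12) if system == "Windows" else (3, 13)
    return (3, 10) <= version <= upper


if __name__ == "__main__":
    ok = python_supported()
    print("Python version supported:", ok)
```

This only covers the interpreter version; memory, disk, and GPU requirements still need to be checked by hand.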

Use Cases

  1. Academic Paper Parsing: Extract structured content such as formulas, tables, and images
  2. Technical Document Conversion: Convert PDF documents to Markdown for version control and online publishing
  3. OCR Processing: Process scanned PDFs and PDFs whose embedded text is garbled
  4. Multilingual Documents: OCR support for 109 languages
  5. Batch Processing: Batch convert multiple PDF documents
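For the batch processing case, one tool call can be generated per PDF in a directory. The sketch below only builds the JSON payloads; `batch_payloads` is a hypothetical helper, and giving each document its own output subdirectory named after the file is an assumption of this example, not a documented requirement.

```python
import json
from pathlib import Path


def batch_payloads(pdf_dir, output_root, tool="pdf_to_markdown"):
    """Build one JSON tool call per *.pdf file found directly in pdf_dir."""
    # resolve() yields absolute paths, which the skill requires.
    pdf_dir = Path(pdf_dir).resolve()
    output_root = Path(output_root).resolve()
    payloads = []
    for pdf in sorted(pdf_dir.glob("*.pdf")):
        arguments = {
            "file_path": str(pdf),
            # Assumption: one output subdirectory per document.
            "output_dir": str(output_root / pdf.stem),
        }
        payloads.append(json.dumps({"name": tool, "arguments": arguments}))
    return payloads
```

Each payload can then be passed as the single argument to pdf_parser.py, exactly as in the command-line examples.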

Backend Selection Recommendations

  • hybrid-auto-engine (default): Balanced accuracy and speed, suitable for most scenarios
  • pipeline: Suitable for CPU-only environments, best compatibility
  • vlm-auto-engine: Highest accuracy, requires GPU acceleration
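These recommendations reduce to a small decision rule. The following sketch is one possible reading of them, not logic taken from the skill: `choose_backend` is a hypothetical helper, and `has_gpu` is meant as "a supported NVIDIA GPU or Apple Silicon is available" per the system requirements.

```python
def choose_backend(has_gpu, prefer_accuracy=False):
    """Pick a backend name following the recommendations above."""
    if not has_gpu:
        # CPU-only environments: pipeline is the compatible choice.
        return "pipeline"
    # With acceleration available, trade speed for accuracy only on request.
    return "vlm-auto-engine" if prefer_accuracy else "hybrid-auto-engine"


print(choose_backend(has_gpu=False))                       # pipeline
print(choose_backend(has_gpu=True))                        # hybrid-auto-engine
print(choose_backend(has_gpu=True, prefer_accuracy=True))  # vlm-auto-engine
```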

Notes

  1. File Paths: All paths must be absolute paths
  2. Output Directory: Non-existent directories will be created automatically
  3. Performance: Using GPU can significantly improve parsing speed
  4. Page Numbers: Page numbers start counting from 0
  5. Memory: Processing large documents may consume more memory

Troubleshooting

Common Issues

  1. Installation Failure:

    • Make sure you are using Python 3.10-3.13
    • Windows only supports Python 3.10-3.12 (ray does not support 3.13)
    • Using uv pip install can resolve most dependency conflicts
  2. Insufficient Memory:

    • Use pipeline backend
    • Limit parsing pages: start_page and end_page
    • Reduce virtual memory allocation
  3. Slow Parsing Speed:

    • Enable GPU acceleration
    • Use hybrid-auto-engine backend
    • Disable unnecessary features (formulas, tables)
  4. Low OCR Accuracy:

    • Specify the correct document language
    • Ensure the backend supports OCR (use pipeline or hybrid-*)
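For the insufficient-memory case, limiting pages can be done systematically by splitting a document into fixed-size page ranges and parsing them one at a time. This is a sketch under one loud assumption: that end_page is inclusive, which the parameter descriptions do not state. `page_chunks` is a hypothetical helper.

```python
def page_chunks(total_pages, chunk_size):
    """Split a document into 0-based start_page/end_page ranges."""
    if total_pages < 0 or chunk_size < 1:
        raise ValueError("total_pages must be >= 0 and chunk_size >= 1")
    chunks = []
    for start in range(0, total_pages, chunk_size):
        # Assumption: end_page is inclusive, so the last index is start + size - 1.
        end = min(start + chunk_size, total_pages) - 1
        chunks.append({"start_page": start, "end_page": end})
    return chunks


print(page_chunks(10, 4))
# [{'start_page': 0, 'end_page': 3}, {'start_page': 4, 'end_page': 7}, {'start_page': 8, 'end_page': 9}]
```

Each dictionary can be merged into the arguments of a pdf_to_markdown or pdf_to_json call. If end_page turns out to be exclusive, drop the "- 1".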
