mineru-agent-free

v1.0.0

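Uses the MinerU Agent lightweight parsing API to convert PDF, Word, PPT, Excel, and image files to Markdown. No token is required; requests are rate-limited per IP. Suitable for document parsing, table extraction, and OCR.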

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for haxck/mineru-agent.

Install the skill "mineru-agent-free" (haxck/mineru-agent) from ClawHub.
Skill page: https://clawhub.ai/haxck/mineru-agent
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

openclaw skills install mineru-agent

ClawHub CLI

npx clawhub@latest install mineru-agent

Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)

Purpose & Capability
Name/description (document→Markdown parsing) matches the included code and SKILL.md. The Python script implements URL-based and file-upload-based parsing against https://mineru.net/api/v1/agent, which is exactly the stated purpose. Minor note: the script depends on Python and the 'requests' library, but the registry metadata did not explicitly declare these runtime dependencies.
Instruction Scope
Runtime instructions and the script are narrowly scoped to submitting documents (either by URL or uploading local files) to MinerU, polling for results, and returning Markdown. They do not attempt to read unrelated system files or environment variables. Important privacy implication: the instructions explicitly upload user files and download parsed Markdown from external URLs—expected for this purpose but potentially sensitive.
Install Mechanism
No install spec is provided (instruction-only plus a Python script). Nothing is downloaded from arbitrary URLs and no extract/install actions are defined. This lowers installation risk. The only operational requirement is that the agent environment has Python and the 'requests' package available.
Credentials
The skill requests no environment variables or credentials, which is proportionate. It does, however, transmit user files over the network to an external service (mineru.net) and to whatever OSS upload URL the service returns—this is necessary for the stated purpose but has privacy/egress implications.
Persistence & Privilege
The skill is not always-enabled and does not request persistent system privileges or modify other skills. It runs only when invoked (user-invocable/autonomous invocation allowed by default).
Assessment
This skill appears to do what it says: it uploads documents to mineru.net and returns Markdown. Before installing/using it, consider: (1) Privacy: any file you submit will be sent to mineru.net and then to the OSS URL the service provides—do not upload sensitive or confidential documents unless you trust MinerU and have reviewed its privacy policy. (2) Dependencies: ensure the runtime has Python 3 and the 'requests' package. (3) Test with non-sensitive files first to confirm behavior and confirm where uploaded files are stored (the service returns the upload URL). (4) Note the documented limits (10MB, 20 pages) and IP rate-limiting. If you need offline or self-hosted document parsing for sensitive data, prefer a local solution instead of this remote API.

Like a lobster shell, security has layers — review code before you run it.

Version: v1.0.0 (latest)
Downloads: 136
Stars: 1
Versions: 1
Updated: 1mo ago
License: MIT-0

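📄 MinerU - Document Parsing

PDF/Word/PPT/Excel/images → structured Markdown

🎯 Trigger Conditions

Use this skill when the user asks to parse a document, extract document content, run OCR, or convert a file to Markdown.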

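🔗 API Information

Base URL: https://mineru.net/api/v1/agent (no token required; requests are rate-limited per IP)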

📋 Supported File Types

Type | Format
📕 PDF | Papers, books, scanned documents
📝 Word | .docx
📊 PPT | .pptx
📊 Excel | .xls, .xlsx
🖼️ Images | .png, .jpg, .jpeg, .jp2, .webp, .gif, .bmp

⚠️ Limits

Limit | Value
File size | 10 MB
Page count | 20 pages
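
The file-type and size limits above can be checked locally before anything is sent over the network. The following is only a sketch: `check_document` and `check_path` are hypothetical helper names, the `.pdf` extension is assumed for the PDF row, and "10 MB" is assumed to mean 10 × 1024 × 1024 bytes. The 20-page limit is not checked here because that would need a PDF library.

```python
import os

# ASSUMPTION: "10 MB" is interpreted as 10 MiB; the service may count differently.
MAX_BYTES = 10 * 1024 * 1024

# Extensions from the supported-types table; ".pdf" is assumed for the PDF row.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".pptx", ".xls", ".xlsx",
    ".png", ".jpg", ".jpeg", ".jp2", ".webp", ".gif", ".bmp",
}


def check_document(name, size_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = os.path.splitext(name)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems


def check_path(path):
    """Convenience wrapper that reads the name and size from a local file."""
    return check_document(os.path.basename(path), os.path.getsize(path))
```

Running a check like this first avoids spending a rate-limited request on a file the service would reject anyway.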

🚀 Usage

Method 1: URL parsing (when the file has a public URL)

Call the parsing script directly:

python3 SKILL_DIR/scripts/mineru_parse.py --url "https://example.com/file.pdf"

Optional parameters:

  • --language ch|en (default: ch)
  • --page_range 1-10 (PDF only)
  • --output /path/to/output.md (write the result to this file)

Method 2: File upload parsing (local files)

python3 SKILL_DIR/scripts/mineru_parse.py --file /path/to/document.pdf

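Method 3: Use directly in conversation

When the user sends a file or provides a file path or URL, call the script to parse it and return the result to the user.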
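
As a rough illustration of how an agent might choose between the two invocation modes, the sketch below builds the command line from a single input string. `build_command` and `run_parse` are hypothetical names; only the script path and the flags (`--url`, `--file`, `--language`, `--page_range`, `--output`) come from this page. It is also an assumption that the script prints the Markdown to standard output when `--output` is omitted.

```python
import subprocess


def build_command(source, skill_dir="SKILL_DIR", language="ch",
                  page_range=None, output=None):
    """Build the argument list for mineru_parse.py from a URL or a local path."""
    script = f"{skill_dir}/scripts/mineru_parse.py"
    # Anything that starts with http:// or https:// is treated as a public URL.
    is_url = source.lower().startswith(("http://", "https://"))
    cmd = ["python3", script, "--url" if is_url else "--file", source,
           "--language", language]
    if page_range:
        cmd += ["--page_range", page_range]  # documented as PDF only
    if output:
        cmd += ["--output", output]
    return cmd


def run_parse(source, **options):
    """Run the script and return its standard output.

    ASSUMPTION: the Markdown is written to stdout when --output is not given.
    """
    result = subprocess.run(build_command(source, **options),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the arguments as a list rather than a shell string keeps file paths with spaces or quotes from being misinterpreted.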

🔄 API Flow

URL mode

  1. POST /parse/url → obtain a task_id
  2. GET /parse/{task_id} → poll until the task is done
  3. Download markdown_url and return the result
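
The three steps above can be sketched with the standard library as follows. This is not the bundled script (which reportedly uses `requests`); it is a minimal illustration under several assumptions: the request body keys (`url`, `language`), a flat JSON reply, and a status field named `state` whose finished value is `done` are all guesses. Only the endpoint paths, `task_id`, and `markdown_url` appear on this page. The transport functions are injectable so the flow can be exercised without network access.

```python
import json
import time
import urllib.request

BASE_URL = "https://mineru.net/api/v1/agent"  # base URL named in the security review


def http_json(method, url, payload=None):
    """Send a request with an optional JSON body and decode the JSON reply."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(url, data=data, method=method,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def http_text(url):
    """Download a text resource such as the resulting Markdown."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def parse_by_url(file_url, language="ch", call=http_json, fetch=http_text,
                 interval=2.0, max_polls=60, sleep=time.sleep):
    # Step 1: submit the document URL (ASSUMED body keys and flat reply).
    task = call("POST", f"{BASE_URL}/parse/url",
                {"url": file_url, "language": language})
    task_id = task["task_id"]
    # Step 2: poll the task until it reports completion (ASSUMED field "state").
    for _ in range(max_polls):
        status = call("GET", f"{BASE_URL}/parse/{task_id}")
        if status.get("state") == "done":
            # Step 3: download the Markdown from markdown_url.
            return fetch(status["markdown_url"])
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

A bounded number of polls with a fixed interval keeps the client from hammering an IP-rate-limited endpoint if a task stalls.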

File upload mode

  1. POST /parse/file → obtain a task_id and a file_url
  2. PUT file_url → upload the file to OSS
  3. GET /parse/{task_id} → poll until the task is done
  4. Download markdown_url and return the result
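
The upload flow adds one step: the file bytes are sent with an HTTP PUT to the `file_url` returned by the service. The sketch below is a standard-library illustration, not the bundled script. The request body key `file_name`, the flat JSON reply, and the `state` field with value `done` are assumptions; only the endpoint paths, `task_id`, `file_url`, and `markdown_url` come from this page. A presigned OSS URL may also require specific headers, which this sketch does not try to guess.

```python
import json
import os
import time
import urllib.request

BASE_URL = "https://mineru.net/api/v1/agent"  # base URL named in the security review


def http_json(method, url, payload=None):
    """Send a request with an optional JSON body and decode the JSON reply."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(url, data=data, method=method,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def http_put(url, body):
    """Upload raw bytes with PUT; returns the HTTP status code."""
    req = urllib.request.Request(url, data=body, method="PUT")
    with urllib.request.urlopen(req, timeout=120) as resp:
        return resp.status


def http_text(url):
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def parse_by_file(path, call=http_json, put=http_put, fetch=http_text,
                  interval=2.0, max_polls=60, sleep=time.sleep):
    with open(path, "rb") as fh:
        body = fh.read()
    # Step 1: register the upload (ASSUMED body key "file_name").
    task = call("POST", f"{BASE_URL}/parse/file",
                {"file_name": os.path.basename(path)})
    # Step 2: upload the bytes to the URL the service returned.
    put(task["file_url"], body)
    # Step 3: poll until the task reports completion (ASSUMED field "state").
    for _ in range(max_polls):
        status = call("GET", f"{BASE_URL}/parse/{task['task_id']}")
        if status.get("state") == "done":
            # Step 4: download the Markdown.
            return fetch(status["markdown_url"])
        sleep(interval)
    raise TimeoutError(f"task {task['task_id']} did not finish")
```

Note that step 2 sends the document to a storage host chosen by the service, which is the egress path the security review calls out.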

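❌ Error Codes

Error code | Description | Recommended action
-30001 | File size exceeds the limit (10 MB) | Split the file or inform the user
-30002 | Unsupported file type | Check the file format
-30003 | Page count exceeds the limit | Specify page_range to split the job
-30004 | Invalid request parameters | Check the required parameters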
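
One way to act on these codes is a small lookup that turns a numeric code into a message for the user. The sketch below is illustrative only: `describe_error` is a hypothetical helper, the fallback text for unlisted codes is an addition of this example, and how the code is carried in the API response (field name, nesting) is not documented here, so the function simply takes the integer.

```python
# Codes, descriptions, and actions copied from the error-code table.
ERROR_CODES = {
    -30001: ("File size exceeds the limit (10 MB)", "Split the file or inform the user"),
    -30002: ("Unsupported file type", "Check the file format"),
    -30003: ("Page count exceeds the limit", "Specify page_range to split the job"),
    -30004: ("Invalid request parameters", "Check the required parameters"),
}


def describe_error(code):
    """Return a one-line explanation for a MinerU error code."""
    # The fallback for unlisted codes is this example's own choice.
    description, action = ERROR_CODES.get(
        code, ("Unknown error", "Report the raw code to the user"))
    return f"[{code}] {description}. Suggested action: {action}."
```

For example, `describe_error(-30003)` points the user toward `--page_range` instead of surfacing a bare number.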

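💡 Tips

  1. Use language: ch for Chinese documents
  2. For large files, specify page_range and parse in segments
  3. Word/PPT files are parsed through the native Office API, which is the fastest path
  4. The result is Markdown and can be used directly in downstream processing

Lightweight and fast, no token required! 📄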
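
The segmenting tip can be made concrete with a helper that produces `--page_range` values in the `1-10` style shown earlier. This is a sketch under assumptions: `page_ranges` is a hypothetical name, the total page count must be known in advance, and it is assumed that the 20-page limit applies to the selected range rather than to the whole file.

```python
def page_ranges(total_pages, chunk=20):
    """Split pages 1..total_pages into consecutive 'start-end' ranges."""
    if total_pages < 1 or chunk < 1:
        raise ValueError("total_pages and chunk must be positive")
    return [
        f"{start}-{min(start + chunk - 1, total_pages)}"
        for start in range(1, total_pages + 1, chunk)
    ]
```

Each returned string can be passed as one `--page_range` argument, and the resulting Markdown pieces concatenated in order.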
