Doc Analysis

v0.4.0

Analyze the structure, layout, and content of Word documents (.doc, .docx) using MinerU. Returns structured Markdown with headings, paragraphs, tables, and l...

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for mzlzyca/doc-analysis.

Install the skill "Doc Analysis" (mzlzyca/doc-analysis) from ClawHub.
Skill page: https://clawhub.ai/mzlzyca/doc-analysis
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: MINERU_TOKEN
Required binaries: mineru-open-api
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install doc-analysis

ClawHub CLI

npx clawhub@latest install doc-analysis

Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)

Purpose & Capability

The name and description (Word document analysis) match the declared binary (mineru-open-api) and the single required env var (MINERU_TOKEN). Requiring a MinerU CLI and token is expected for a hosted/open-source document analysis service.

Instruction Scope

Runtime instructions stick to running the mineru-open-api CLI on local files or URLs and handling stdout/stderr. One minor inconsistency: SKILL.md notes a 'flash-extract' mode that requires no token while metadata marks MINERU_TOKEN as required; this is likely an over-assertion in metadata rather than malicious scope creep.

Install Mechanism

Installers are standard: an npm package and a 'go install' from a GitHub repo. No download-from-untrusted-URL or archive extraction steps are present. These are moderate-risk (npm/GitHub) but appropriate for a CLI tool.

Credentials

Only one credential is requested (MINERU_TOKEN), which is proportional for a remote MinerU service. The SKILL.md's mention that some quick extraction works without a token suggests the token may not be strictly required for all operations; metadata requiring it unconditionally is slightly overbroad but not a strong red flag.

Persistence & Privilege

'always' is set to false, the skill is user-invocable, and it does not request to modify other skills or system-wide configs. It only requires presence of the mineru-open-api binary.

Assessment

This skill appears coherent and does what it claims: it runs the MinerU CLI against .doc/.docx files and uses a MINERU_TOKEN for authenticated extracts. Before installing:

  1. Confirm you trust the npm package name and the GitHub repo (inspect the repo/source if you can).
  2. Understand that documents processed by the CLI may be sent to MinerU servers when using the authenticated 'extract' mode; avoid sending highly sensitive documents unless you've verified the service's privacy/security.
  3. If you only need quick, local/no-token extraction, check whether 'flash-extract' actually operates without a token in your environment.
  4. Prefer installing in a sandbox or container first and verify behavior and network activity if you have strict security requirements.

Runtime requirements

📄 Clawdis
Bins: mineru-open-api
Env: MINERU_TOKEN
Primary env: MINERU_TOKEN

Install

Install via npm
Bins: mineru-open-api
npm i -g mineru-open-api
Install via go install
Bins: mineru-open-api

latest: vk97dqdt7g58xbmvpt4bme9d939844pmh
216 downloads
0 stars
6 versions
Updated 3w ago
v0.4.0
MIT-0

Doc Analysis

Analyze and extract structured content from Word (.doc/.docx) files using MinerU. Returns Markdown with layout, headings, and structure preserved.

Install

npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

Quick Start

# Analyze a .docx file (requires token)
mineru-open-api extract report.docx -o ./out/

# Analyze a .doc file (requires token)
mineru-open-api extract report.doc -o ./out/

# Specify language
mineru-open-api extract report.docx --language en -o ./out/
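
For a folder of Word files, the same extract call can be repeated per file. The following is a minimal sketch, not part of the skill: plan_batch is a hypothetical helper that only prints the commands it would run (a dry run), using nothing beyond the extract subcommand and the -o flag shown above.

```shell
# Hypothetical helper (not shipped with mineru-open-api): print one
# "extract" command per .doc/.docx file found directly inside a directory.
# It only prints; pipe the output to `sh` to actually run the commands.
plan_batch() {
  src=$1   # directory holding the Word files
  out=$2   # output directory passed to -o
  for f in "$src"/*.doc "$src"/*.docx; do
    [ -e "$f" ] || continue   # skip glob patterns that matched nothing
    printf 'mineru-open-api extract "%s" -o "%s"\n' "$f" "$out"
  done
}
```

Calling plan_batch ./docs ./out lists the planned commands so they can be reviewed before anything is sent to MinerU.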

Authentication

A token is required for extract:

mineru-open-api auth             # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable

Create token at: https://mineru.net/apiManage/token
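
Before calling extract, it can help to confirm that the binary is on PATH and that a token is available. The sketch below is an assumption-laden illustration: preflight and its messages are hypothetical, and it only inspects the MINERU_TOKEN environment variable, so a token stored through the interactive mineru-open-api auth flow would not be detected.

```shell
# Hypothetical preflight check (function name and messages are illustrative).
# Returns 0 if the binary is found and MINERU_TOKEN is set, 1 if the binary
# is missing, 2 if the token variable is empty or unset.
preflight() {
  bin=${1:-mineru-open-api}   # binary to look for; defaults to the CLI name
  if ! command -v "$bin" >/dev/null 2>&1; then
    echo "missing binary: $bin" >&2
    return 1
  fi
  if [ -z "${MINERU_TOKEN:-}" ]; then
    echo "MINERU_TOKEN is not set; run '$bin auth' or export the variable" >&2
    return 2
  fi
  echo "ok"
}
```

Running preflight with no arguments checks for mineru-open-api; the optional argument exists only so the check can be exercised against another binary name.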

Capabilities

  • Supported input: .doc, .docx (local file or URL)
  • Preserves document structure: headings, paragraphs, lists, tables
  • extract requires a token (mineru-open-api auth or MINERU_TOKEN env); flash-extract on .docx does not
  • Language hint with --language (default: ch, use en for English)

Notes

  • .doc (legacy Word format) is only supported by extract (requires token)
  • .docx supports both flash-extract (no token, quick) and extract (full features)
  • Output goes to stdout by default; use -o <dir> to save to a file or directory
  • All progress/status messages go to stderr; document content goes to stdout
  • MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
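
The format and token rules in these notes can be captured in a small selection helper. This is a sketch under stated assumptions: pick_mode is a hypothetical function, it treats a non-empty MINERU_TOKEN as "token available" (it cannot see a token saved via mineru-open-api auth), and it only prints the subcommand name rather than invoking the CLI.

```shell
# Hypothetical helper: choose a mineru-open-api subcommand for a file,
# following the notes above (.doc needs extract; .docx may use flash-extract
# when no token is available).
pick_mode() {
  file=$1
  # Lowercase the extension so REPORT.DOCX is handled like report.docx.
  ext=$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    doc)
      echo extract            # legacy format: only extract supports it
      ;;
    docx)
      if [ -n "${MINERU_TOKEN:-}" ]; then
        echo extract          # token present: use the full-featured mode
      else
        echo flash-extract    # no token: fall back to the quick mode
      fi
      ;;
    *)
      echo "unsupported file type: $file" >&2
      return 1
      ;;
  esac
}
```

Because document content goes to stdout and progress goes to stderr, the result can be combined with ordinary redirection, for example: mineru-open-api "$(pick_mode report.docx)" report.docx > report.md 2> progress.log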
