Doc Extract

v0.4.0

Extract text and content from Word documents (.doc, .docx) to Markdown using MinerU. A straightforward tool for reading and extracting Word file content.

License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description match the declared requirements: the skill needs the mineru-open-api CLI and an optional MINERU_TOKEN for full extraction of .doc files, which is coherent with a document-extraction utility.
Instruction Scope
SKILL.md instructs the agent to invoke mineru-open-api commands on local files or URLs and to set MINERU_TOKEN for authenticated operations; it does not request unrelated files, credentials, or system access.
Install Mechanism
Install options are standard package installs (npm or go install) for a named package that produces the expected binary; no arbitrary URL downloads or extract steps are present.
Credentials
Only MINERU_TOKEN is required and is justified by the README: flash-extract on .docx is tokenless while full .doc extraction requires authentication. No unrelated secrets or multiple credentials are requested.
Persistence & Privilege
Skill does not request always:true, does not modify other skills, and has normal autonomous-invocation defaults. It does not request elevated or persistent system privileges.
Scan Findings in Context
[no-findings] expected: No code files present; the regex-based scanner had nothing to analyze. This is expected for an instruction-only skill that delegates work to an external CLI.
Assessment
This skill appears to do what it claims: it invokes the MinerU CLI to extract Word content. Before installing, verify the mineru-open-api npm/go package and the homepage (https://mineru.net) are legitimate and up-to-date. Provide MINERU_TOKEN only if you need full .doc extraction; avoid using a high-privilege or shared token. Remember the CLI will read local files you point it at—do not process sensitive documents unless you trust the installed package and the MinerU service.

Runtime requirements

📄 Clawdis
Bins: mineru-open-api
Env: MINERU_TOKEN
Primary env: MINERU_TOKEN

Install

Install via npm
Bins: mineru-open-api
npm i -g mineru-open-api
Install via go install
Bins: mineru-open-api
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

SKILL.md

Doc Extract

Extract text and content from Word (.doc/.docx) files to Markdown using MinerU.

Install

npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

Quick Start

# Quick extraction from .docx (no token required)
mineru-open-api flash-extract report.docx

# Save to directory
mineru-open-api flash-extract report.docx -o ./out/

# Extract .doc file (requires token)
mineru-open-api extract report.doc -o ./out/

# Extract with language hint
mineru-open-api extract report.docx --language en -o ./out/

Authentication

flash-extract on .docx files needs no token. The extract command, which is the only option for .doc files, requires one:

mineru-open-api auth             # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable

Create a token at: https://mineru.net/apiManage/token
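
The token rules above can be turned into a small pre-flight check. The following is a minimal sketch, not part of the skill: the function name pick_mineru_command is hypothetical, and it only inspects the MINERU_TOKEN environment variable, so a token stored through mineru-open-api auth would not be detected.

```shell
# Hypothetical helper (not part of mineru-open-api): print the subcommand
# that the rules above allow for a given input path.
pick_mineru_command() {
  case "$1" in
    *.docx)
      # .docx can use flash-extract, which needs no token.
      echo "flash-extract"
      ;;
    *.doc)
      # .doc requires extract, which needs a token.
      # Assumption: only the environment variable is checked here.
      if [ -n "${MINERU_TOKEN:-}" ]; then
        echo "extract"
      else
        echo "error: .doc needs a token (set MINERU_TOKEN or run: mineru-open-api auth)" >&2
        return 1
      fi
      ;;
    *)
      echo "error: unsupported file type: $1" >&2
      return 1
      ;;
  esac
}

# Example use (commented out so the sketch runs without the CLI installed):
# cmd=$(pick_mineru_command report.doc) && mineru-open-api "$cmd" report.doc -o ./out/
```

Checking before the call keeps a missing token from surfacing as a failed request partway through a batch.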

Capabilities

  • Supported input: .doc, .docx (local file or URL)
  • .docx: supports flash-extract (no token, max 10 MB / 20 pages) and extract
  • .doc: requires extract with token
  • Language hint with --language (default: ch, use en for English)
  • Page range with --pages (e.g. 1-10)
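
The flash-extract size limit can be checked locally before a file is submitted. This is a rough sketch under stated assumptions: the helper name fits_flash_extract is hypothetical, "10 MB" is read as 10,000,000 bytes (the stricter interpretation), and the 20-page limit is not checked because page count cannot be read from the file size.

```shell
# Hypothetical helper: succeed only if the file exists and is at most
# 10,000,000 bytes. Assumption: this is the stricter reading of "10 MB".
fits_flash_extract() {
  [ -f "$1" ] || return 1
  # wc -c may pad its output with spaces on some systems; strip them.
  size=$(wc -c <"$1" | tr -d ' ')
  [ "$size" -le 10000000 ]
}

# Example use (commented out so the sketch runs without the CLI installed):
# if fits_flash_extract report.docx; then
#   mineru-open-api flash-extract report.docx -o ./out/
# else
#   mineru-open-api extract report.docx -o ./out/   # needs a token
# fi
```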

Notes

  • .doc requires extract with token; .docx works with flash-extract for quick extraction
  • Output goes to stdout by default; use -o <dir> to save to a file or directory
  • All progress/status messages go to stderr; document content goes to stdout
  • MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
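
Because content goes to stdout and status messages go to stderr, the two streams can be captured separately with ordinary shell redirection. A minimal sketch follows; the helper name run_split and the file names report.md and extract.log are illustrative assumptions, not part of the CLI.

```shell
# Hypothetical helper: run any command, writing its stdout to one file
# and its stderr to another.
run_split() {
  out_file=$1
  log_file=$2
  shift 2
  "$@" >"$out_file" 2>"$log_file"
}

# Only call the real CLI when it is installed and the sample input exists.
if command -v mineru-open-api >/dev/null 2>&1 && [ -f report.docx ]; then
  # Markdown lands in report.md; progress messages land in extract.log.
  run_split report.md extract.log mineru-open-api flash-extract report.docx
fi
```

Keeping the log separate means the Markdown file holds only document content and can be piped to other tools.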
