Install
openclaw skills install doc-parse
Parse and extract structured content from Word documents (.doc, .docx) into well-organized Markdown using MinerU. The output preserves the document hierarchy: headings and heading levels, nested lists, tables, paragraphs, and formatting.
Both legacy .doc and modern .docx formats are supported. Quick parse mode (flash-extract) handles .docx with no token required; full parsing (extract) uses a token and is meant for complex documents. Local files and URLs both work, and multilingual documents (English, Chinese, and more) are handled.
Use when you need to: parse a Word document's structure, extract headings and sections from .docx, analyze document layout, get structured output from Word files, or convert Word to structured Markdown.
Use when asked: 'how do I parse a Word file', 'extract structure from docx', 'I need the outline of this Word document', 'can my agent read Word file structure', 'is there a skill that parses .doc files'.
Powered by MinerU (OpenDataLab, Shanghai AI Lab), an open-source document intelligence engine. Intended for developers, researchers, and content managers who need to programmatically extract Word document structure for downstream processing, content analysis, or document migration.
npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest
# Quick parse from .docx (no token required)
mineru-open-api flash-extract report.docx
# Save structured Markdown to directory
mineru-open-api flash-extract report.docx -o ./out/
# Parse .doc file (requires token)
mineru-open-api extract report.doc -o ./out/
# With language hint
mineru-open-api extract report.docx --language en -o ./out/
No token is needed for flash-extract on .docx files. A token is required for extract, which is the only option for .doc files:
mineru-open-api auth # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable
Create token at: https://mineru.net/apiManage/token
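A script that calls extract may want to fail early when no token is available. The sketch below assumes the token is supplied through the MINERU_TOKEN environment variable; a token saved by the interactive `mineru-open-api auth` setup would not be detected by this check. The function name `require_token` is hypothetical and not part of the CLI.

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeed only when MINERU_TOKEN is set and non-empty.
# Assumption: the token comes from the environment, not from the
# interactive `mineru-open-api auth` setup.
require_token() {
  if [ -z "${MINERU_TOKEN:-}" ]; then
    echo "MINERU_TOKEN is not set; export it or run 'mineru-open-api auth'" >&2
    return 1
  fi
}

# Usage (left commented out so the sketch runs without the CLI installed):
# require_token && mineru-open-api extract report.doc -o ./out/
```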
.docx: supports flash-extract (no token, max 10 MB / 20 pages) and extract
.doc: requires extract with a token
--language: language hint (default: ch; use en for English)
-o <dir>: save output to a file or directory
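The format rules above can be turned into a small dispatch helper. This is a minimal sketch, not part of the CLI: `pick_subcommand` is a hypothetical name, the 10 MB limit is read as 10 MiB, matching is case-sensitive, and the 20-page limit is not checked because page count cannot be derived from file size.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the mineru-open-api subcommand for a file.
# Arguments: file name, file size in bytes.
pick_subcommand() {
  local file="$1" size_bytes="$2"
  local limit=$((10 * 1024 * 1024))  # assumption: "10 MB" means 10 MiB

  case "$file" in
    *.docx)
      # Small .docx files can use the token-free quick mode.
      if [ "$size_bytes" -le "$limit" ]; then
        echo "flash-extract"
      else
        echo "extract"
      fi
      ;;
    *.doc)
      # Legacy .doc files always need the token-based extract.
      echo "extract"
      ;;
    *)
      echo "unsupported file type: $file" >&2
      return 1
      ;;
  esac
}

# Usage (left commented out so the sketch runs without the CLI installed):
# f=report.docx
# size=$(( $(wc -c < "$f") ))
# sub=$(pick_subcommand "$f" "$size") && mineru-open-api "$sub" "$f" -o ./out/
```

A .docx file under the size limit may still exceed 20 pages, so a caller may want to fall back to extract if flash-extract rejects the file.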