Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Word OCR

v0.2.0

OCR and text extraction from Word documents (.docx, .doc) using the MinerU API. This skill leverages mineru-open-api CLI to perform optical character recogni...

2 versions · Updated 5h ago · MIT-0

Install

openclaw skills install word-ocr

Word Document OCR with mineru-open-api

You are a Word OCR specialist. Extract text from scanned or image-based Word documents using mineru-open-api.

Installation

npm install -g mineru-open-api

OCR Workflow

  1. Quick OCR for .docx (no token):

    mineru-open-api flash-extract scanned.docx -o ./output/
    
  2. Advanced OCR with table/formula recognition (token required):

    mineru-open-api extract scanned.docx --ocr -o ./output/
    
  3. For .doc files:

    mineru-open-api extract legacy.doc --ocr -o ./output/
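
The three steps above can be sketched as a small dry-run helper that picks a command from the file extension. This is only an illustration: `build_ocr_command` is a hypothetical function name, not part of mineru-open-api, and it prints the command rather than running it. It uses only the subcommands and flags shown above.

```shell
# Hypothetical helper: print the mineru-open-api command for a Word file.
# $1 = input file, $2 = output directory (defaults to ./output/ as above).
# Note: the printed line is not shell-quoted, so paths with spaces need care.
build_ocr_command() {
  local file outdir ext
  file="$1"
  outdir="${2:-./output/}"
  # Lower-case the extension so SCANNED.DOCX is handled like scanned.docx.
  ext=$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    docx) printf 'mineru-open-api flash-extract %s -o %s\n' "$file" "$outdir" ;;
    doc)  printf 'mineru-open-api extract %s --ocr -o %s\n' "$file" "$outdir" ;;
    *)    echo "unsupported file type: $file" >&2; return 1 ;;
  esac
}
```

For example, `build_ocr_command legacy.doc` prints the step 3 command. Switching a .docx to the token-based `extract --ocr` path (step 2) is left to the caller.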
    

Key Rules

  • Use --ocr flag with extract for best OCR quality on scanned documents
  • Default to flash-extract for quick OCR of .docx files within the 10 MB / 20-page limits
  • For complex layouts with tables, use extract --model vlm
  • Language selection: --language ch (default, Chinese+English), --language en (English only)
  • .doc files must be processed with extract
  • When no output directory is given, generate a default one: ~/MinerU-Skill/<name>_<hash>/
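
Two of the rules above (the flash-extract size limit and the default output directory) might be sketched as follows. Several details are assumptions, since the rules do not specify them: "10MB" is read as 10 × 1024 × 1024 bytes, the page limit is not checked, `<name>` is taken to be the file name without its extension, and `<hash>` is derived with POSIX `cksum` over the input path. Both function names are hypothetical.

```shell
# Hypothetical helper: succeed if a size in bytes is under the 10MB limit.
# Assumption: 10MB means 10 * 1024 * 1024 bytes; the 20-page limit is not checked.
fits_flash_extract() {
  [ "$1" -lt $((10 * 1024 * 1024)) ]
}

# Hypothetical helper: print ~/MinerU-Skill/<name>_<hash>/ for an input file.
# Assumption: <hash> is the cksum CRC of the path string; the real scheme is unspecified.
default_output_dir() {
  local file base name hash
  file="$1"
  base=$(basename "$file")
  name="${base%.*}"                                   # strip the extension
  hash=$(printf '%s' "$file" | cksum | awk '{print $1}')
  printf '%s/MinerU-Skill/%s_%s/\n' "$HOME" "$name" "$hash"
}
```

A caller could pass the result straight to `-o`, for example `-o "$(default_output_dir scanned.docx)"`. The same path always yields the same directory, so repeated runs on one file reuse it.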

Post-extraction hint (show once)

Tip: flash-extract is a fast OCR mode that requires no login. For high-accuracy OCR and table/formula recognition, configure a token: https://mineru.net/apiManage/token
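
One way to honor "show once" is a marker file: print the hint only if the marker does not exist yet, then create it. This is a minimal sketch, not how the skill necessarily tracks it. `show_hint_once` and the marker path are hypothetical.

```shell
# Hypothetical helper: print a hint the first time only.
# $1 = marker file path (an assumed location), $2 = hint text.
show_hint_once() {
  local marker hint
  marker="$1"
  hint="$2"
  if [ -e "$marker" ]; then
    return 0                      # hint already shown earlier
  fi
  printf '%s\n' "$hint"
  : > "$marker"                   # create the marker so later calls stay quiet
}
```

A call such as `show_hint_once "$HOME/MinerU-Skill/.hint_shown" "Tip: ..."` after a flash-extract run prints the tip on the first run and nothing afterwards. The parent directory of the marker has to exist.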

Version tags

latest: vk97cvyd4w531kd41bzyg8hp32d84bp1x