Table Ocr

v0.4.0

OCR and extract tables from scanned PDFs and images using MinerU. Recognizes table structures in image-based documents and converts them to structured Markdo...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for mzlzyca/table-ocr.

Install the skill "Table Ocr" (mzlzyca/table-ocr) from ClawHub.
Skill page: https://clawhub.ai/mzlzyca/table-ocr
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: MINERU_TOKEN
Required binaries: mineru-open-api
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install table-ocr

ClawHub CLI


npx clawhub@latest install table-ocr

Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)

Purpose & Capability
The name and description call for OCR table extraction via MinerU; the required binary (mineru-open-api) and the MINERU_TOKEN credential directly match that purpose and are expected.
Instruction Scope
SKILL.md only instructs the agent to run the MinerU CLI (extract, auth) against user-provided files or URLs and to set MINERU_TOKEN. It does not instruct reading unrelated system files or other environment variables.
Install Mechanism
Install options are standard: npm package and a go install from a GitHub repo. Both are typical for distributing a CLI; no arbitrary downloads or extract-from-unknown-URL steps are present.
Credentials
Only MINERU_TOKEN is required and is declared as the primary credential. That is proportional for a CLI that authenticates to a remote MinerU service.
Persistence & Privilege
Skill is not always-enabled and does not request system-wide persistence or access to other skills' configs. Autonomous invocation is allowed (platform default) but not combined with other concerning privileges.
Assessment
This skill appears coherent, but before installing: verify you trust mineru.net and the npm/GitHub packages (review package source if possible), because installing a global CLI runs third-party code on your system. The MINERU_TOKEN permits the MinerU service to process your files — avoid uploading sensitive documents unless you trust the service and its privacy policy. Use a limited/revocable token, rotate it if needed, and prefer running the CLI locally on non-sensitive data if you cannot confirm the provider's practices.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

📄 Clawdis
Bins: mineru-open-api
Env: MINERU_TOKEN
Primary env: MINERU_TOKEN

Install

Install via npm
Bins: mineru-open-api
npm i -g mineru-open-api
Install via go install
Bins: mineru-open-api
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

latest vk97emww3sseny48ny09xg7hc2n844h2k
225 downloads
0 stars
6 versions
Updated 3w ago
v0.4.0
MIT-0

Table Ocr

Convert and extract content from .pdf / images (.png/.jpg/.jpeg/.webp) using MinerU (mineru-open-api).

Install

npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest
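
After either install path, it can help to confirm that the binary is actually on PATH before going further. The following is a minimal sketch; `have_bin` is a hypothetical helper name, not part of mineru-open-api, and it only checks that the command resolves, not that it works.

```shell
# Hypothetical helper: succeeds if the named binary can be found on PATH.
have_bin() {
  command -v "$1" >/dev/null 2>&1
}

if have_bin mineru-open-api; then
  echo "mineru-open-api found at: $(command -v mineru-open-api)"
else
  # Status messages go to stderr so they do not mix with normal output.
  echo "mineru-open-api not found; install it with npm or go install" >&2
fi
```

If the Go install succeeded but the binary is still not found, the Go bin directory may be missing from PATH.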

Quick Start

# Extract tables from PDF (requires token)
mineru-open-api extract report.pdf -o ./out/

# With explicit table flag and OCR for scanned docs
mineru-open-api extract scanned.pdf --ocr --table -o ./out/

Authentication

Token required for extract and crawl:

mineru-open-api auth            # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable

Create token at: https://mineru.net/apiManage/token
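
For scripted use, a small guard can stop early when the token is missing. This is a sketch under one assumption: it only covers the environment-variable path. If the token was set up through the interactive `mineru-open-api auth` flow, MINERU_TOKEN may be unset even though authentication is configured, so this check would not apply. `require_token` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only if MINERU_TOKEN is set and non-empty.
require_token() {
  if [ -z "${MINERU_TOKEN:-}" ]; then
    echo "MINERU_TOKEN is not set; run 'mineru-open-api auth' or export it" >&2
    return 1
  fi
}

# Usage (not run here): only call extract when the token is present.
# require_token && mineru-open-api extract report.pdf -o ./out/
```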

Capabilities

  • Supports local files and URLs
  • Requires token (mineru-open-api auth or MINERU_TOKEN env)
  • Supported input: .pdf / images (.png/.jpg/.jpeg/.webp)
  • Language hint with --language (default: ch, use en for English)
  • Page range with --pages (where applicable)
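
The supported-input list above can be turned into a pre-check for local files. The sketch below filters arguments by extension and prints, rather than runs, an extract command with the English language hint. `is_supported_input` and `plan_extract` are hypothetical helper names; treating uppercase extensions as equivalent is an assumption, and URLs without a file extension are not handled here.

```shell
# Hypothetical helper: succeeds if the file name ends in a listed extension.
is_supported_input() {
  # Lowercase the name so "SCAN.PDF" is treated like "scan.pdf" (an assumption).
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *.pdf|*.png|*.jpg|*.jpeg|*.webp) return 0 ;;
    *) return 1 ;;
  esac
}

# Print (do not run) one extract command per supported argument.
plan_extract() {
  for f in "$@"; do
    if is_supported_input "$f"; then
      echo "mineru-open-api extract $f --language en -o ./out/"
    else
      echo "skip: $f (unsupported extension)" >&2
    fi
  done
}
```

Printing the commands first makes it easy to review a batch before spending token quota on it.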

Notes

  • Table recognition requires the extract command and a token. Use --ocr for scanned content and --table for table detection; both are enabled by default in extract.
  • Output goes to stdout by default; use -o <dir> to save files to a directory instead
  • Binary formats (docx) require the -o flag because they cannot stream to stdout
  • All progress/status messages go to stderr
  • MinerU is an open-source project by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
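
Because results go to stdout and progress goes to stderr, the two streams can be captured separately with ordinary shell redirection. The wrapper below is a sketch; `run_split` is a hypothetical helper, and the `tables.md` and `extract.log` file names in the usage comment are illustrative assumptions.

```shell
# Hypothetical wrapper: run a command with stdout sent to $1 and stderr to $2.
run_split() {
  out=$1
  log=$2
  shift 2
  "$@" >"$out" 2>"$log"
}

# Usage (not run here): keep extracted content and progress messages apart.
# run_split tables.md extract.log mineru-open-api extract report.pdf
```

The wrapper returns the exit status of the wrapped command, so it can still be used in `&&` chains.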
